Beginner ML Project | Classification using Random Forest | Dataset: RadioML 2016.10A
Satellite communication systems operating under the DVB-S2X standard dynamically switch between multiple modulation schemes, including QPSK and 8PSK, based on prevailing channel conditions. Accurate and automatic identification of these modulation schemes at the ground receiver is essential for reliable signal synchronization. Traditional manual and rule-based modulation recognition methods are computationally expensive and error-prone under noisy conditions.
This project presents an Automatic Modulation Recognition (AMR) system that uses a Random Forest classifier to identify DVB-S2X modulation schemes from raw In-phase and Quadrature (I/Q) signal samples. The system is trained and evaluated on the RadioML 2016.10A dataset, a synthetic dataset generated with GNU Radio that contains 11 modulation types at SNR levels from -20 dB to +18 dB, with 1,000 samples per modulation-SNR pair. Statistical features of the amplitude and phase, including mean, variance, and kurtosis, are extracted from the I/Q samples and used for classification. Model performance is evaluated using classification accuracy and confusion matrices across multiple SNR levels.
AMR-DVB-S2X/
│
├── data/
│   └── RML2016.10a.pkl          # Downloaded dataset (place here)
│
├── src/
│   ├── load_data.py             # Load and filter dataset
│   ├── feature_extraction.py    # Extract features from I/Q samples
│   ├── train_model.py           # Train Random Forest classifier
│   └── evaluate.py              # Accuracy + confusion matrix
│
├── models/
│   └── random_forest_model.pkl  # Saved trained model
│
├── results/
│   ├── confusion_matrix.png     # Output plot
│   └── accuracy_per_snr.png     # Output plot
│
├── requirements.txt             # Python dependencies
└── README.md                    # This file
| Field | Detail |
|---|---|
| Name | RadioML 2016.10A |
| Source | DeepSig Inc. (GNU Radio synthetic) |
| Modulation Types | 11 total (we use QPSK, 8PSK) |
| SNR Range | -20 dB to +18 dB (2 dB steps) |
| Samples per class/SNR | 1,000 |
| Total Samples | 220,000 |
| Format | .pkl (Python pickle) |
| Signal Format | Raw I/Q (In-phase & Quadrature) samples |
Direct download:
http://opendata.deepsig.io/datasets/2016.10/RML2016.10a.tar.bz2
After downloading, extract and place RML2016.10a.pkl inside the data/ folder.
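
As a rough illustration of the load-and-filter step, the sketch below reads the pickle and keeps only QPSK and 8PSK. It assumes the file is a dictionary keyed by `(modulation, SNR)` tuples whose values are arrays of shape `(1000, 2, 128)`, which is how this dataset is commonly described, and that it was written by Python 2 (hence `encoding="latin1"`). The function name `load_and_filter` is hypothetical, and the demo runs on a tiny synthetic stand-in rather than the real file.

```python
import os
import pickle
import tempfile

import numpy as np


def load_and_filter(path, keep=("QPSK", "8PSK")):
    """Load the pickled dataset and keep only the requested modulations."""
    with open(path, "rb") as f:
        # encoding="latin1" lets Python 3 read a pickle written by Python 2.
        data = pickle.load(f, encoding="latin1")

    frames, mods, snrs = [], [], []
    for (mod, snr), block in data.items():
        if mod in keep:
            frames.append(block)
            mods.extend([mod] * len(block))
            snrs.extend([snr] * len(block))
    return np.concatenate(frames), np.array(mods), np.array(snrs)


# Tiny synthetic stand-in with the assumed layout (not real RadioML data).
fake = {
    ("QPSK", 0): np.zeros((3, 2, 128), dtype=np.float32),
    ("8PSK", 0): np.ones((3, 2, 128), dtype=np.float32),
    ("WBFM", 0): np.ones((3, 2, 128), dtype=np.float32),
}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.pkl")
    with open(demo_path, "wb") as f:
        pickle.dump(fake, f)
    X, mods, snrs = load_and_filter(demo_path)

print(X.shape)
```

With the real file, the call would be `load_and_filter("data/RML2016.10a.pkl")`; if the layout assumption holds, two modulations at 20 SNR levels would give 40,000 frames.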
| Layer | Technology |
|---|---|
| Language | Python 3.8+ |
| Data Loading | pickle, numpy |
| Feature Extraction | numpy, scipy |
| ML Model | scikit-learn (Random Forest) |
| Evaluation | scikit-learn, matplotlib, seaborn |
| Model Saving | joblib |
git clone https://github.com/your-username/AMR-DVB-S2X.git
cd AMR-DVB-S2X

pip install -r requirements.txt

# Download
wget http://opendata.deepsig.io/datasets/2016.10/RML2016.10a.tar.bz2
# Extract
tar -xvjf RML2016.10a.tar.bz2
# Move to data folder
mv RML2016.10a.pkl data/

python src/load_data.py
python src/feature_extraction.py
python src/train_model.py
python src/evaluate.py

Raw Dataset (RML2016.10a.pkl)
↓
Load & Filter
(Keep only QPSK, 8PSK)
↓
Feature Extraction from I/Q
- Mean Amplitude
- Variance of Amplitude
- Kurtosis
- Mean Phase
- Variance of Phase
- Max Amplitude
↓
Train/Test Split (80/20)
↓
Train Random Forest Classifier
↓
Evaluate:
- Overall Accuracy
- Accuracy per SNR level
- Confusion Matrix
↓
Save Model → models/random_forest_model.pkl
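
The six features listed in the pipeline could be computed along these lines. This is a minimal sketch, not the project's actual `feature_extraction.py`: it assumes each frame has shape `(2, L)` with I in row 0 and Q in row 1, and it assumes the kurtosis is taken over the amplitude envelope, which the pipeline does not specify. The function name `extract_features` is hypothetical.

```python
import numpy as np
from scipy.stats import kurtosis


def extract_features(frames):
    """Return one row of six statistics per frame.

    frames: array of shape (n, 2, L); row 0 is assumed to be I, row 1 Q.
    """
    i, q = frames[:, 0, :], frames[:, 1, :]
    amp = np.hypot(i, q)        # instantaneous amplitude sqrt(I^2 + Q^2)
    phase = np.arctan2(q, i)    # instantaneous phase in (-pi, pi]
    return np.column_stack([
        amp.mean(axis=1),       # mean amplitude
        amp.var(axis=1),        # variance of amplitude
        kurtosis(amp, axis=1),  # excess kurtosis of amplitude (assumption)
        phase.mean(axis=1),     # mean phase
        phase.var(axis=1),      # variance of phase
        amp.max(axis=1),        # max amplitude
    ])


# Demo on random noise frames, only to show the shapes involved.
rng = np.random.default_rng(0)
demo_frames = rng.standard_normal((5, 2, 128))
feats = extract_features(demo_frames)
print(feats.shape)
```

Note that the mean and variance of wrapped phase depend on any constant phase offset in the frame, so these two columns may be less stable than the amplitude statistics.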
| Metric | Expected Value |
|---|---|
| Accuracy at high SNR (≥ 10 dB) | > 90% |
| Accuracy at low SNR (≤ -10 dB) | 50–65% |
| Overall Accuracy | ~75–85% |
| Classes | QPSK, 8PSK |
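
To show how overall accuracy, per-SNR accuracy, and the confusion matrix might be computed, here is a hedged sketch of the split, train, and evaluate steps. It runs on synthetic Gaussian features, not on RadioML, so its numbers say nothing about the expected values in the table. The hyperparameters (`n_estimators=100`, `random_state=0`) and the stratified split are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the extracted features: 6 columns per frame.
# The noise scale grows as SNR drops, so classes overlap more at low SNR.
n_per_cell = 200
X_parts, y_parts, snr_parts = [], [], []
for level in (-10, 0, 10):
    noise_scale = 10 ** (-level / 20)
    for label, centre in (("QPSK", 0.0), ("8PSK", 1.0)):
        X_parts.append(centre + noise_scale * rng.standard_normal((n_per_cell, 6)))
        y_parts.append(np.full(n_per_cell, label))
        snr_parts.append(np.full(n_per_cell, level))
X = np.vstack(X_parts)
y = np.concatenate(y_parts)
snr = np.concatenate(snr_parts)

# 80/20 split; the SNR array is split alongside so it stays aligned.
X_tr, X_te, y_tr, y_te, snr_tr, snr_te = train_test_split(
    X, y, snr, test_size=0.2, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)
y_pred = clf.predict(X_te)

# Overall accuracy and a 2x2 confusion matrix (rows: true, cols: predicted).
overall = accuracy_score(y_te, y_pred)
cm = confusion_matrix(y_te, y_pred, labels=["QPSK", "8PSK"])

# Accuracy per SNR level: score only the test frames recorded at that SNR.
per_snr = {
    int(s): accuracy_score(y_te[snr_te == s], y_pred[snr_te == s])
    for s in np.unique(snr_te)
}

print(f"overall accuracy: {overall:.3f}")
print(cm)
for s, acc in sorted(per_snr.items()):
    print(f"SNR {s:+d} dB: {acc:.3f}")
```

Carrying the SNR array through `train_test_split` is one simple way to keep per-frame SNR labels aligned with the test set; an index-based split would work equally well.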
numpy
scipy
scikit-learn
matplotlib
seaborn
joblib
pickle5; python_version < "3.8"  # backport only; pickle is built into Python 3.8+
This project is designed to be extended using Claude Code, Anthropic's agentic CLI coding tool.
Claude Code is a command-line tool that understands your codebase and helps you develop faster using natural language. It can read files, write code, run commands, and handle Git workflows autonomously.
macOS / Linux / WSL:
curl -fsSL https://claude.ai/install.sh | bash
Windows PowerShell:
irm https://claude.ai/install.ps1 | iex
⚠️ Requires a Claude Pro, Max, Team, Enterprise, or Console account. The free plan does not include Claude Code access.
After installation, log in:
claude
# Follow the browser prompts to authenticate
Navigate to the project folder and launch Claude Code:
cd AMR-DVB-S2X
claude
Explore the dataset:
"Load the RML2016.10a.pkl file and show me the structure,
available modulation types, and sample shapes"
Build feature extraction:
"Write a Python script that extracts amplitude mean, variance,
kurtosis, and phase statistics from I/Q signal samples in the dataset"
Train the model:
"Train a Random Forest classifier on the extracted features,
split 80/20 train/test, and print the accuracy"
Add more modulations:
"Update the pipeline to also include AM-DSB and WBFM
modulations from the dataset alongside QPSK and 8PSK"
Try other models:
"Replace the Random Forest with an SVM classifier using
an RBF kernel and compare accuracy with Random Forest"
Plot results:
"Generate a confusion matrix heatmap and an accuracy vs SNR
line plot, and save them to the results/ folder"
Improve accuracy:
"Add more features like spectral kurtosis and autocorrelation
to the feature extraction script and retrain the model"
Save and load the model:
"Save the trained Random Forest model using joblib to
models/random_forest_model.pkl and write a load function"
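
For orientation, the last prompt might yield something like the sketch below, which saves and reloads a classifier with joblib. The helper names `save_model` and `load_model` are hypothetical, the classifier is trained on random data, and the demo writes to a temporary directory rather than to `models/`.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def save_model(model, path):
    """Write the fitted model to disk, creating the parent folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    joblib.dump(model, path)


def load_model(path):
    """Read a model previously written by save_model."""
    return joblib.load(path)


# Small throwaway classifier trained on random data, for the round trip only.
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 6))
y = np.array(["QPSK", "8PSK"] * 20)
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "models" / "random_forest_model.pkl"
    save_model(clf, model_path)
    restored = load_model(model_path)

# The reloaded model should make the same predictions as the original.
same = np.array_equal(clf.predict(X), restored.predict(X))
print(same)
```

Files written by joblib are pickles under the hood, so only load model files from sources you trust.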
| Week | Task | Status |
|---|---|---|
| Week 1 | Project planning, problem definition, tooling setup | ✅ Done |
| Week 2 | Functional modeling, use case diagrams, actor identification | ✅ Done |
| Week 3 | Dataset download, exploration, preprocessing | 🔲 Pending |
| Week 4 | Feature extraction from I/Q samples | 🔲 Pending |
| Week 5 | Model training (Random Forest) | 🔲 Pending |
| Week 6 | Evaluation, confusion matrix, accuracy vs SNR | 🔲 Pending |
| Week 7 | Model optimization, compare with SVM/KNN | 🔲 Pending |
| Week 8 | Final report and documentation | 🔲 Pending |
| Role | Responsibility |
|---|---|
| Project Manager | Planning, tracking, documentation |
| Signal Processing Engineer | Dataset handling, feature extraction |
| ML Engineer | Model training, evaluation, optimization |
| Testing Lead | Validation, results verification |
- T.J. O'Shea, "RadioML2016.10A", DeepSig Inc., 2016
- ETSI EN 302 307-2: DVB-S2X Standard Specification
- Scikit-learn Documentation: https://scikit-learn.org
- Claude Code Documentation: https://docs.anthropic.com/en/docs/claude-code/overview
- You do not need to understand all the signal processing theory to run this project
- Claude Code can explain any part of the code to you; just ask it in plain English
- Start by running the scripts one by one and reading the output
- If anything breaks, paste the error into Claude Code and ask it to fix it
Project developed as part of academic coursework on Automatic Modulation Recognition for satellite communication systems.