August 13, 2022

Machine Learning Dataset for Radio Signal Classification

This RF signal dataset contains radio signals of 18 different waveforms for the training of machine learning systems. The dataset enables experiments on signal and modulation classification using modern machine learning such as deep learning with neural networks.

The data was created synthetically by first modulating speech, music, and text using standard software. The signals are then cut into short slices. Each slice is impaired by Gaussian noise, Watterson fading (to account for ionospheric propagation), and a random frequency and phase offset. This process generates data that is close to real received signals.
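The impairment step can be sketched in a few lines of NumPy. This is only an illustration, not the code used to build the dataset: the function name `impair`, the uniform distribution of the offsets, and the noise definition (SNR relative to unit signal power over the full sampled bandwidth) are assumptions, and Watterson fading is left out for brevity. The sample rate, vector length, and offset range are the values listed in the dataset properties.

```python
import numpy as np

FS = 6000.0            # sample rate in Hz (dataset property)
N_SAMPLES = 2048       # complex IQ samples per signal vector (dataset property)
MAX_OFFSET_HZ = 250.0  # maximum random frequency offset (dataset property)


def impair(slice_iq, snr_db, rng):
    """Apply a random frequency/phase offset and AWGN to one complex IQ slice.

    Assumptions (not confirmed by the dataset description): offsets are drawn
    uniformly, and the SNR refers to unit signal power over the full bandwidth.
    Watterson fading is NOT modeled here.
    """
    n = np.arange(slice_iq.size)
    f_off = rng.uniform(-MAX_OFFSET_HZ, MAX_OFFSET_HZ)
    phase = rng.uniform(0.0, 2.0 * np.pi)

    # Shift the signal in frequency and rotate it by a constant phase.
    shifted = slice_iq * np.exp(1j * (2.0 * np.pi * f_off / FS * n + phase))

    # Normalize the signal power to 1.
    shifted = shifted / np.sqrt(np.mean(np.abs(shifted) ** 2))

    # Complex white Gaussian noise with total power 10^(-SNR/10).
    noise_power = 10.0 ** (-snr_db / 10.0)
    noise = np.sqrt(noise_power / 2.0) * (
        rng.standard_normal(n.size) + 1j * rng.standard_normal(n.size)
    )
    return shifted + noise


rng = np.random.default_rng(0)
tone = np.exp(2j * np.pi * 100.0 / FS * np.arange(N_SAMPLES))  # test signal
noisy = impair(tone, snr_db=10, rng=rng)
```

With a 10 dB SNR the noise power is 0.1, so the total power of `noisy` comes out close to 1.1.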

Block diagram with steps for dataset generation for RF signal recognition
Dataset Generation Steps

The RF signal dataset “Panoradio HF” has the following properties:

  • 172,800 signal vectors
  • Each signal vector has 2048 complex IQ samples with fs = 6 kHz (duration is about 341 ms)
  • The signals (more precisely, their occupied bandwidths) are centered at 0 Hz (± random frequency offset, see below)
  • random frequency offset: ± 250 Hz
  • random phase offset
  • signal power is normalized to 1
  • SNR values: 25, 20, 15, 10, 5, 0, -5, -10 dB (AWGN)
  • fading channel: Watterson Model as defined by CCIR 520
  • 18 transmission modes / modulations (primarily found in the HF band):

Mode Name               | Modulation        | Baud Rate
Morse Code              | OOK               | variable
RTTY 45/170             | FSK, 170 Hz shift | 45
RTTY 50/170             | FSK, 170 Hz shift | 50
RTTY 100/850            | FSK, 850 Hz shift | 100
Olivia 8/250            | 8-MFSK            | 31
Olivia 16/500           | 16-MFSK           | 31
Olivia 16/1000          | 16-MFSK           | 62
Olivia 32/1000          | 32-MFSK           | 31
Navtex / Sitor-B        | FSK, 170 Hz shift | 100
Single-Sideband (upper) | USB               | -
Single-Sideband (lower) | LSB               | -
AM broadcast            | AM                | -

Signal Types
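The figures in the property list can be cross-checked with a little arithmetic. The sketch below assumes that the vectors are spread evenly over all modes and SNR values; the dataset description does not state this, so the per-class counts are only what a balanced split would give.

```python
N_VECTORS = 172_800   # total signal vectors
N_SAMPLES = 2048      # complex IQ samples per vector
FS_HZ = 6000          # sample rate
N_MODES = 18          # transmission modes
SNRS_DB = [25, 20, 15, 10, 5, 0, -5, -10]

# Duration of one signal vector in milliseconds.
duration_ms = 1000 * N_SAMPLES / FS_HZ

# Counts under the ASSUMPTION of a perfectly balanced dataset.
per_mode = N_VECTORS // N_MODES
per_mode_and_snr = per_mode // len(SNRS_DB)

print(f"duration per vector: {duration_ms:.1f} ms")
print(f"vectors per mode (if balanced): {per_mode}")
print(f"vectors per mode and SNR (if balanced): {per_mode_and_snr}")
```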

Some example IQ signals of different types, different SNRs (Gaussian noise), and different frequency offsets:

Example RF signals in the machine learning dataset

The RF signal dataset “Panoradio HF” is available for download as a 2-D NumPy array with shape=(172800, 2048).
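A minimal sketch of opening such an array without reading it all into memory is shown below. The `.npy` file format, the complex dtype, and the helper name `open_dataset` are assumptions, since the download's file name and storage type are not given here. If the samples were stored as complex64, the full array would take roughly 2.8 GB (172800 × 2048 × 8 bytes), which is why memory mapping is used. A small stand-in file replaces the real download so the example runs on its own.

```python
import os
import tempfile

import numpy as np

EXPECTED_SAMPLES = 2048  # complex IQ samples per signal vector
FS = 6000.0              # sample rate in Hz


def open_dataset(path):
    """Memory-map a .npy file and check that it holds 2048-sample complex vectors."""
    data = np.load(path, mmap_mode="r")
    if data.ndim != 2 or data.shape[1] != EXPECTED_SAMPLES:
        raise ValueError(f"unexpected shape {data.shape}")
    if not np.iscomplexobj(data):
        raise ValueError(f"expected complex IQ samples, got dtype {data.dtype}")
    return data


# Stand-in for the real download: 4 random vectors instead of 172800.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "stand_in.npy")  # hypothetical file name
    rng = np.random.default_rng(0)
    fake = (
        rng.standard_normal((4, EXPECTED_SAMPLES))
        + 1j * rng.standard_normal((4, EXPECTED_SAMPLES))
    ).astype(np.complex64)
    np.save(path, fake)

    data = open_dataset(path)
    first = np.array(data[0])               # copy one vector into memory
    duration_ms = 1000.0 * first.size / FS  # about 341 ms
    del data                                # release the memory map before cleanup
```

For the real dataset, `open_dataset` would be called with the path of the downloaded file, and `data.shape` should then be `(172800, 2048)`.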

Related Publications: