May 25, 2022

Machine Learning Dataset for Radio Signal Classification

This RF signal dataset contains radio signals of 18 different waveforms for training machine learning systems. The data was created synthetically by passing the signals through a channel model that applies Gaussian noise and Watterson fading (to account for ionospheric propagation), plus random frequency and phase offsets. The original signals consist of speech, music, and text that were modulated using standard software. The signal types appear primarily in the HF bands. The dataset enables experiments on signal and modulation classification with modern machine learning methods such as deep neural networks.

The RF signal dataset has the following properties:

  • 172,800 signal vectors
  • Each signal vector has 2048 complex IQ samples with fs = 6 kHz (duration is approx. 341 ms)
  • The signals (more precisely, their occupied bandwidths) are centered at 0 Hz (± random frequency offset, see below)
  • random frequency offset: ± 250 Hz
  • random phase offset
  • signal power is normalized to 1
  • SNR values: 25, 20, 15, 10, 5, 0, -5, -10 dB (AWGN)
  • fading channel: Watterson Model as defined by CCIR 520
  • 18 Transmission Modes / Modulations: Morse, PSK31, PSK63, QPSK31, RTTY 45/170, RTTY 50/170, RTTY 100/850, Olivia 8/250, Olivia 16/500, Olivia 16/1000, Olivia 32/1000, DominoEx 11, MT63/1000, Navtex, USB audio, LSB audio, AM audio, HF fax
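
The properties listed above imply a few derived numbers that are handy when planning experiments. The following is a minimal sketch of that arithmetic. Note the assumptions: the source does not state that the vectors are split evenly across modes and SNR values, and it does not state the stored dtype, so the per-combination count and the memory estimate (for a hypothetical complex64 layout) are only illustrative.

```python
# Figures taken from the property list above.
N_VECTORS = 172_800
N_SAMPLES = 2048
FS_HZ = 6000
SNRS_DB = [25, 20, 15, 10, 5, 0, -5, -10]
N_MODES = 18

# Duration of one signal vector: 2048 samples at 6 kHz, about 341.3 ms.
duration_ms = N_SAMPLES / FS_HZ * 1e3

# ASSUMPTION: vectors are evenly distributed over all (mode, SNR) pairs.
n_combinations = N_MODES * len(SNRS_DB)
vectors_per_combination = N_VECTORS // n_combinations

# ASSUMPTION: samples stored as complex64 (8 bytes each); the real dtype may differ.
approx_size_gb = N_VECTORS * N_SAMPLES * 8 / 1e9

print(f"duration per vector: {duration_ms:.1f} ms")
print(f"(mode, SNR) combinations: {n_combinations}")
print(f"vectors per combination (if evenly split): {vectors_per_combination}")
print(f"approx. in-memory size as complex64: {approx_size_gb:.2f} GB")
```
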
Dataset Generation Steps
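
As a rough illustration of the impairments described above, the sketch below normalizes a complex baseband signal to unit power, applies a random frequency offset within ± 250 Hz and a random phase offset, and adds complex white Gaussian noise for a given SNR. It is not the authors' generation code: the function name `impair`, the uniform distributions of the offsets, and the order of the steps are assumptions, and the Watterson fading stage is deliberately left out.

```python
import numpy as np

FS_HZ = 6000.0              # sample rate from the dataset description
MAX_FREQ_OFFSET_HZ = 250.0  # frequency offset range from the dataset description


def impair(signal, snr_db, rng):
    """Apply frequency/phase offset and AWGN to a complex baseband signal.

    Hypothetical helper for illustration only; Watterson fading is NOT modelled.
    """
    signal = np.asarray(signal, dtype=np.complex128)

    # Normalize the signal power to 1 so the SNR fixes the noise power directly.
    signal = signal / np.sqrt(np.mean(np.abs(signal) ** 2))

    # ASSUMPTION: both offsets are drawn from uniform distributions.
    f_off = rng.uniform(-MAX_FREQ_OFFSET_HZ, MAX_FREQ_OFFSET_HZ)
    phi = rng.uniform(0.0, 2.0 * np.pi)
    n = np.arange(signal.size)
    shifted = signal * np.exp(1j * (2.0 * np.pi * f_off * n / FS_HZ + phi))

    # Complex AWGN: total noise power 10^(-SNR/10), split equally between I and Q.
    noise_power = 10.0 ** (-snr_db / 10.0)
    noise = np.sqrt(noise_power / 2.0) * (
        rng.standard_normal(signal.size) + 1j * rng.standard_normal(signal.size)
    )
    return shifted + noise


# Usage: impair a 100 Hz test tone at 10 dB SNR.
rng = np.random.default_rng(0)
tone = np.exp(2j * np.pi * 100.0 * np.arange(2048) / FS_HZ)
noisy = impair(tone, snr_db=10, rng=rng)
```

Because the signal power is normalized to 1, the noise power alone sets the SNR, which keeps the mapping from the listed SNR values to noise variance simple.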

Example IQ signals of different types and at different SNRs (Gaussian noise)

Exemplary RF signals in the machine learning dataset

The RF signal dataset is available for download as a 2-D NumPy array with shape (172800, 2048).
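
Loading the array might look like the sketch below. The file name in the usage comment and the helper names `load_signals` and `time_axis_ms` are hypothetical, and the sketch assumes the download is a single `.npy` file holding complex samples; adjust it if the actual file differs.

```python
import numpy as np

SAMPLES_PER_VECTOR = 2048  # from the dataset description
FS_HZ = 6000.0             # from the dataset description


def load_signals(path, mmap=True):
    """Load the signal array and check that it has the documented layout.

    Hypothetical helper; assumes a .npy file with complex samples.
    """
    # Memory mapping avoids reading the whole array into RAM at once.
    data = np.load(path, mmap_mode="r" if mmap else None)
    if data.ndim != 2 or data.shape[1] != SAMPLES_PER_VECTOR:
        raise ValueError(f"unexpected shape {data.shape}")
    if not np.iscomplexobj(data):
        raise ValueError(f"expected complex IQ samples, got dtype {data.dtype}")
    return data


def time_axis_ms(n_samples=SAMPLES_PER_VECTOR, fs_hz=FS_HZ):
    """Return the sample times of one signal vector in milliseconds."""
    return np.arange(n_samples) / fs_hz * 1e3


# Usage (the file name is a placeholder, not the real download name):
# signals = load_signals("rf_dataset.npy")
# iq = np.asarray(signals[0])   # one vector of 2048 complex samples
# i, q = iq.real, iq.imag       # in-phase and quadrature components
```
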

Related Publications: