April 18, 2024

Practical RF Machine Learning for Signal Recognition

This article investigates how a deep neural network for RF signal classification performs in a real-world application. The approach uses synthetic data for training and then tests the trained networks against real-world data. The results show that deep learning can work well in practical RF machine learning applications.

This article is a shortened version of my paper

RF Signal Classification with Synthetic Training Data and its Real-World Performance (2022)

Motivation

Some years ago, deep learning for signal classification (and especially modulation recognition) began to attract considerable interest. This interest is driven by the revolution that deep learning brought to image processing and machine translation.

Deep learning for RF signals is often evaluated only in academic setups with well-defined synthetic test data, not with data from practical real-world applications. Training datasets often have biases that yield good results when the networks are evaluated with similar data, but cause failures when the trained networks face practical operation. This has raised doubts about whether deep learning can provide good accuracy in real-world applications.

In this article, I will demonstrate for a 20-class automatic RF signal classification problem that deep neural networks can perform well in a practical application if they are trained with “good” data.

Classification Task

The example task is RF signal classification, where the type or mode of a wireless signal is to be identified. In this setting, the system has to decide among the 20 RF signal types shown in the table below. The signal types include a large fraction of digitally modulated signals, such as radioteletype, Navtex, PSK modes, multiple FSK modes and multi-carrier modes. In addition, analog modulated signals, such as AM broadcasting, single-sideband (SSB) voice and HF fax, are included. These signals are commonly used by commercial, amateur and governmental operators across the shortwave band from 3 to 30 MHz.

| Signal Type             | Modulation        | Baud Rate | Users                      |
|-------------------------|-------------------|-----------|----------------------------|
| AM broadcast            | AM                | analog    | Broadcasting               |
| Morse Code              | OOK               | variable  | General                    |
| PSK31                   | PSK               | 31        | Amateur Radio              |
| PSK63                   | PSK               | 63        | Amateur Radio              |
| RTTY 45/170             | FSK, 170 Hz shift | 45        | Amateur Radio              |
| RTTY 50/450             | FSK, 450 Hz shift | 50        | Civil Service              |
| RTTY 75/170             | FSK, 170 Hz shift | 75        | Amateur Radio              |
| Navtex / Sitor-B        | FSK, 170 Hz shift | 100       | Commercial / Civil Service |
| Olivia 4/500            | 4-MFSK            | 125       | Amateur Radio              |
| Olivia 8/250            | 8-MFSK            | 31        | Amateur Radio              |
| Olivia 16/500           | 16-MFSK           | 31        | Amateur Radio              |
| Olivia 32/1000          | 32-MFSK           | 31        | Amateur Radio              |
| Contestia 16/250        | 16-MFSK           | 16        | Amateur Radio              |
| MFSK-16                 | 16-MFSK           | 16        | Amateur Radio              |
| MFSK-32                 | 16-MFSK           | 31        | Amateur Radio              |
| MFSK-64                 | 16-MFSK           | 63        | Amateur Radio              |
| MT63 / 500              | multi-carrier     | 5         | Amateur Radio              |
| Single-Sideband (upper) | USB               | analog    | General                    |
| Single-Sideband (lower) | LSB               | analog    | Amateur Radio              |
| Wefax/HF-Fax            | Radiofax          | analog    | Civil Service              |

Wireless Signal Types

Training Data

The training datasets contain RF signals from the 20 classes and have been created synthetically, i.e. using simulation software. A complete dataset contains 120,000 synthetically generated signals for training and another 30,000 for training validation. Each signal consists of 2048 complex IQ samples with a sampling frequency of 6 kHz. This results in a signal duration of approximately 340 ms.
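The numbers above can be checked with a few lines of arithmetic. The following is only a sketch: the balanced class split and the complex64 storage format are my assumptions for illustration and are not stated for the actual datasets.

```python
# Figures taken from the text above
NUM_SAMPLES = 2048          # complex IQ samples per signal
SAMPLE_RATE_HZ = 6000.0     # sampling frequency
NUM_TRAIN = 120_000         # training signals
NUM_VAL = 30_000            # validation signals
NUM_CLASSES = 20

# Duration of one signal in seconds
duration_s = NUM_SAMPLES / SAMPLE_RATE_HZ

# Assumption: classes are balanced (not stated in the text)
train_per_class = NUM_TRAIN // NUM_CLASSES

# Assumption: samples stored as complex64, i.e. 8 bytes per IQ sample
train_bytes = NUM_TRAIN * NUM_SAMPLES * 8

print(f"signal duration: {duration_s * 1e3:.0f} ms")
print(f"training signals per class (if balanced): {train_per_class}")
print(f"training set size (if complex64): {train_bytes / 1e9:.2f} GB")
```

With a duration of roughly a third of a second, a single training example covers only a few symbols of the slower modes in the table, which gives a feeling for how little signal the network sees per decision.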

A dataset consists of “clean” undistorted signals that are then distorted by simulation in order to obtain realistic signal waveforms. This distortion includes different impairments that are present in an RF transmission system: different channel conditions (noise and fading), offsets of carrier frequency and phase, offset of ADC/DAC sample frequency, band-limiting filters, etc. A good overview of the different impairments and how they can be included in a training dataset can be found in my article RF Training Data Generation for Machine Learning.

All the aforementioned impairments can be included in one dataset for training. However, in this article we want to gain more insight into how the different impairments in the training data influence the actual performance of a neural network in a practical application. Therefore, we now consider seven separate training datasets, each of which includes a different set of impairments:

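| Training Dataset Name       | Frequency Offset | Phase Offset | fs Offset | Noise | RX Filter | Fading |
|-----------------------------|------------------|--------------|-----------|-------|-----------|--------|
| Clean (no impairments)      |                  |              |           |       |           |        |
| + Frequency Offset          | x                |              |           |       |           |        |
| + Phase Offset              | x                | x            |           |       |           |        |
| + fs Offset                 | x                | x            | x         |       |           |        |
| + Noise                     | x                | x            | x         | x     |           |        |
| + Filter                    | x                | x            | x         | x     | x         |        |
| + Fading (all impairments)  | x                | x            | x         | x     | x         | x      |

Different Datasets used for Training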
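The cumulative structure of the seven datasets can be expressed compactly in code. This is a minimal sketch for illustration: the identifiers `IMPAIRMENTS`, `DATASET_NAMES` and `build_dataset_configs` are hypothetical names and not taken from the actual generation software.

```python
# Impairments in the order in which they are added (column order of the table)
IMPAIRMENTS = [
    "frequency_offset",
    "phase_offset",
    "fs_offset",
    "noise",
    "rx_filter",
    "fading",
]

# Dataset names in the row order of the table
DATASET_NAMES = [
    "clean",
    "+ frequency offset",
    "+ phase offset",
    "+ fs offset",
    "+ noise",
    "+ filter",
    "+ fading",
]


def build_dataset_configs():
    """Return one on/off flag per impairment for each of the seven datasets."""
    configs = {}
    for i, name in enumerate(DATASET_NAMES):
        enabled = set(IMPAIRMENTS[:i])  # dataset i enables the first i impairments
        configs[name] = {imp: imp in enabled for imp in IMPAIRMENTS}
    return configs


configs = build_dataset_configs()
for name, flags in configs.items():
    active = [imp for imp, on in flags.items() if on]
    print(f"{name:20s} {active}")
```

Because each dataset differs from the previous one by exactly one impairment, any change in accuracy between two neighboring networks can be attributed to that single impairment.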

The details of the impairments are:

  • Random frequency offset: ±250 Hz
  • Random phase offset: 0 to 360°
  • Random sample rate (fs) offset: ±1 %
  • Random noise: -15 to +25 dB SNR
  • Random fading: Watterson models according to ITU-R F.1487
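To make the impairments more concrete, here is a minimal NumPy sketch that applies three of them (frequency offset, phase offset and noise) to a clean IQ signal, using the ranges listed above. It is not the code used to generate the datasets: the function name `impair`, the uniform distributions and the SNR definition (signal power over noise power across the full sampled bandwidth) are assumptions made for illustration. The sample rate offset, the RX filter and the Watterson fading are left out for brevity.

```python
import numpy as np


def impair(iq, fs_hz=6000.0, rng=None):
    """Apply a random frequency offset, phase offset and AWGN to complex IQ data.

    Returns the impaired signal and the randomly drawn parameters.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = np.arange(iq.size)

    # Draw the random parameters (uniform distributions are an assumption)
    f_off_hz = rng.uniform(-250.0, 250.0)
    phase_rad = rng.uniform(0.0, 2.0 * np.pi)
    snr_db = rng.uniform(-15.0, 25.0)

    # Frequency and phase offset: multiply by a complex exponential
    shifted = iq * np.exp(1j * (2.0 * np.pi * f_off_hz * n / fs_hz + phase_rad))

    # Complex white Gaussian noise scaled to the drawn SNR
    # (assumption: SNR refers to the full sampled bandwidth)
    sig_power = np.mean(np.abs(shifted) ** 2)
    noise_power = sig_power / 10.0 ** (snr_db / 10.0)
    noise = np.sqrt(noise_power / 2.0) * (
        rng.standard_normal(iq.size) + 1j * rng.standard_normal(iq.size)
    )

    params = {"f_off_hz": f_off_hz, "phase_rad": phase_rad, "snr_db": snr_db}
    return shifted + noise, params


# Usage: a clean 500 Hz complex tone as a stand-in for a real clean signal
rng = np.random.default_rng(0)
clean = np.exp(2j * np.pi * 500.0 * np.arange(2048) / 6000.0)
noisy, params = impair(clean, rng=rng)
```

Returning the drawn parameters together with the signal is a convenient design choice, since the SNR label is needed later to plot accuracy over SNR.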

Each of the seven datasets is now used to train a neural network. This results in seven trained nets, each trained on signals with a different set of impairments. While the first net (called “clean”) can only learn clean signals, the net trained on the last and most impaired dataset (called “+ fading”) also learns to ignore many irrelevant signal components originating from distortion.

Network Architecture

The neural network architecture is a 9-layer CNN with the structure shown below. The input data is IQ, where the real and imaginary parts are represented as two channels.

[Figure: CNN for RF signal classification]
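The two-channel input representation can be sketched as follows. The helper name `iq_to_channels`, the channels-first layout and the float32 data type are assumptions for illustration; the actual framework and layout used for the network may differ.

```python
import numpy as np


def iq_to_channels(iq):
    """Convert complex IQ samples of length N into a real array of shape (2, N).

    Channel 0 holds the real part (I), channel 1 the imaginary part (Q).
    """
    iq = np.asarray(iq)
    return np.stack([iq.real, iq.imag], axis=0).astype(np.float32)


# Usage: one signal of 2048 complex samples becomes a (2, 2048) network input
rng = np.random.default_rng(0)
signal = rng.standard_normal(2048) + 1j * rng.standard_normal(2048)
x = iq_to_channels(signal)
print(x.shape, x.dtype)
```

Splitting the complex samples this way lets a standard real-valued CNN process the data without any complex-valued layers.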

Real-World Test Data

After training, each net is evaluated with real-world data captured “in the field”. This is in contrast to the usual approach, where a trained network is tested against a held-out subset of the dataset, called the validation set, which usually has a data distribution similar to that of the training data. Using real-world data makes it possible to assess the accuracy of a neural network in the practical RF machine learning application.

The data for real-world testing has been captured with the Twente WebSDR shortwave receiver over a period of several months. For each mode, numerous real reception records are available that originate from different transmitter hardware, transmitter locations, times of day and seasons. Accordingly, the data includes a high variability in channel conditions and received signal waveforms.

Results

The accuracies for the real-world test data are provided in the figure below. Each plot corresponds to one neural network trained on one of the different training datasets. The high accuracies achieved, around 95 %, demonstrate that deep learning can work well in a practical RF machine learning setup.

[Figure: Accuracy vs SNR for each of the training datasets with different sets of impairments, measured with real-world test data]
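An accuracy-over-SNR curve of this kind can be computed by grouping the test predictions into SNR bins. The following sketch assumes that an SNR value is available for every test record; how the SNR of the real-world recordings was determined is not covered here, and the function name `accuracy_vs_snr` and the bin edges are hypothetical.

```python
import numpy as np


def accuracy_vs_snr(y_true, y_pred, snr_db, bin_edges):
    """Return the classification accuracy within each SNR bin.

    Bins are half-open intervals [edge[i], edge[i+1]). Records outside all
    bins are ignored; empty bins yield NaN.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    snr_db = np.asarray(snr_db, dtype=float)
    correct = y_true == y_pred

    # Bin index per record: -1 is below the first edge, len(edges)-1 is above the last
    idx = np.digitize(snr_db, bin_edges) - 1

    acc = np.full(len(bin_edges) - 1, np.nan)
    for b in range(len(bin_edges) - 1):
        mask = idx == b
        if mask.any():
            acc[b] = correct[mask].mean()
    return acc


# Usage with a tiny made-up example (not real results)
y_true = [0, 1, 2, 3]
y_pred = [0, 1, 0, 3]
snr_db = [-8.0, -7.0, 3.0, 12.0]
acc = accuracy_vs_snr(y_true, y_pred, snr_db, bin_edges=[-10.0, 0.0, 10.0, 20.0])
print(acc)
```

Running this once per trained network and plotting the resulting arrays over the bin centers gives one curve per training dataset.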

The following main results can be drawn from the figure:

  • A very good accuracy of around 95 % can be achieved for real-world signals – if impairments are properly included in the training dataset.
  • Even for low SNR values of -10 to -5 dB, the accuracy is between 50 % and 80 %.
  • The most important impairments to include in the training data are frequency offset, noise and fading.

Further interesting observations:

  • The introduction of a random phase did not improve the accuracy.
  • The introduction of a small sample rate (fs) offset has only minor influence.
  • The inclusion of band-limiting filtering improves the accuracy in the low SNR region. This is because the filtering has the most effect on signals with a larger amount of noise, i.e. at low SNR.
  • Even for clean training data with no impairments, the network reaches a considerable accuracy of 30 to 50 % for SNR > 0 dB. This indicates that the CNN is able to generalize to some degree from clean to real-world signals even with simple training data.

