Builds a LoRa radio frequency fingerprint identification (RFFI) system based on deep learning techniques

This project builds a LoRa radio frequency fingerprint identification (RFFI) system based on deep learning techniques. A dataset containing signals collected from 60 LoRa devices is also provided. The collection settings for the different sub-datasets are detailed in the Dataset Introduction section. The Code Example section introduces the usage of some important functions. For more detailed usage, please read the code comments carefully.

Citation

If any part of the dataset or code contributes to your project, please cite:

[1] G. Shen, J. Zhang, A. Marshall, and J. Cavallaro, “Towards Scalable and Channel-Robust Radio Frequency Fingerprint Identification for LoRa,” IEEE Trans. Inf. Forensics Security, 2022.
@article{shen2021towards,
  title={Towards Scalable and Channel-Robust Radio Frequency Fingerprint Identification for LoRa},
  author={Shen, Guanxiong and Zhang, Junqing and Marshall, Alan and Cavallaro, Joseph},
  journal={arXiv preprint arXiv:2107.02867},
  year={2021}
}

Dataset Introduction

Experimental Devices

The experiments include 60 commercial off-the-shelf LoRa devices (LoPy4, mbed SX1261 shields, FiPy, Dragino SX1276 shields). The table below gives further details.

Device index | Model | Chipset
1 - 45 | Pycom LoPy4 | SX1276
46 - 50 | mbed SX1261 shield | SX1261
51 - 55 | Pycom FiPy | SX1272
56 - 60 | Dragino SX1276 shield | SX1276

All the LoRa packets are captured by a USRP N210 software-defined radio (SDR).

Dataset Structure

The dataset consists of 26 sub-datasets, each of which is an HDF5 file. Each HDF5 file contains a number of LoRa signals (IQ samples of the preamble part) and the corresponding device labels. As HDF5 does not support complex numbers, we concatenate the I-branch (real part) and Q-branch (imaginary part) of each signal before saving it. The figure below shows the structure of the raw HDF5 dataset.
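Because the I and Q branches are stored side by side, the complex signal has to be rebuilt after reading. The following is a minimal sketch of that idea, assuming each stored row holds the I-branch in its first half and the Q-branch in its second half. The exact layout inside the HDF5 files is an assumption here, and both function names are hypothetical; the provided 'LoadDataset' class remains the recommended reader.

```python
import numpy as np

def iq_concat_to_complex(raw):
    """Rebuild complex samples from rows laid out as [I..., Q...].

    Assumption: the first half of the last axis is the I-branch (real part)
    and the second half is the Q-branch (imaginary part).
    """
    raw = np.asarray(raw, dtype=np.float64)
    half = raw.shape[-1] // 2
    return raw[..., :half] + 1j * raw[..., half:]

def complex_to_iq_concat(sig):
    """Inverse operation: concatenate real and imaginary parts."""
    sig = np.asarray(sig)
    return np.concatenate([sig.real, sig.imag], axis=-1)

# Toy example: one "packet" with two complex samples.
raw_row = np.array([[1.0, 2.0, 3.0, 4.0]])   # I = [1, 2], Q = [3, 4]
packet = iq_concat_to_complex(raw_row)       # complex array of shape (1, 2)
```

Under this layout, a stored row of length 2N yields N complex samples.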

Training Datasets

The following table summarizes the basic information of each training dataset. All the training datasets were collected in a residential room with a line of sight (LOS) between the transmitter and receiver.

Training dataset path | Devices | Number of packets per device | Augmentation
Dataset/Train/dataset_training_aug.h5 | 1 - 30 | 1,000 | Yes, both multipath & Doppler
Dataset/Train/dataset_training_aug_0hz.h5 | 1 - 30 | 1,000 | Yes, only multipath ($f_d$ = 0 Hz)
Dataset/Train/dataset_training_no_aug.h5 | 1 - 30 | 500 | No

Test/Enrollment Datasets

The test/enrollment datasets were collected in a residential room, an office building and a meeting room. The floor plan is provided in the following figure:

The following table summarizes the basic information of each test/enrollment dataset.

Test dataset path | Devices | Number of packets per device | Collection environment
Dataset/Test/dataset_seen_devices.h5 | 1 - 30 | 400 | Residential room, LOS, stationary
Dataset/Test/dataset_rogue.h5 | 41 - 45 | 200 | Residential room, LOS, stationary
Dataset/Test/dataset_residential.h5 | 31 - 40 | 400 | Residential room, LOS, stationary
Dataset/Test/dataset_other_device_type.h5 | 46 - 60 | 400 | Residential room, LOS, stationary
Dataset/Test/channel_problem/A.h5 | 31 - 40 | 200 | Location A, LOS, stationary
Dataset/Test/channel_problem/B.h5 | 31 - 40 | 200 | Location B, LOS, stationary
Dataset/Test/channel_problem/C.h5 | 31 - 40 | 200 | Location C, LOS, stationary
Dataset/Test/channel_problem/D.h5 | 31 - 40 | 200 | Location D, NLOS, stationary
Dataset/Test/channel_problem/E.h5 | 31 - 40 | 200 | Location E, NLOS, stationary
Dataset/Test/channel_problem/F.h5 | 31 - 40 | 200 | Location F, NLOS, stationary
Dataset/Test/channel_problem/B_walk.h5 | 31 - 40 | 200 | Location B, LOS, object moving
Dataset/Test/channel_problem/F_walk.h5 | 31 - 40 | 200 | Location F, NLOS, object moving
Dataset/Test/channel_problem/moving_office.h5 | 31 - 40 | 200 | LOS, mobile in the office
Dataset/Test/channel_problem/moving_meeting_room.h5 | 31 - 40 | 200 | NLOS, mobile in the meeting room
Dataset/Test/channel_problem/B_antenna.h5 | 31 - 40 | 200 | Location B, LOS, stationary, parallel antenna
Dataset/Test/channel_problem/F_antenna.h5 | 31 - 40 | 200 | Location F, NLOS, stationary, parallel antenna

Code Example

1. Before Start

a) Install Required Packages

Install the required packages listed in the 'requirement.txt' file.

b) Download Dataset

Please download the dataset and put it in the project folder. The download link is https://ieee-dataport.org/open-access/lorarffidataset.

c) Operating System

This project was developed entirely on the Windows operating system. Unexpected issues may occur on other operating systems.

2. Quick Start

After installing the correct package versions and downloading the datasets, you can run the 'main.py' file directly for the RFF extractor training, rogue device detection and classification tasks. Change the variable 'run_for' in line 364 to specify which task to perform. For example, the program will train an RFF extractor and save it if you set 'run_for' to 'Train'.

3. Load Datasets

We recommend using the provided 'LoadDataset' class to load the raw HDF5 files. You need to specify the dataset path, device range, and packet range before running it. Below is an example of loading an HDF5 file:

import numpy as np
from dataset_preparation import LoadDataset

LoadDatasetObj = LoadDataset()
data, label = LoadDatasetObj.load_iq_samples(file_path = './dataset/Train/dataset_training_aug.h5', 
                                             dev_range = np.arange(30,40, dtype = int), 
                                             pkt_range= np.arange(0,100, dtype = int))

This example will extract ($10\times100=1000$) LoRa signals in total. More specifically, it will extract 100 packets from each device in range. The function 'load_iq_samples' returns two arrays, data and label. The data is a complex128 array of size (1000,8192), and label is an int32 array of size (1000,1). The figure below illustrates the structures of the two arrays.

Note that the loaded labels start from 0 rather than 1, which suits deep learning frameworks. In other words, device 1 is labelled 0, device 2 is labelled 1, and so forth.
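The offset between device numbers and labels is easy to get wrong when building 'dev_range'. As a small illustration (the helper name below is hypothetical and not part of the project), device numbers from the tables can be converted to zero-based indices like this:

```python
import numpy as np

def device_numbers_to_dev_range(first_device, last_device):
    """Hypothetical helper: map inclusive device numbers (1-based, as in the
    dataset tables) to the zero-based indices used as labels."""
    if first_device < 1 or last_device < first_device:
        raise ValueError("expected 1 <= first_device <= last_device")
    return np.arange(first_device - 1, last_device, dtype=int)

# Devices 31 - 40 correspond to zero-based labels 30 .. 39,
# i.e. the same values as np.arange(30, 40, dtype=int).
dev_range = device_numbers_to_dev_range(31, 40)
```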

4. Generate Channel Independent Spectrograms

The channel independent spectrogram helps mitigate the channel effects in the received signal and makes LoRa-RFFI systems more robust to channel variations. We provide functions to convert an array of IQ samples to channel independent spectrograms. The following code block gives an example:

from dataset_preparation import ChannelIndSpectrogram

ChannelIndSpectrogramObj = ChannelIndSpectrogram()
# The input 'data' is the loaded IQ samples in the last example.
ch_ind_spec = ChannelIndSpectrogramObj.channel_ind_spectrogram(data)

The returned 'ch_ind_spec' is an array of size (1000,102,62,1). Note that the size of the array is affected by the STFT parameters, which can be changed in code. Please refer to our paper or code comments to find the detailed derivation of channel independent spectrograms.
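To give an intuition for why such a representation can suppress channel effects, here is a rough NumPy sketch, not the project's implementation. It assumes that the spectrogram is formed by dividing each STFT column by the previous one, so that a channel response that stays constant across adjacent windows cancels in the ratio. The rectangular window, window length (256), hop (128) and retained frequency band are all assumptions, chosen only because they turn an 8192-sample input into the (102, 62) shape quoted above; the function name is hypothetical.

```python
import numpy as np

def channel_ind_spectrogram_sketch(sig, win_len=256, hop=128, keep=(0.3, 0.7)):
    """Illustrative sketch: log-power ratio of adjacent STFT columns.

    All parameters are assumptions, not values confirmed by the project.
    """
    sig = np.asarray(sig, dtype=np.complex128)
    n_frames = (len(sig) - win_len) // hop + 1
    # Index matrix that slices the signal into overlapping frames.
    idx = hop * np.arange(n_frames)[:, None] + np.arange(win_len)[None, :]
    frames = sig[idx]                                    # (n_frames, win_len)
    # STFT with a rectangular window; rows become frequency bins.
    spec = np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1).T
    # Divide each column by its predecessor: a common factor cancels out.
    ratio = spec[:, 1:] / spec[:, :-1]                   # (win_len, n_frames - 1)
    log_power = 10.0 * np.log10(np.abs(ratio) ** 2)
    # Keep only the central part of the frequency axis.
    lo, hi = round(win_len * keep[0]), round(win_len * keep[1])
    return log_power[lo:hi, :]

# Toy check with random samples standing in for a LoRa preamble.
rng = np.random.default_rng(0)
toy_sig = rng.standard_normal(8192) + 1j * rng.standard_normal(8192)
toy_spec = channel_ind_spectrogram_sketch(toy_sig)       # shape (102, 62)
```

In this sketch, multiplying the whole signal by a constant complex gain (a flat, static channel) leaves the output unchanged, which is the property the representation is meant to approximate for more realistic channels.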

5. Train an RFF Extractor

The function 'train_feature_extractor()' can train an RFF extractor using triplet loss.

import numpy as np
from deep_learning_models import TripletNet, identity_loss
from sklearn.model_selection import train_test_split
from keras.callbacks import EarlyStopping, ReduceLROnPlateau
from keras.optimizers import RMSprop

# Assumption: train_feature_extractor() is defined in 'main.py',
# so this snippet is meant to run in that context.
feature_extractor = train_feature_extractor()

You can also specify the training dataset path, training device range, training packet range and the SNR range used during augmentation. Otherwise, the default values are used. The following is an example:

feature_extractor = train_feature_extractor(file_path = './dataset/Train/dataset_training_aug.h5', 
                                            dev_range = np.arange(0,10, dtype = int), 
                                            pkt_range = np.arange(0,1000, dtype = int), 
                                            snr_range = np.arange(20,80))
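For readers unfamiliar with triplet loss, the sketch below shows the general form of the objective in plain NumPy. It is an illustration only: the squared Euclidean distance and the margin value are assumptions, and the project's 'TripletNet' may define the loss differently.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.1):
    """Generic triplet loss for batches of embeddings.

    anchor/positive come from the same device, negative from another one.
    The margin of 0.1 is an arbitrary illustrative value.
    """
    anchor, positive, negative = map(np.asarray, (anchor, positive, negative))
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)   # same-device distance
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)   # cross-device distance
    # Zero once the negative is farther than the positive by at least margin.
    return np.maximum(d_pos - d_neg + margin, 0.0)

# One well-separated triplet: the loss is zero.
loss = triplet_loss([[0.0, 0.0]], [[0.0, 0.1]], [[1.0, 0.0]])
```

Minimizing this quantity pulls fingerprints of the same device together and pushes fingerprints of different devices apart.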

6. Rogue Device Detection

The function 'test_rogue_device_detection()' performs the rogue device detection task. You MUST specify the RFF extractor path before running the function. See the example below:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import roc_curve, auc

fpr, tpr, roc_auc, eer = test_rogue_device_detection('./models/Extractor_1.h5')

This function returns the false positive rate (FPR), true positive rate (TPR), area under the curve (AUC) and equal error rate (EER). These are all important evaluation metrics in the rogue device detection task. Please refer to our paper for their definitions.

The following lines of code plot the ROC curve using the returned results:

import matplotlib.pyplot as plt

# Plot the ROC curves.
plt.figure(figsize=(4.8, 2.8))
plt.xlim(-0.01, 1.02)
plt.ylim(-0.01, 1.02)
plt.plot([0, 1], [0, 1], 'k--')
plt.plot(fpr, tpr, label='Extractor 1, AUC = ' 
         + str(round(roc_auc,3)) + ', EER = ' + str(round(eer,3)), color='r')
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.title('ROC curve')
plt.legend(loc=4)
# plt.savefig('roc_curve.pdf',bbox_inches='tight')
plt.show()    
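As a rough guide to how the AUC and EER relate to the ROC points, the sketch below derives both from arrays of FPR and TPR values. It is a generic approximation and not necessarily how 'test_rogue_device_detection()' computes them: the AUC uses the trapezoidal rule, and the EER is taken at the ROC point where the FPR and the false negative rate (1 - TPR) are closest. The function name is hypothetical.

```python
import numpy as np

def auc_and_eer(fpr, tpr):
    """Approximate AUC and EER from ROC points sorted by increasing FPR."""
    fpr = np.asarray(fpr, dtype=float)
    tpr = np.asarray(tpr, dtype=float)
    # Trapezoidal area under the (fpr, tpr) curve.
    auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
    # EER: operating point where FPR and FNR are (nearly) equal.
    fnr = 1.0 - tpr
    idx = int(np.argmin(np.abs(fnr - fpr)))
    eer = float((fpr[idx] + fnr[idx]) / 2.0)
    return auc, eer

# A detector no better than chance lies on the diagonal of the ROC plot.
chance_auc, chance_eer = auc_and_eer([0.0, 0.5, 1.0], [0.0, 0.5, 1.0])
```

With only a few ROC points the EER estimate is coarse; interpolating between points would refine it.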

7. Classification

The function 'test_classification()' performs the classification task. You MUST specify the paths of enrollment dataset, test dataset and RFF extractor before running the function. Here is a simple example:

from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier
import numpy as np

pred_label, true_label, acc = test_classification(file_path_enrol = 
                                                  './dataset/Test/dataset_residential.h5',
                                                  file_path_clf = 
                                                  './dataset/Test/channel_problem/A.h5',
                                                  feature_extractor_name = 
                                                  './models/Extractor_1.h5')

This example returns predicted labels, true labels and the overall classification accuracy. We can further plot a confusion matrix to see fine-grained classification results:

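import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Assumption: zero-based indices of the devices in the test dataset
# (devices 31 - 40 for './dataset/Test/channel_problem/A.h5').
test_dev_range = np.arange(30, 40, dtype=int)

# Plot the confusion matrix.
conf_mat = confusion_matrix(true_label, pred_label)
classes = test_dev_range + 1 # xticklabels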

plt.figure()
sns.heatmap(conf_mat, annot=True, 
            fmt = 'd', cmap='Blues',
            cbar = False,
            xticklabels=classes, 
            yticklabels=classes)
plt.xlabel('Predicted label', fontsize = 20)
plt.ylabel('True label', fontsize = 20)
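The classification task in this section relies on an enrollment set and a test set. The sketch below illustrates that flow with a small k-nearest-neighbour classifier written in plain NumPy, used here as a stand-in for the 'KNeighborsClassifier' imported in the example. The feature vectors are toy values rather than real RFFs, and the function name, distance metric and value of k are assumptions.

```python
import numpy as np

def knn_classify(enrol_feat, enrol_label, test_feat, k=3):
    """Label each test feature by majority vote among its k nearest
    enrolled features (Euclidean distance)."""
    enrol_feat = np.asarray(enrol_feat, dtype=float)
    test_feat = np.asarray(test_feat, dtype=float)
    enrol_label = np.asarray(enrol_label, dtype=int).ravel()
    # Pairwise distances, shape (n_test, n_enrol).
    diff = test_feat[:, None, :] - enrol_feat[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    votes = enrol_label[nearest]                         # (n_test, k)
    return np.array([np.bincount(v).argmax() for v in votes])

# Toy enrollment set: two devices with two fingerprints each.
enrol_feat = [[0.0, 0.0], [0.0, 0.1], [1.0, 1.0], [1.0, 1.1]]
enrol_label = [0, 0, 1, 1]
test_feat = [[0.05, 0.0], [0.9, 1.0]]
true_toy = np.array([0, 1])

pred_toy = knn_classify(enrol_feat, enrol_label, test_feat, k=3)
acc_toy = float(np.mean(pred_toy == true_toy))           # overall accuracy
```

In the real pipeline the features would come from the trained RFF extractor applied to the enrollment and test datasets.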

License

The dataset and code are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Contact

Please contact the following email addresses if you have any questions:
[email protected]
[email protected]
