Fuse radar and camera for detection

Related tags

Deep LearningSAF-FCOS
Overview

SAF-FCOS: Spatial Attention Fusion for Obstacle Detection using MmWave Radar and Vision Sensor

This project hosts the code for implementing the SAF-FCOS algorithm for object detection, as presented in our paper:

SAF-FCOS: Spatial Attention Fusion for Obstacle Detection using MmWave Radar and Vision Sensor;
Shuo Chang, YiFan Zhang, Fan Zhang, Xiaotong Zhao, Sai Huang, ZhiYong Feng and Zhiqing Wei;
In: Sensors, 2019.

And the whole project is built upon FCOS, Below is FCOS license.

FCOS for non-commercial purposes

Copyright (c) 2019 the authors
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

The full paper is available at: https://www.mdpi.com/1424-8220/20/4/956.

You should known

Please read the FCOS project first FCOS-README.md

Installation

Please check INSTALL.md for installation instructions.

Generate Data

  1. Please download Full dataset (v1.0) of nuScenes dataset from the link. download

  2. Then, upload all download tar files to an ubuntu server, and uncompress all *.tar files in a specific folder:

mkdir ~/Data/nuScenes
mv AllDownloadTarFiles ~/Data/nuScenes
cd ~/Data/nuScenes
for f in *.tar; do tar -xvf "$f"; done
  1. Convert the radar pcd file as image:
python tools/nuscenes/convert_radar_point_to_image.py --dataroot ~/Data/nuScenes --version v1.0-mini
python tools/nuscenes/convert_radar_point_to_image.py --dataroot ~/Data/nuScenes --version v1.0-trainval
python tools/nuscenes/convert_radar_point_to_image.py --dataroot ~/Data/nuScenes --version v1.0-test
  1. Calculate the norm info of radar images:
python tools/nuscenes/extract_pc_image_norm_info_from_image.py --datadir ~/Data/nuScenes --outdir ~/Data/nuScenes/v1.0-trainval
  1. Generate 2D detections results for nuScenes CAM_FRONT images by 'FCOS_imprv_dcnv2_X_101_64x4d_FPN_2x.pth',
    some of detection results should be refined by labelers to get tighter bboxes,
    and save the detection results as txt file in the folder ~/Data/nuScenes/fcos/CAM_FRONT:
    detection1 detection2 The detection results are saved as '0, 1479.519, 611.043, 1598.754, 849.447'. The first column is category, and the last stands for position.
    For convenience, we supply our generated 2D txt files in cloud drive and in folder data/fcos.zip.
    For users not in China, please download from google drive.
    For users in China, please download from baidu drive.

    链接:https://pan.baidu.com/s/11NNYpmBbs5sSqSsFxl-z7Q 
    提取码:6f1x 

    If you use our generated txt files, please:

mv fcos.zip ~/Data/nuScenes
unzip fcos.zip
  1. Generate 2D annotations in coco style for model training and test:
python tools/nuscenes/generate_2d_annotations_by_fcos.py --datadir ~/Data/nuScenes --outdir ~/Data/nuScenes/v1.0-trainval

Prepare training

The following command line will train fcos_imprv_R_101_FPN_1x_ATTMIX_135_Circle_07.yaml on 8 GPUs with Synchronous Stochastic Gradient Descent (SGD):

python -m torch.distributed.launch \
       --nproc_per_node=8 \
       --master_port=$((RANDOM + 10000)) \
       tools/train_net.py \
       --config-file configs/fcos_nuscenes/fcos_imprv_R_101_FPN_1x_ATTMIX_135_Circle_07.yaml \
       DATALOADER.NUM_WORKERS 2 \
       OUTPUT_DIR tmp/fcos_imprv_R_50_FPN_1x

Prepare Test

The following command line will test fcos_imprv_R_101_FPN_1x_ATTMIX_135_Circle_07.yaml on 8 GPUs:

python -m torch.distributed.launch \
       --nproc_per_node=8  
       --master_port=$((RANDOM + 10000)) \
       tools/test_epoch.py \
       --config-file configs/fcos_nuscenes/fcos_imprv_R_101_FPN_1x_ATTMIX_135_Circle_07.yaml \
       --checkpoint-file tmp/fcos_imprv_R_50_FPN_1x_ATTMIX_135_Circle_07/model_0010000.pth \ 
       OUTPUT_DIR tmp/fcos_imprv_R_101_FPN_1x_ATTMIX_135_Circle_07

Citations

Please consider citing our paper and FOCS in your publications if the project helps your research. BibTeX reference is as follows.

@article{chang2020spatial,
  title={Spatial Attention fusion for obstacle detection using mmwave radar and vision sensor},
  author={Chang, Shuo and Zhang, Yifan and Zhang, Fan and Zhao, Xiaotong and Huang, Sai and Feng, Zhiyong and Wei, Zhiqing},
  journal={Sensors},
  volume={20},
  number={4},
  pages={956},
  year={2020},
  publisher={Multidisciplinary Digital Publishing Institute}
}
@inproceedings{tian2019fcos,
  title   =  {{FCOS}: Fully Convolutional One-Stage Object Detection},
  author  =  {Tian, Zhi and Shen, Chunhua and Chen, Hao and He, Tong},
  booktitle =  {Proc. Int. Conf. Computer Vision (ICCV)},
  year    =  {2019}
}
Owner
ChangShuo
Machine learning. Visual Object Tracking. Signal Processing. Multi-Sensor Fusion
ChangShuo
FairFuzz: AFL extension targeting rare branches

FairFuzz An AFL extension to increase code coverage by targeting rare branches. FairFuzz has a particular advantage on programs with highly nested str

Caroline Lemieux 222 Nov 16, 2022
A Fast and Stable GAN for Small and High Resolution Imagesets - pytorch

A Fast and Stable GAN for Small and High Resolution Imagesets - pytorch The official pytorch implementation of the paper "Towards Faster and Stabilize

Bingchen Liu 455 Jan 08, 2023
LRBoost is a scikit-learn compatible approach to performing linear residual based stacking/boosting.

LRBoost is a sckit-learn compatible package for linear residual boosting. LRBoost combines a linear estimator and a non-linear estimator to leverage t

Andrew Patton 5 Nov 23, 2022
EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction

EquiBind: geometric deep learning for fast predictions of the 3D structure in which a small molecule binds to a protein

Hannes Stärk 355 Jan 03, 2023
Code for paper entitled "Improving Novelty Detection using the Reconstructions of Nearest Neighbours"

NLN: Nearest-Latent-Neighbours A repository containing the implementation of the paper entitled Improving Novelty Detection using the Reconstructions

Michael (Misha) Mesarcik 4 Dec 14, 2022
Face Detection and Alignment using Multi-task Cascaded Convolutional Networks (MTCNN)

Face-Detection-with-MTCNN Face detection is a computer vision problem that involves finding faces in photos. It is a trivial problem for humans to sol

Chetan Hirapara 3 Oct 07, 2022
SPTAG: A library for fast approximate nearest neighbor search

SPTAG: A library for fast approximate nearest neighbor search SPTAG SPTAG (Space Partition Tree And Graph) is a library for large scale vector approxi

Microsoft 4.3k Jan 01, 2023
A toolkit for developing and comparing reinforcement learning algorithms.

Status: Maintenance (expect bug fixes and minor updates) OpenAI Gym OpenAI Gym is a toolkit for developing and comparing reinforcement learning algori

OpenAI 29.6k Jan 08, 2023
Official repository for the paper, MidiBERT-Piano: Large-scale Pre-training for Symbolic Music Understanding.

MidiBERT-Piano Authors: Yi-Hui (Sophia) Chou, I-Chun (Bronwin) Chen Introduction This is the official repository for the paper, MidiBERT-Piano: Large-

137 Dec 15, 2022
Sound Source Localization for AI Grand Challenge 2021

Sound-Source-Localization Sound Source Localization study for AI Grand Challenge 2021 (sponsored by NC Soft Vision Lab) Preparation 1. Place the data-

sanghoon 19 Mar 29, 2022
Dynamic wallpaper generator.

Wiki • About • Installation About This project is a dynamic wallpaper changer. It waits untill you turn on the music, downloads album cover if it's po

3 Sep 18, 2021
A toy project using OpenCV and PyMunk

A toy project using OpenCV, PyMunk and Mediapipe the source code for my LindkedIn post It's just a toy project and I didn't write a documentation yet,

Amirabbas Asadi 82 Oct 28, 2022
Self-Regulated Learning for Egocentric Video Activity Anticipation

Self-Regulated Learning for Egocentric Video Activity Anticipation Introduction This is a Pytorch implementation of the model described in our paper:

qzhb 13 Sep 23, 2022
Semantic Image Synthesis with SPADE

Semantic Image Synthesis with SPADE New implementation available at imaginaire repository We have a reimplementation of the SPADE method that is more

NVIDIA Research Projects 7.3k Jan 07, 2023
This is an (re-)implementation of DeepLab-ResNet in TensorFlow for semantic image segmentation on the PASCAL VOC dataset.

DeepLab-ResNet-TensorFlow This is an (re-)implementation of DeepLab-ResNet in TensorFlow for semantic image segmentation on the PASCAL VOC dataset. Up

19 Jan 16, 2022
Vision transformers (ViTs) have found only limited practical use in processing images

CXV Convolutional Xformers for Vision Vision transformers (ViTs) have found only limited practical use in processing images, in spite of their state-o

Cloudwalker 23 Sep 10, 2022
ChatBot-Pytorch - A GPT-2 ChatBot implemented using Pytorch and Huggingface-transformers

ChatBot-Pytorch A GPT-2 ChatBot implemented using Pytorch and Huggingface-transf

ParZival 42 Dec 09, 2022
SCAAML is a deep learning framwork dedicated to side-channel attacks run on top of TensorFlow 2.x.

SCAAML (Side Channel Attacks Assisted with Machine Learning) is a deep learning framwork dedicated to side-channel attacks. It is written in python and run on top of TensorFlow 2.x.

Google 69 Dec 21, 2022
Multi-Task Learning as a Bargaining Game

Nash-MTL Official implementation of "Multi-Task Learning as a Bargaining Game". Setup environment conda create -n nashmtl python=3.9.7 conda activate

Aviv Navon 87 Dec 26, 2022
Re-implememtation of MAE (Masked Autoencoders Are Scalable Vision Learners) using PyTorch.

mae-repo PyTorch re-implememtation of "masked autoencoders are scalable vision learners". In this repo, it heavily borrows codes from codebase https:/

Peng Qiao 1 Dec 14, 2021