Sparse R-CNN: End-to-End Object Detection with Learnable Proposals, CVPR2021

Last update: Dec 27, 2022

Overview

Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

Paper (CVPR 2021): Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

Updates

(02/03/2021) Higher performance is reported using a stronger backbone, PVT.
(23/02/2021) Higher performance is reported using a stronger pre-trained model, DetCo.
(02/12/2020) Models and logs (R101_100pro_3x and R101_300pro_3x) are available.
(26/11/2020) Models and logs (R50_100pro_3x and R50_300pro_3x) are available.
(26/11/2020) Higher performance for Sparse R-CNN is reported by setting the dropout rate to 0.0.

Models

Method	inf_time	train_time	box AP	download
R50_100pro_3x	23 FPS	19h	42.8	model | log
R50_300pro_3x	22 FPS	24h	45.0	model | log
R101_100pro_3x	19 FPS	25h	44.1	model | log
R101_300pro_3x	18 FPS	29h	46.4	model | log

Models and logs are also available on Baidu Drive with the code wt9n.

Notes

We observe a noise of about 0.3 AP.
Training time is measured on 8 GPUs with a batch size of 16. Inference time is measured on a single GPU. All GPUs are NVIDIA V100.
We use models pre-trained on ImageNet with torchvision, and we provide torchvision's ResNet-101.pkl model. More details can be found in the conversion script.
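
To give a rough idea of what converting a torchvision checkpoint involves, the sketch below renames ResNet parameter keys from torchvision's naming to a Detectron2-style naming. It is a minimal illustration only: the renaming rules are an assumption modeled on Detectron2's torchvision conversion tool, the function names `torchvision_key_to_d2` and `rename_state_dict` are hypothetical, and the repository's own conversion script remains the authoritative reference.

```python
def torchvision_key_to_d2(key: str) -> str:
    """Map one torchvision ResNet parameter name to a Detectron2-style name.

    Assumed rules (not taken from this repository):
      - parameters outside the residual stages go under "stem."
      - layer1..layer4 become res2..res5
      - bnN becomes convN.norm
      - downsample.0 / downsample.1 become shortcut / shortcut.norm
    """
    if "layer" not in key:
        key = "stem." + key
    for t in (1, 2, 3, 4):
        key = key.replace(f"layer{t}", f"res{t + 1}")
    for t in (1, 2, 3):
        key = key.replace(f"bn{t}", f"conv{t}.norm")
    key = key.replace("downsample.0", "shortcut")
    key = key.replace("downsample.1", "shortcut.norm")
    return key


def rename_state_dict(state_dict: dict) -> dict:
    """Return a new dict with every key renamed; values are left untouched."""
    return {torchvision_key_to_d2(k): v for k, v in state_dict.items()}
```

In practice the renamed weights would then be saved in the pickle format that the training code loads; that step is omitted here because its details depend on the conversion script.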

Method	inf_time	train_time	box AP	codebase
R50_300pro_3x	22 FPS	24h	45.0	detectron2
R50_300pro_3x.detco	22 FPS	28h	46.5	detectron2
PVTSmall_300pro_3x	13 FPS	50h	45.7	mmdetection
PVTv2-b2_300pro_3x	11 FPS	76h	50.1	mmdetection

Installation

The codebase is built on top of Detectron2 and DETR.

Requirements

Linux or macOS with Python ≥ 3.6
PyTorch ≥ 1.5 and a torchvision version that matches the PyTorch installation. Installing both together from pytorch.org ensures that they match.
OpenCV is optional; it is needed only for the demo and visualization.

Steps

Install and build libs

git clone https://github.com/PeizeSun/SparseR-CNN.git
cd SparseR-CNN
python setup.py build develop

Link the COCO dataset path to SparseR-CNN/datasets/coco

mkdir -p datasets/coco
ln -s /path_to_coco_dataset/annotations datasets/coco/annotations
ln -s /path_to_coco_dataset/train2017 datasets/coco/train2017
ln -s /path_to_coco_dataset/val2017 datasets/coco/val2017
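
Before training, it can help to confirm that the three links above resolve. The following is a small sketch, not part of the repository: the helper name `missing_coco_paths` is hypothetical, and it checks only that the three entries exist under the given root, not that their contents are complete.

```python
from pathlib import Path

# Sub-directories that the symlink step above is expected to create.
EXPECTED_SUBDIRS = ("annotations", "train2017", "val2017")


def missing_coco_paths(root="datasets/coco"):
    """Return the expected sub-directories that do not exist under root.

    Path.exists() follows symlinks, so a dangling link is reported as missing.
    """
    root = Path(root)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).exists()]
```

Calling `missing_coco_paths()` from the SparseR-CNN directory returns an empty list when all three links are in place.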

Train SparseR-CNN

python projects/SparseRCNN/train_net.py --num-gpus 8 \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml

Evaluate SparseR-CNN

python projects/SparseRCNN/train_net.py --num-gpus 8 \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml \
    --eval-only MODEL.WEIGHTS path/to/model.pth

Visualize SparseR-CNN

python demo/demo.py \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml \
    --input path/to/images --output path/to/save_images --confidence-threshold 0.4 \
    --opts MODEL.WEIGHTS path/to/model.pth
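
The `--confidence-threshold 0.4` flag controls which detections are drawn. As a conceptual sketch only, the filtering can be pictured as below. The dictionary layout with a `score` field and the function name `filter_by_confidence` are hypothetical, and whether the demo treats the boundary value as inclusive is an assumption.

```python
def filter_by_confidence(detections, threshold=0.4):
    """Keep detections whose score reaches the threshold.

    `detections` is assumed to be a list of dicts with a float "score" entry;
    this layout is illustrative and is not the demo's internal representation.
    """
    return [d for d in detections if d["score"] >= threshold]


# Example with made-up detections.
example = [
    {"label": "person", "score": 0.92},
    {"label": "dog", "score": 0.35},
    {"label": "bicycle", "score": 0.55},
]
kept = filter_by_confidence(example, threshold=0.4)
```

Raising the threshold shows fewer, more confident boxes; lowering it shows more boxes at the cost of more false positives.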

Third-party resources

mmdetection implementation: sparse_rcnn. Thanks to Shilong Zhang!
cvpod implementation: sparse_rcnn. Thanks to Benjin Zhu!
paddledetection implementation: sparse_rcnn. Thanks to FL77N!

License

SparseR-CNN is released under the MIT License.

Citing

If you use SparseR-CNN in your research or wish to refer to the baseline results published here, please use the following BibTeX entries:

@article{peize2020sparse,
  title   =  {{SparseR-CNN}: End-to-End Object Detection with Learnable Proposals},
  author  =  {Peize Sun and Rufeng Zhang and Yi Jiang and Tao Kong and Chenfeng Xu and Wei Zhan and Masayoshi Tomizuka and Lei Li and Zehuan Yuan and Changhu Wang and Ping Luo},
  journal =  {arXiv preprint arXiv:2011.12450},
  year    =  {2020}
}

Owner

Peize Sun
