Official PyTorch Code of GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection (CVPR 2021)

Last update: Jan 02, 2023

Overview

GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection, CVPR 2021

Abhinav Kumar, Garrick Brazil, Xiaoming Liu

[project], [supp], [slides], [1min_talk], demo, arxiv

This code is based on Kinematic-3D, so the setup and organization are very similar. A few components, such as the classical NMS, are based on Caffe.

References

Please cite the following paper if you find this repository useful:

@inproceedings{kumar2021groomed,
  title={{GrooMeD-NMS}: Grouped Mathematically Differentiable NMS for Monocular {$3$D} Object Detection},
  author={Kumar, Abhinav and Brazil, Garrick and Liu, Xiaoming},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}

Setup

Requirements
1. Python 3.6
2. PyTorch 0.4.1
3. Torchvision 0.2.1
4. CUDA 8.0
5. Ubuntu 18.04/Debian 8.9

This code was tested on an NVIDIA 1080 Ti GPU. Other platforms have not been tested. Unless otherwise stated, the scripts and instructions below assume that the working directory is the project root.
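Before building anything, it can help to confirm that the active environment matches the versions listed above. The following is a minimal sketch, not part of the repository: the helper names `version_matches` and `check_environment` are assumptions introduced here, and the check only compares version prefixes.

```python
import sys
from importlib import import_module

# Versions taken from the requirements list above.
EXPECTED = {"python": "3.6", "torch": "0.4.1", "torchvision": "0.2.1"}


def version_matches(found, expected):
    """Return True if the dotted version `found` starts with `expected`.

    A local build suffix such as "+cu80" is ignored.
    """
    found_parts = found.split("+")[0].split(".")
    expected_parts = expected.split(".")
    return found_parts[:len(expected_parts)] == expected_parts


def check_environment():
    """Map each requirement to (found_version, ok); missing packages give (None, False)."""
    report = {}
    py = "{}.{}".format(*sys.version_info[:2])
    report["python"] = (py, version_matches(py, EXPECTED["python"]))
    for name in ("torch", "torchvision"):
        try:
            found = import_module(name).__version__
        except ImportError:
            report[name] = (None, False)
            continue
        report[name] = (found, version_matches(found, EXPECTED[name]))
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_environment().items():
        status = "OK" if ok else "MISMATCH"
        print("{:12s} found={} expected={} {}".format(name, found, EXPECTED[name], status))
```

A mismatch does not necessarily mean the code will fail, only that the setup differs from the tested one.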

Clone the repo first:
```
git clone https://github.com/abhi1kumar/groomed_nms.git
```

CUDA & Python

Install some basic packages:

sudo apt-get install libopenblas-dev libboost-dev libboost-all-dev git
sudo apt install gfortran

# CUDA 8.0 needs an older gcc and g++ (version 5), so link them into the CUDA bin directory
sudo apt install gcc-5 g++-5
sudo ln -sf /usr/bin/gcc-5 /usr/local/cuda-8.0/bin/gcc
sudo ln -sf /usr/bin/g++-5 /usr/local/cuda-8.0/bin/g++

Next, install conda and then install the required packages:

wget https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh
bash Anaconda3-2020.02-Linux-x86_64.sh
source ~/.bashrc
conda list
conda create --name py36 --file dependencies/conda.txt
conda activate py36

KITTI Data

Download the following parts of the full KITTI 3D Object Detection dataset:

left color images of object data set (12 GB)
camera calibration matrices of object data set (16 MB)
training labels of object data set (5 MB)

Then place a soft-link (or the actual data) in data/kitti:

 ln -s /path/to/kitti data/kitti

The directory structure should look like this:

./groomed_nms
|--- cuda_env
|--- data
|      |---kitti
|            |---training
|            |        |---calib
|            |        |---image_2
|            |        |---label_2
|            |
|            |---testing
|                     |---calib
|                     |---image_2
|
|--- dependencies
|--- lib
|--- models
|--- scripts
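A quick way to confirm that the soft-link resolves to the layout shown above is to check for each expected sub-folder. This is a small sketch under the assumption that it is run from the project root; `missing_kitti_dirs` is a hypothetical helper, not a script shipped with the repository.

```python
from pathlib import Path

# Sub-folders shown in the tree above, relative to data/kitti.
EXPECTED_DIRS = [
    "training/calib",
    "training/image_2",
    "training/label_2",
    "testing/calib",
    "testing/image_2",
]


def missing_kitti_dirs(kitti_root="data/kitti"):
    """Return the expected sub-folders that do not exist under `kitti_root`.

    Path.is_dir() follows symbolic links, so a soft-linked data/kitti works too.
    """
    root = Path(kitti_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_kitti_dirs()
    print("missing:", missing if missing else "none")
```

An empty result means every folder in the tree was found.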

Then, use the following scripts to extract the data splits, which use soft-links to the above directory for efficient storage:

python data/kitti_split1/setup_split.py
python data/kitti_split2/setup_split.py
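To illustrate why soft-links keep the splits cheap to store, the sketch below links a subset of files into a split folder instead of copying them. It is only an illustration of the idea: `link_split`, its arguments, and the file naming are assumptions made here and are not taken from `setup_split.py`.

```python
import os
from pathlib import Path


def link_split(src_dir, dst_dir, ids, ext):
    """Create dst_dir/<id>.<ext> as a symlink to src_dir/<id>.<ext> for each id.

    Hypothetical helper: only links are written, so no image data is duplicated.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    for sample_id in ids:
        name = "{}.{}".format(sample_id, ext)
        src = (src_dir / name).resolve()
        dst = dst_dir / name
        # lexists is True even for a broken link, so reruns do not fail.
        if not os.path.lexists(str(dst)):
            os.symlink(str(src), str(dst))
```

For example, `link_split("data/kitti/training/image_2", "some/split/image_2", ["000003"], "png")` would create a single link; both paths and the id are placeholders.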

Next, build the KITTI devkit eval:

 sh data/kitti_split1/devkit/cpp/build.sh

Classical NMS

Lastly, build the classical NMS modules:
```
cd lib/nms
make
cd ../..
```
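For readers unfamiliar with classical NMS, the following pure-Python sketch shows the usual greedy procedure on 2D boxes: keep the highest-scoring box, drop boxes that overlap it too much, and repeat. It is not the compiled module built in `lib/nms`; the function names, the box format `(x1, y1, x2, y2)`, and the default threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def classical_nms(boxes, scores, iou_threshold=0.4):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Suppress every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    print(classical_nms(boxes, scores))
```

The hard selection of the top-scoring box and the thresholded suppression are the non-differentiable steps of this procedure.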

Training

Training is carried out in two stages: a warmup stage and a full training stage. Review the configurations in scripts/config for details.

chmod +x scripts_training.sh
./scripts_training.sh

If training is interrupted, you can resume from a saved snapshot by passing the restore flag with the iteration of that snapshot. For example, to resume training from iteration 10k, use the following command:

source dependencies/cuda_8.0_env
CUDA_VISIBLE_DEVICES=0 python -u scripts/train_rpn_3d.py --config=groumd_nms --restore=10000

Testing

Logs, models, and predictions for the main experiments on the KITTI Val 1, Val 2, and Test data splits are available to download here.

Make an output folder in the project directory:

mkdir output

Place different models in the output folder as follows:

./groomed_nms
|--- output
|      |---groumd_nms
|      |
|      |---groumd_nms_split2
|      |
|      |---groumd_nms_full_train_2
|
| ...

To test, make the evaluation script executable and run it:

chmod +x scripts_evaluation.sh
./scripts_evaluation.sh

Contact

For questions, feel free to post here or send an email to this address: [email protected]


Comments

Is there any difference between groom-nms and penalizing the highest-confidence proposal using GT directly?

Hi, thanks for your great work. However, I am confused about the motivation of this algorithm. If we want training and testing to be consistent, we could simply penalize the highest-confidence proposal in the training pipeline, which seems to achieve a similar result. So, is there any difference between groom-nms and penalizing the highest-confidence proposal using GT directly?

opened by kaixinbear 3

Problem in test

Hi, this is an exciting work. I have a question about testing with the pre-trained model: I can't find "Kinematic3D-Release/val1_kinematic/model_final".

opened by chenH20000109 1

Releases(v0.1)

v0.1(Mar 30, 2021)

First Release of GrooMeD-NMS

Owner

Abhinav Kumar
