Improving 3D Object Detection with Channel-wise Transformer

Last update: Dec 20, 2022

Overview

Thanks to OpenPCDet: this implementation of CT3D is mainly based on pcdet v0.3. Our paper can be downloaded here: ICCV 2021.

Overview of CT3D. The raw points are first fed into the RPN to generate 3D proposals. The raw points, together with their corresponding proposals, are then processed by the channel-wise Transformer, which consists of a proposal-to-point encoding module and a channel-wise decoding module. Specifically, the proposal-to-point encoding module modulates each point feature with global proposal-aware context information. The channel-wise decoding module then transforms the encoded point features into an effective proposal feature representation for confidence prediction and box regression.
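As a rough illustration of what a "proposal-aware" point feature could look like, the sketch below encodes each raw point by its offsets to a proposal's center and eight corners, followed by its reflectance. This particular encoding, the box parameterisation (cx, cy, cz, l, w, h, yaw), and the function names box_corners and encode_points are assumptions made for illustration; they are not taken from this README, and the modules in the repository may differ.

```python
import math

def box_corners(box):
    """Return the 8 corners of a 3D box given as (cx, cy, cz, l, w, h, yaw)."""
    cx, cy, cz, l, w, h, yaw = box
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    corners = []
    for sx in (-0.5, 0.5):
        for sy in (-0.5, 0.5):
            for sz in (-0.5, 0.5):
                # Offset in the box frame, then rotate about the z-axis.
                dx, dy, dz = sx * l, sy * w, sz * h
                corners.append((cx + dx * cos_y - dy * sin_y,
                                cy + dx * sin_y + dy * cos_y,
                                cz + dz))
    return corners

def encode_points(points, box):
    """Encode each (x, y, z, reflectance) point relative to one proposal.

    Each output row holds the offsets to the box center and its 8 corners
    (9 * 3 values) followed by the point's reflectance: 28 values in total.
    """
    keypoints = [tuple(box[:3])] + box_corners(box)
    encoded = []
    for x, y, z, refl in points:
        row = []
        for kx, ky, kz in keypoints:
            row.extend((x - kx, y - ky, z - kz))
        row.append(refl)
        encoded.append(row)
    return encoded

# Hypothetical proposal and a single point, for illustration only.
proposal = (0.0, 0.0, 0.0, 4.0, 2.0, 1.5, 0.0)
features = encode_points([(1.0, 0.5, 0.2, 0.3)], proposal)
print(len(features[0]))  # 28
```

In a learned model these per-point vectors would typically be passed through a linear layer before the Transformer; that step is omitted here.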

	AP@R11	AP@R40	Download
Only Car	86.06	85.79	model-car
3-Category (Car)	85.04	84.97	model-3cat
3-Category (Pedestrian)	56.28	55.58	-
3-Category (Cyclist)	71.71	71.88	-

1. Recommended Environment

Linux (tested on Ubuntu 16.04)
Python 3.6+
PyTorch 1.1 or higher (tested on PyTorch 1.6)
CUDA 9.0 or higher (PyTorch 1.3+ needs CUDA 9.2+)
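The version requirements above can be checked before installing anything else. The following is a minimal sketch that relies only on `sys.version_info`, `torch.__version__`, and `torch.version.cuda`; the helper names parse_version and check_environment are hypothetical and not part of this repository. Treating a CPU-only PyTorch build as a problem is an assumption based on the CUDA requirement listed above.

```python
import sys

def parse_version(text):
    """Turn a string such as '1.6.0+cu101' into (1, 6) for comparison."""
    parts = []
    for piece in text.split("+")[0].split(".")[:2]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits or 0))
    return tuple(parts)

def check_environment():
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if sys.version_info < (3, 6):
        problems.append("Python 3.6+ is required")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
        return problems
    if parse_version(torch.__version__) < (1, 1):
        problems.append("PyTorch 1.1 or higher is required")
    cuda = torch.version.cuda  # None for CPU-only builds
    if cuda is None:
        problems.append("this PyTorch build reports no CUDA support")
    elif parse_version(cuda) < (9, 0):
        problems.append("CUDA 9.0 or higher is required")
    return problems

print(check_environment() or "environment looks OK")
```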

2. Set Up the Environment

pip install -r requirement.txt
python setup.py develop

3. Data Preparation

Prepare KITTI dataset and road planes

# Download KITTI and organize it into the following form:
├── data
│   ├── kitti
│   │   │── ImageSets
│   │   │── training
│   │   │   ├──calib & velodyne & label_2 & image_2 & (optional: planes)
│   │   │── testing
│   │   │   ├──calib & velodyne & image_2

# Generate data infos:
python -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos tools/cfgs/dataset_configs/kitti_dataset.yaml
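Before generating the data infos, it can help to confirm that the directory layout above is in place. The sketch below only checks that the listed folders exist; the function name missing_kitti_dirs and the default root data/kitti (relative to the current working directory) are assumptions for illustration, and the optional planes folder is deliberately not checked.

```python
from pathlib import Path

# Sub-directories listed in the layout above; "planes" is optional.
REQUIRED = {
    "training": ["calib", "velodyne", "label_2", "image_2"],
    "testing": ["calib", "velodyne", "image_2"],
}

def missing_kitti_dirs(root="data/kitti"):
    """Return the expected KITTI directories that do not exist under root."""
    root = Path(root)
    expected = [root / "ImageSets"]
    for split, names in REQUIRED.items():
        expected.extend(root / split / name for name in names)
    return [str(path) for path in expected if not path.is_dir()]

if __name__ == "__main__":
    missing = missing_kitti_dirs()
    if missing:
        print("Missing directories:")
        for path in missing:
            print("  " + path)
    else:
        print("KITTI layout looks complete")
```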

Prepare Waymo dataset

# Download Waymo and organize it into the following form:
├── data
│   ├── waymo
│   │   │── ImageSets
│   │   │── raw_data
│   │   │   │── segment-xxxxxxxx.tfrecord
│   │   │   │── ...
│   │   │── waymo_processed_data
│   │   │   │── segment-xxxxxxxx/
│   │   │   │── ...
│   │   │── pcdet_gt_database_train_sampled_xx/
│   │   │── pcdet_waymo_dbinfos_train_sampled_xx.pkl

# Install TensorFlow 2.1.0
# Install the official waymo-open-dataset by running the following command:
pip3 install --upgrade pip
pip3 install waymo-open-dataset-tf-2-1-0 --user

# Extract point cloud data from tfrecord and generate data infos:
python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos --cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml
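Extraction of the Waymo segments can take a long time, so a quick way to see which raw segments still lack a processed folder may be useful. The sketch below assumes, based on the layout above, that each raw_data/segment-xxxxxxxx.tfrecord file yields a folder of the same name under waymo_processed_data; the function name unprocessed_segments and the default root data/waymo are hypothetical.

```python
from pathlib import Path

def unprocessed_segments(root="data/waymo"):
    """List raw segments with no matching folder in waymo_processed_data."""
    root = Path(root)
    raw = {p.stem for p in (root / "raw_data").glob("segment-*.tfrecord")}
    processed_dir = root / "waymo_processed_data"
    done = {p.name for p in processed_dir.glob("segment-*") if p.is_dir()}
    return sorted(raw - done)

if __name__ == "__main__":
    pending = unprocessed_segments()
    print(f"{len(pending)} segment(s) still to be extracted")
```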

4. Train

Train with a single GPU

python train.py --cfg_file ${CONFIG_FILE}

# e.g.,
python train.py --cfg_file tools/cfgs/kitti_models/second_ct3d.yaml

Train with multiple GPUs or multiple machines

bash scripts/dist_train.sh ${NUM_GPUS} --cfg_file ${CONFIG_FILE}
# or 
bash scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} ${NUM_GPUS} --cfg_file ${CONFIG_FILE}

# e.g.,
bash scripts/dist_train.sh 8 --cfg_file tools/cfgs/kitti_models/second_ct3d.yaml
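When training is launched from another script (for example, to sweep over several configs), the single-GPU and multi-GPU commands above can be assembled programmatically. This is a minimal sketch: build_train_command is a hypothetical helper, it covers only train.py and scripts/dist_train.sh as written above (not the Slurm variant), and it does not launch anything itself.

```python
import shlex

def build_train_command(cfg_file, num_gpus=1):
    """Return the documented training command as an argument list."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if num_gpus == 1:
        return ["python", "train.py", "--cfg_file", cfg_file]
    return ["bash", "scripts/dist_train.sh", str(num_gpus),
            "--cfg_file", cfg_file]

cmd = build_train_command("tools/cfgs/kitti_models/second_ct3d.yaml", num_gpus=8)
print(" ".join(shlex.quote(arg) for arg in cmd))
# Pass `cmd` to subprocess.run(cmd, check=True) to actually launch it.
```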

5. Test

Test with a pretrained model:

python test.py --cfg_file ${CONFIG_FILE} --ckpt ${CKPT}

# e.g., 
python test.py --cfg_file tools/cfgs/kitti_models/second_ct3d.yaml --ckpt output/kitti_models/second_ct3d/default/kitti_val.pth
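The example checkpoint path above suggests that outputs are stored under output/<config folder>/<config name>/default. The sketch below guesses that directory from a config path and lists any .pth files found there. The path pattern is inferred from the single example above and may not hold for every setup; guess_output_dir, list_checkpoints, and the extra_tag parameter are hypothetical names.

```python
from pathlib import Path

def guess_output_dir(cfg_file, extra_tag="default", output_root="output"):
    """Guess where checkpoints land, following the example path above.

    tools/cfgs/kitti_models/second_ct3d.yaml
        -> output/kitti_models/second_ct3d/default
    """
    cfg = Path(cfg_file)
    return Path(output_root) / cfg.parent.name / cfg.stem / extra_tag

def list_checkpoints(cfg_file, extra_tag="default"):
    """Return all .pth files found below the guessed output directory."""
    return sorted(guess_output_dir(cfg_file, extra_tag).rglob("*.pth"))

print(guess_output_dir("tools/cfgs/kitti_models/second_ct3d.yaml").as_posix())
```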

Owner

Hualian Sheng