REGTR: End-to-end Point Cloud Correspondences with Transformers

Last update: Dec 17, 2022

Related tags

Overview

REGTR: End-to-end Point Cloud Correspondences with Transformers

This repository contains the source code for REGTR. REGTR utilizes multiple transformer attention layers to directly predict each downsampled point's corresponding location in the other point cloud. Unlike typical correspondence-based registration algorithms, the predicted correspondences are clean and do not require an additional RANSAC step. This results in a fast, yet accurate registration.

If you find this useful, please cite:

@inproceedings{yew2022regtr,
  title={REGTR: End-to-end Point Cloud Correspondences with Transformers},
  author={Yew, Zi Jian and Lee, Gim hee},
  booktitle={CVPR},
  year={2022},
}

Dataset environment

Our model is trained with the following environment:

Python 3.8.8
PyTorch 1.9.1 with torchvision 0.10.1 (Cuda 11.1)
PyTorch3D 0.6.0
MinkowskiEngine 0.5.4

Other required packages can be installed using pip: pip install -r src/requirements.txt.

Data and Preparation

Follow the following instructions to download each dataset (as necessary). Your folder should then look like this:

.
├── data/
    ├── indoor/
        ├── test/
        |   ├── 7-scenes-redkitchen/
        |   |   ├── cloud_bin_0.info.txt
        |   |   ├── cloud_bin_0.pth
        |   |   ├── ...
        |   ├── ...
        ├── train/
        |   ├── 7-scenes-chess/
        |   |   ├── cloud_bin_0.info.txt
        |   |   ├── cloud_bin_0.pth
        |   |   ├── ...
        ├── test_3DLoMatch_pairs-overlapmask.h5
        ├── test_3DMatch_pairs-overlapmask.h5
        ├── train_pairs-overlapmask.h5
        └── val_pairs-overlapmask.h5
    └── modelnet40_ply_hdf5_2048
        ├── ply_data_test0.h5
        ├── ply_data_test1.h5
        ├── ...
├── src/
└── Readme.md

3DMatch

Download the processed dataset from Predator project site, and place them into ../data.

Then for efficiency, it is recommended to pre-compute the overlapping points (used for computing the overlap loss). You can do this by running the following from the src/ directory:

python data_processing/compute_overlap_3dmatch.py

ModelNet

Download the PointNet-processed dataset from here, and place it into ../data.

Pretrained models

You can download our trained models here. Unzip the files into the trained_models/.

Demo

We provide a simple demo script demo.py that loads our model and checkpoints, registers 2 point clouds, and visualizes the result. Simply download the pretrained models and run the following from the src/ directory:

python demo.py --example 0  # choose from 0 - 4 (see code for details)

Press 'q' to end the visualization and exit. Refer the documentation for visualize_result() for explanation of the visualization.

Inference/Evaluation

The following code in the src/ directory performs evaluation using the pretrained checkpoints as provided above; change the checkpoint paths accordingly if you're using your own trained models. Note that due to non-determinism from the neighborhood computation during our GPU-based KPConv processing, the results will differ slightly (e.g. mean registration recall may differ by around +/- 0.2%) between each run.

3DMatch / 3DLoMatch

This will run inference and compute the evaluation metrics used in Predator (registration success of <20cm).

# 3DMatch
python test.py --dev --resume ../trained_models/3dmatch/ckpt/model-best.pth --benchmark 3DMatch

# 3DLoMatch
python test.py --dev --resume ../trained_models/3dmatch/ckpt/model-best.pth --benchmark 3DLoMatch

ModelNet

# ModelNet
python test.py --dev --resume ../trained_models/modelnet/ckpt/model-best.pth --benchmark ModelNet

# ModelLoNet
python test.py --dev --resume ../trained_models/modelnet/ckpt/model-best.pth --benchmark ModelNet

Training

Run the following commands from the src/ directory to train the network.

3DMatch (Takes ~2.5 days on a Titan RTX)

python train.py --config conf/3dmatch.yaml

ModelNet (Takes <2 days on a Titan RTX)

python train.py --config conf/modelnet.yaml

Acknowledgements

We would like to thank the authors for Predator, D3Feat, KPConv, DETR for making their source codes public.

Comments

About the influence of the weak data augmentation

Thanks for the great work. I notice that RegTR adopts a much weaker augmentation than the commonly used augmentation in [1, 2, 3]. How does this affect the convergence of RegTR? And will the weak augmentation affect the robustness to large transformation perturbation? Thank you.

[1] Bai, X., Luo, Z., Zhou, L., Fu, H., Quan, L., & Tai, C. L. (2020). D3feat: Joint learning of dense detection and description of 3d local features. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 6359-6367). [2] Huang, S., Gojcic, Z., Usvyatsov, M., Wieser, A., & Schindler, K. (2021). Predator: Registration of 3d point clouds with low overlap. In Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition (pp. 4267-4276). [3] Yu, H., Li, F., Saleh, M., Busam, B., & Ilic, S. (2021). Cofinet: Reliable coarse-to-fine correspondences for robust pointcloud registration. Advances in Neural Information Processing Systems, 34, 23872-23884.

opened by qinzheng93 4
Training for custom dataset

Hi @yewzijian,

Thanks for sharing your work. I would like to ask you whether you could elaborate with some details about how someone could train the model for a custom dataset.

Thanks.

opened by ttsesm 3
how to visualize the progress of the training process?

Is it possible to visualize the progress of the training pipeline described in https://github.com/yewzijian/RegTR#training with tensorboard or another lib?

opened by ttsesm 2
Setting parameter values for training of custom dataset

Hi @yewzijian! Thanks for sharing the codebase for your work. I am trying to train the network on custom data. As I went through the configuration, I found that for feature loss config., I need to set(r_p, r_n) which according to the paper are(m,2m), where m being the "voxel distance used in the final downsampling layer in the KPConv backbone". How do I figure out m for my dataset?

opened by praffulp 1

A CUDA Error

Dear Yew & other friends： I have run code on (just like in readme)： Python 3.8.8 PyTorch 1.9.1 with torchvision 0.10.1 (Cuda 11.1) PyTorch3D 0.6.0 MinkowskiEngine 0.5.4 RTX 3090

    But I got following error：

    recent call last):
      File "train.py", line 88, in <module>
        main()
      File "train.py", line 84, in main
        trainer.fit(model, train_loader, val_loader)
      File "/home/***/codes/RegTR-main/src/trainer.py", line 119, in fit
        losses['total'].backward()
      File "/home/***/enter/envs/regtr/lib/python3.8/site-packages/torch/_tensor.py", line 255, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
      File "/home/***/enter/envs/regtr/lib/python3.8/site-packages/torch/autograd/__init__.py", line 147, in backward
        Variable._execution_engine.run_backward(
    RuntimeError: merge_sort: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
    


    I have already tried to set os.environ['CUDA_LAUNCH_BLOCKING'] = '1'， but it did not work.

opened by Fzuerzmj 1

Is it possible to remove Minkowski Engine?

The last release date of minkowski is in May 2021. The dependencies of it might not be easily met with new software and hardware. I found it impossible to make RegTR train on my machine because of a cuda memory problem to which I found no solution. Without Minkowski, I would have more freedom when choosing the versions of pytorch and everything, so that I cound have more chance to solve this problem. I am a slam/c++ veteran and deep learning/python newbie(starting learning deep learning 2 weeks ago), so its hard for me to modify it myself for now. I was wondering if you could be so kind to release a version of RegTR without Minkowski.

opened by JaySlamer 1

Train BUG, please help me

When I execute the following command: python train.py --config conf/modelnet.yaml I got a Bug:


Traceback (most recent call last):
  File "train.py", line 85, in <module>
    main()
  File "train.py", line 81, in main
    trainer.fit(model, train_loader, val_loader)
  File "/home/zsy/Code/RegTR-main/src/trainer.py", line 79, in fit
    self._run_validation(model, val_loader, step=global_step,
  File "/home/zsy/Code/RegTR-main/src/trainer.py", line 249, in _run_validation
    val_out = model.validation_step(val_batch, val_batch_idx)
  File "/home/zsy/Code/RegTR-main/src/models/generic_reg_model.py", line 83, in validation_step
    pred = self.forward(batch)
  File "/home/zsy/Code/RegTR-main/src/models/regtr.py", line 117, in forward
    kpconv_meta = self.preprocessor(batch['src_xyz'] + batch['tgt_xyz'])
  File "/home/zsy/anaconda3/envs/REG/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/zsy/Code/RegTR-main/src/models/backbone_kpconv/kpconv.py", line 489, in forward
    pool_p, pool_b = batch_grid_subsampling_kpconv_gpu(
  File "/home/zsy/Code/RegTR-main/src/models/backbone_kpconv/kpconv.py", line 232, in batch_grid_subsampling_kpconv_gpu
    sparse_tensor = ME.SparseTensor(
  File "/home/zsy/anaconda3/envs/REG/lib/python3.8/site-packages/MinkowskiEngine/MinkowskiSparseTensor.py", line 275, in __init__
    coordinates, features, coordinate_map_key = self.initialize_coordinates(
  File "/home/zsy/anaconda3/envs/REG/lib/python3.8/site-packages/MinkowskiEngine/MinkowskiSparseTensor.py", line 338, in initialize_coordinates
    features = spmm_avg.apply(self.inverse_mapping, cols, size, features)
  File "/home/zsy/anaconda3/envs/REG/lib/python3.8/site-packages/MinkowskiEngine/sparse_matrix_functions.py", line 183, in forward
    result, COO, vals = spmm_average(
  File "/home/zsy/anaconda3/envs/REG/lib/python3.8/site-packages/MinkowskiEngine/sparse_matrix_functions.py", line 93, in spmm_average
    result, COO, vals = MEB.coo_spmm_average_int32(
RuntimeError: CUSPARSE_STATUS_INVALID_VALUE at /tmp/pip-req-build-h0w4jzhp/src/spmm.cu:591

My environment is configured as required. I think the problem might be with the code below:

        features=points,
        coordinates=coord_batched,
        quantization_mode=ME.SparseTensorQuantizationMode.UNWEIGHTED_AVERAGE
    )

I can't solve it , please help me, thx

opened by immensitySea 3

How to get the image result of Visualization of attention?

Hi, Zi Jian:

Thanks for sharing so nice work. Could you mind sharing the method to reproduce your results for Visualization of attention? Just as presented by Fig5. and Fig6. from your paper?

opened by ZJU-PLP 2
A sparse tensor bug

ubuntu18.04 RTX3090 cuda11.1 MinkowskiEngine 0.5.4

The following error occurred when I tried to run your model。

(RegTR) ➜ src git:(main) ✗ python test.py --dev --resume ../trained_models/3dmatch/ckpt/model-best.pth --benchmark 3DMatch

/home/lileixin/anaconda3/envs/RegTR/lib/python3.8/site-packages/MinkowskiEngine-0.5.4-py3.8-linux-x86_64.egg/MinkowskiEngine/init.py:36: UserWarning: The environment variable OMP_NUM_THREADS not set. MinkowskiEngine will automatically set OMP_NUM_THREADS=16. If you want to set OMP_NUM_THREADS manually, please export it on the command line before running a python script. e.g. export OMP_NUM_THREADS=12; python your_program.py. It is recommended to set it below 24. warnings.warn( /home/lileixin/anaconda3/envs/RegTR/lib/python3.8/site-packages/_distutils_hack/init.py:30: UserWarning: Setuptools is replacing distutils. warnings.warn("Setuptools is replacing distutils.") 04/23 20:06:22 [INFO] root - Output and logs will be saved to ../logdev 04/23 20:06:22 [INFO] cvhelpers.misc - Command: test.py --dev --resume ../trained_models/3dmatch/ckpt/model-best.pth --benchmark 3DMatch 04/23 20:06:22 [INFO] cvhelpers.misc - Source is from Commit 64e5b3f0 (2022-03-28): Fixed minor typo in Readme.md and demo.py 04/23 20:06:22 [INFO] cvhelpers.misc - Arguments: benchmark: 3DMatch, config: None, logdir: ../logs, dev: True, name: None, num_workers: 0, resume: ../trained_models/3dmatch/ckpt/model-best.pth 04/23 20:06:22 [INFO] root - Using config file from checkpoint directory: ../trained_models/3dmatch/config.yaml 04/23 20:06:22 [INFO] data_loaders.threedmatch - Loading data from ../data/indoor 04/23 20:06:22 [INFO] RegTR - Instantiating model RegTR 04/23 20:06:22 [INFO] RegTR - Loss weighting: {'overlap_5': 1.0, 'feature_5': 0.1, 'corr_5': 1.0, 'feature_un': 0.0} 04/23 20:06:22 [INFO] RegTR - Config: d_embed:256, nheads:8, pre_norm:True, use_pos_emb:True, sa_val_has_pos_emb:True, ca_val_has_pos_emb:True 04/23 20:06:25 [INFO] CheckPointManager - Loaded models from ../trained_models/3dmatch/ckpt/model-best.pth 0%| | 0/1623 [00:00<?, ?it/s] ** On entry to cusparseSpMM_bufferSize() parameter number 1 (handle) had an illegal value: bad initialization or already destroyed

Traceback (most recent call last): File "test.py", line 75, in main() File "test.py", line 71, in main trainer.test(model, test_loader) File "/home/lileixin/work/Point_Registration/RegTR/src/trainer.py", line 204, in test test_out = model.test_step(test_batch, test_batch_idx) File "/home/lileixin/work/Point_Registration/RegTR/src/models/generic_reg_model.py", line 132, in test_step pred = self.forward(batch) File "/home/lileixin/work/Point_Registration/RegTR/src/models/regtr.py", line 117, in forward kpconv_meta = self.preprocessor(batch['src_xyz'] + batch['tgt_xyz']) File "/home/lileixin/anaconda3/envs/RegTR/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl return forward_call(*input, **kwargs) File "/home/lileixin/work/Point_Registration/RegTR/src/models/backbone_kpconv/kpconv.py", line 489, in forward pool_p, pool_b = batch_grid_subsampling_kpconv_gpu( File "/home/lileixin/work/Point_Registration/RegTR/src/models/backbone_kpconv/kpconv.py", line 232, in batch_grid_subsampling_kpconv_gpu sparse_tensor = ME.SparseTensor( File "/home/lileixin/anaconda3/envs/RegTR/lib/python3.8/site-packages/MinkowskiEngine-0.5.4-py3.8-linux-x86_64.egg/MinkowskiEngine/MinkowskiSparseTensor.py", line 275, in init coordinates, features, coordinate_map_key = self.initialize_coordinates( File "/home/lileixin/anaconda3/envs/RegTR/lib/python3.8/site-packages/MinkowskiEngine-0.5.4-py3.8-linux-x86_64.egg/MinkowskiEngine/MinkowskiSparseTensor.py", line 338, in initialize_coordinates features = spmm_avg.apply(self.inverse_mapping, cols, size, features) File "/home/lileixin/anaconda3/envs/RegTR/lib/python3.8/site-packages/MinkowskiEngine-0.5.4-py3.8-linux-x86_64.egg/MinkowskiEngine/sparse_matrix_functions.py", line 183, in forward result, COO, vals = spmm_average( File "/home/lileixin/anaconda3/envs/RegTR/lib/python3.8/site-packages/MinkowskiEngine-0.5.4-py3.8-linux-x86_64.egg/MinkowskiEngine/sparse_matrix_functions.py", line 93, in spmm_average result, COO, vals = MEB.coo_spmm_average_int32( RuntimeError: CUSPARSE_STATUS_INVALID_VALUE at /home/lileixin/MinkowskiEngine/src/spmm.cu:590 (RegTR) ➜ src git:(main) ✗ python test.py --dev --resume ../trained_models/3dmatch/ckpt/model-best.pth --benchmark 3DMatch

/home/lileixin/anaconda3/envs/RegTR/lib/python3.8/site-packages/MinkowskiEngine-0.5.4-py3.8-linux-x86_64.egg/MinkowskiEngine/init.py:36: UserWarning: The environment variable OMP_NUM_THREADS not set. MinkowskiEngine will automatically set OMP_NUM_THREADS=16. If you want to set OMP_NUM_THREADS manually, please export it on the command line before running a python script. e.g. export OMP_NUM_THREADS=12; python your_program.py. It is recommended to set it below 24. warnings.warn( /home/lileixin/anaconda3/envs/RegTR/lib/python3.8/site-packages/_distutils_hack/init.py:30: UserWarning: Setuptools is replacing distutils. warnings.warn("Setuptools is replacing distutils.")

But when I cross out this line of code, the program can run. sparse_tensor = ME.SparseTensor( features=points, coordinates=coord_batched, #quantization_mode=ME.SparseTensorQuantizationMode.UNWEIGHTED_AVERAGE )

opened by caijillx 10

Releases(v1)

v1(Mar 28, 2022)

Trained checkpoints
Source code(tar.gz)
Source code(zip)
trained_models.zip(241.13 MB)

Owner

Zi Jian Yew

PhD candidate at National University of Singapore

GitHub Repository

LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

LightHuBERT LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT | Github | Huggingface | SUPER

46 Dec 29, 2022

Implementation of the ALPHAMEPOL algorithm, presented in Unsupervised Reinforcement Learning in Multiple Environments.

ALPHAMEPOL This repository contains the implementation of the ALPHAMEPOL algorithm, presented in Unsupervised Reinforcement Learning in Multiple Envir

3 Dec 23, 2021

git《Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction》(ECCV 2020) GitHub:

Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction Code for the ECCV 2020 paper by Yiming Qian and Yasutaka Furukawa Getting

37 Dec 04, 2022

"3D Human Texture Estimation from a Single Image with Transformers", ICCV 2021

Texformer: 3D Human Texture Estimation from a Single Image with Transformers This is the official implementation of "3D Human Texture Estimation from

193 Dec 05, 2022

Image Matching Evaluation

Image Matching Evaluation (IME) IME provides to test any feature matching algorithm on datasets containing ground-truth homographies. Also, one can re

32 Nov 17, 2022

A collection of easy-to-use, ready-to-use, interesting deep neural network models

Interesting and reproducible research works should be conserved. This repository wraps a collection of deep neural network models into a simple and un

16 Jun 16, 2022

Junction Tree Variational Autoencoder for Molecular Graph Generation (ICML 2018)

Junction Tree Variational Autoencoder for Molecular Graph Generation Official implementation of our Junction Tree Variational Autoencoder https://arxi

418 Jan 07, 2023

we propose EfficientDerain for high-efficiency single-image deraining

EfficientDerain we propose EfficientDerain for high-efficiency single-image deraining Requirements python 3.6 pytorch 1.6.0 opencv-python 4.4.0.44 sci

126 Dec 07, 2022

Like ThreeJS but for Python and based on wgpu

pygfx A render engine, inspired by ThreeJS, but for Python and targeting Vulkan/Metal/DX12 (via wgpu). Introduction This is a Python render engine bui

139 Jan 07, 2023

Official implementation of Deep Reparametrization of Multi-Frame Super-Resolution and Denoising

Deep-Rep-MFIR Official implementation of Deep Reparametrization of Multi-Frame Super-Resolution and Denoising Publication: Deep Reparametrization of M

39 Jan 04, 2023

True Few-Shot Learning with Language Models

This codebase supports using language models (LMs) for true few-shot learning: learning to perform a task using a limited number of examples from a single task distribution.

124 Jan 04, 2023

Deep Q Learning with OpenAI Gym and Pokemon Showdown

pokemon-deep-learning An openAI gym project for pokemon involving deep q learning. Made by myself, Sam Little, and Layton Webber. This code captures g

2 Dec 22, 2021

PyTorch implementation of CloudWalk's recent work DenseBody

densebody_pytorch PyTorch implementation of CloudWalk's recent paper DenseBody. Note: For most recent updates, please check out the dev branch. Update

401 Nov 19, 2022

Dynamical movement primitives (DMPs), probabilistic movement primitives (ProMPs), spatially coupled bimanual DMPs.

Movement Primitives Movement primitives are a common group of policy representations in robotics. There are many different types and variations. This

63 Jan 06, 2023

Code for "Solving Graph-based Public Good Games with Tree Search and Imitation Learning"

Code for "Solving Graph-based Public Good Games with Tree Search and Imitation Learning" This is the code for the paper Solving Graph-based Public Goo

3 Dec 05, 2022

A Simulation Environment to train Robots in Large Realistic Interactive Scenes

iGibson: A Simulation Environment to train Robots in Large Realistic Interactive Scenes iGibson is a simulation environment providing fast visual rend

493 Jan 04, 2023

Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal, multi-exposure and multi-focus image fusion.

U2Fusion Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal (VIS-IR, medical), multi

129 Dec 11, 2022

REGTR: End-to-end Point Cloud Correspondences with Transformers

Related tags

Overview

REGTR: End-to-end Point Cloud Correspondences with Transformers

Dataset environment

Data and Preparation

3DMatch

ModelNet

Pretrained models

Demo

Inference/Evaluation

3DMatch / 3DLoMatch

ModelNet

Training

3DMatch (Takes ~2.5 days on a Titan RTX)

ModelNet (Takes <2 days on a Titan RTX)

Acknowledgements

Comments

Releases(v1)

v1(Mar 28, 2022)

Owner

Zi Jian Yew

LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

Implementation of the ALPHAMEPOL algorithm, presented in Unsupervised Reinforcement Learning in Multiple Environments.

git《Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction》(ECCV 2020) GitHub:

"3D Human Texture Estimation from a Single Image with Transformers", ICCV 2021

Image Matching Evaluation

A collection of easy-to-use, ready-to-use, interesting deep neural network models

Junction Tree Variational Autoencoder for Molecular Graph Generation (ICML 2018)

we propose EfficientDerain for high-efficiency single-image deraining

Like ThreeJS but for Python and based on wgpu

Official implementation of Deep Reparametrization of Multi-Frame Super-Resolution and Denoising

True Few-Shot Learning with Language Models

Deep Q Learning with OpenAI Gym and Pokemon Showdown

PyTorch implementation of CloudWalk's recent work DenseBody

Dynamical movement primitives (DMPs), probabilistic movement primitives (ProMPs), spatially coupled bimanual DMPs.

Code for "Solving Graph-based Public Good Games with Tree Search and Imitation Learning"

A Simulation Environment to train Robots in Large Realistic Interactive Scenes

Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal, multi-exposure and multi-focus image fusion.

Generalized Data Weighting via Class-level Gradient Manipulation

EMNLP 2021 paper Models and Datasets for Cross-Lingual Summarisation.

ChebLieNet, a spectral graph neural network turned equivariant by Riemannian geometry on Lie groups.