EgoNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale

Overview

Paper: EgoNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale submitted to IEEE Robotics and Automation Letters (RA-L) (ArXiv)

Jacek Komorowski, Monika Wysoczanska, Tomasz Trzcinski

Warsaw University of Technology

What's new

  • [2021-10-24] Evaluation code and pretrained models released.

Our other projects

  • MinkLoc3D: Point Cloud Based Large-Scale Place Recognition (WACV 2021): MinkLoc3D
  • MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition (IJCNN 2021): MinkLoc++
  • Large-Scale Topological Radar Localization Using Learned Descriptors (ICONIP 2021): RadarLoc

Introduction

The paper presents a deep neural network-based method for extracting global and local descriptors from a point cloud acquired by a rotating 3D LiDAR sensor. The descriptors can be used for two-stage 6DoF relocalization. First, a coarse position is retrieved by finding candidates with the closest global descriptor in a database of geo-tagged point clouds. Then, the 6DoF pose between a query point cloud and a database point cloud is estimated by matching local descriptors and using a robust estimator such as RANSAC. Our method has a simple, fully convolutional architecture and uses a sparse voxelized representation of the input point cloud. It can efficiently extract a global descriptor and a set of keypoints with their local descriptors from large point clouds with tens of thousands of points.
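The two-stage procedure can be illustrated with a minimal NumPy sketch. This is not the code from this repository: the function names (retrieve_candidates, match_descriptors, rigid_transform, ransac_pose), the mutual nearest-neighbour matching rule, and the iteration count and inlier threshold are all assumptions chosen for illustration. Descriptors and keypoint coordinates are assumed to be already extracted as arrays.

```python
import numpy as np


def retrieve_candidates(query_desc, db_descs, k=1):
    """Stage 1: indices of the k database global descriptors closest to the query."""
    dists = np.linalg.norm(db_descs - query_desc, axis=1)
    return np.argsort(dists)[:k]


def match_descriptors(desc_a, desc_b):
    """Mutual nearest-neighbour matching of local descriptors (an assumed matching rule)."""
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    nn_ab = d.argmin(axis=1)  # best match in b for every row of a
    nn_ba = d.argmin(axis=0)  # best match in a for every row of b
    idx_a = np.arange(len(desc_a))
    mutual = nn_ba[nn_ab] == idx_a
    return idx_a[mutual], nn_ab[mutual]


def rigid_transform(src, dst):
    """Least-squares rotation r and translation t with dst ~ src @ r.T + t (Kabsch)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    h = (src - src_c).T @ (dst - dst_c)
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so that r is a proper rotation (det = +1).
    sign = np.sign(np.linalg.det(vt.T @ u.T))
    r = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T
    return r, dst_c - r @ src_c


def ransac_pose(src, dst, n_iters=200, inlier_thresh=0.5, seed=0):
    """Stage 2: robust 6DoF pose from putative keypoint correspondences src[i] <-> dst[i]."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        sample = rng.choice(len(src), size=3, replace=False)
        r, t = rigid_transform(src[sample], dst[sample])
        residuals = np.linalg.norm(src @ r.T + t - dst, axis=1)
        inliers = residuals < inlier_thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("not enough inliers to estimate a pose")
    # Refit on all inliers of the best hypothesis.
    r, t = rigid_transform(src[best_inliers], dst[best_inliers])
    return r, t, best_inliers
```

In this sketch, the matched keypoint coordinates of the query and of the retrieved database point cloud would be passed to ransac_pose as src and dst; the returned inlier mask corresponds to the kind of RANSAC inliers shown in the Visualizations section.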

Citation

If you find this work useful, please consider citing:

Environment and Dependencies

Code was tested using Python 3.8 with PyTorch 1.9.1 and MinkowskiEngine 0.5.4 on Ubuntu 20.04 with CUDA 10.2. Note: CUDA 11.1 is not recommended as there are some issues with MinkowskiEngine 0.5.4 on CUDA 11.1.

The following Python packages are required:

  • PyTorch (version 1.9.1)
  • MinkowskiEngine (version 0.5.4)
  • pytorch_metric_learning (version 0.9.99 or above)
  • wandb
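
A quick way to see which of these packages are installed is a small check based on the standard-library importlib.metadata module. This is an optional sketch, not part of the repository; the distribution names in REQUIRED (in particular "MinkowskiEngine" and "pytorch-metric-learning") are assumptions about how the packages are registered in your environment.

```python
from importlib import metadata

# Assumed distribution names mapped to the versions listed above.
REQUIRED = {
    "torch": "1.9.1",
    "MinkowskiEngine": "0.5.4",
    "pytorch-metric-learning": ">=0.9.99",
    "wandb": "any",
}


def installed_versions(requirements=REQUIRED):
    """Map each distribution name to its installed version, or None if it is missing."""
    report = {}
    for name in requirements:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, found in installed_versions().items():
        # The reported version may carry a build suffix such as "+cu102".
        print(f"{name}: installed={found or 'missing'}, listed={REQUIRED[name]}")
```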

Modify the PYTHONPATH environment variable to include the absolute path to the project root folder:

export PYTHONPATH=$PYTHONPATH:/home/.../Egonn

Datasets

EgoNN is trained and evaluated using the following datasets:

  • MulRan dataset: the Sejong traversal is used. The traversal is split into training and evaluation parts (link).
  • Apollo-SouthBay dataset: the SunnyvaleBigLoop trajectory is used for evaluation; the other 5 trajectories (BaylandsToSeafood, ColumbiaPark, Highway237, MathildaAVE, SanJoseDowntown) are used for training (link).
  • KITTI dataset: Sequence 00 is used for evaluation (link).

First, download the datasets:

  • For the MulRan dataset, download the ground truth data (*.csv) and LiDAR point clouds (Ouster.zip) for the Sejong01 and Sejong02 traversals (link).
  • Download the Apollo-SouthBay dataset using the download link on the dataset website (link).
  • Download the KITTI odometry dataset (calibration files, ground truth poses, Velodyne laser data) (link).

After downloading the datasets, generate training pickles for network training and evaluation pickles for model evaluation.

Training pickles generation

Generating training tuples is very time-consuming, as ICP is used to refine the ground truth poses between each pair of neighbouring point clouds.

cd datasets/mulran
python generate_training_tuples.py --dataset_root <mulran_dataset_root_path>

cd ../southbay
python generate_training_tuples.py --dataset_root <apollo_southbay_dataset_root_path>

Evaluation pickles generation

cd datasets/mulran
python generate_evaluation_sets.py --dataset_root <mulran_dataset_root_path>

cd ../southbay
python generate_evaluation_sets.py --dataset_root <apollo_southbay_dataset_root_path>

cd ../kitti
python generate_evaluation_sets.py --dataset_root <kitti_dataset_root_path>

Training (training code will be released after the paper acceptance)

First, download the datasets and generate training and evaluation pickles as described above. Edit the configuration file config_egonn.txt. Set the dataset_folder parameter to point to the dataset root folder. Modify the batch_size_limit and secondary_batch_size_limit parameters depending on available GPU memory. The default limits require at least 11GB of GPU RAM.
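
If you prefer to script these edits, the following sketch shows one possible approach. It assumes the configuration file is INI-style text that Python's configparser can read; the format is not stated here, so check the file first. The helper name update_config, the section names, and the values in the usage example are hypothetical. Note that configparser drops comments and lowercases keys when the file is rewritten.

```python
import configparser
import io


def update_config(text, updates):
    """Return INI text with each key in `updates` overwritten where it is already defined."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    pending = dict(updates)
    # Keys defined in the DEFAULT section are handled first.
    for key in list(pending):
        if key in parser.defaults():
            parser.set(parser.default_section, key, str(pending.pop(key)))
    # Remaining keys are updated in the first named section that defines them.
    for section in parser.sections():
        for key in list(pending):
            if parser.has_option(section, key):
                parser.set(section, key, str(pending.pop(key)))
    if pending:
        raise KeyError(f"keys not found in config: {sorted(pending)}")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical file content; the real section names and defaults may differ.
    sample = (
        "[DEFAULT]\n"
        "dataset_folder = /old/path\n"
        "\n"
        "[TRAIN]\n"
        "batch_size_limit = 256\n"
        "secondary_batch_size_limit = 64\n"
    )
    print(update_config(sample, {"dataset_folder": "/data/datasets", "batch_size_limit": 128}))
```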

To train the EgoNN model, run:

cd training

python train.py --config ../config/config_egonn.txt --model_config ../models/egonn.txt 

Pre-trained Model

The EgoNN model, trained on the training splits of the MulRan and Apollo-SouthBay datasets, is available in the weights/model_egonn_20210916_1104.pth file.

Evaluation

To evaluate a pretrained model, run the commands below. Ground truth poses between different traversals in all three datasets are slightly misaligned. To reproduce the results from the paper, use the --icp_refine option to refine the ground truth poses using ICP.

cd eval

# To evaluate on test split of MulRan dataset
python evaluate.py --dataset_root <dataset_root_path> --dataset_type mulran --eval_set test_Sejong01_Sejong02.pickle --model_config ../models/egonn.txt --weights ../weights/model_egonn_20210916_1104.pth --icp_refine

# To evaluate on test split of Apollo-SouthBay dataset
python evaluate.py --dataset_root <dataset_root_path> --dataset_type southbay --eval_set test_SunnyvaleBigloop_1.0_5.pickle --model_config ../models/egonn.txt --weights ../weights/model_egonn_20210916_1104.pth --icp_refine

# To evaluate on test split of KITTI dataset
python evaluate.py --dataset_root <dataset_root_path> --dataset_type kitti --eval_set kitti_00_eval.pickle --model_config ../models/egonn.txt --weights ../weights/model_egonn_20210916_1104.pth --icp_refine
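
To run all three evaluations in one go, the commands above can be assembled from a small table. The wrapper below is a convenience sketch, not part of the repository: build_command and run_all are hypothetical helper names, and only the flags and file names shown in the commands above are used. It is meant to be run from the eval folder.

```python
import subprocess
import sys

# Evaluation set file for each dataset type, taken from the commands above.
EVAL_SETS = {
    "mulran": "test_Sejong01_Sejong02.pickle",
    "southbay": "test_SunnyvaleBigloop_1.0_5.pickle",
    "kitti": "kitti_00_eval.pickle",
}


def build_command(dataset_type, dataset_root, icp_refine=True):
    """Assemble the evaluate.py argument list for one dataset."""
    cmd = [
        sys.executable, "evaluate.py",
        "--dataset_root", dataset_root,
        "--dataset_type", dataset_type,
        "--eval_set", EVAL_SETS[dataset_type],
        "--model_config", "../models/egonn.txt",
        "--weights", "../weights/model_egonn_20210916_1104.pth",
    ]
    if icp_refine:
        cmd.append("--icp_refine")
    return cmd


def run_all(dataset_roots):
    """Run the evaluation for every (dataset_type, dataset_root) pair, stopping on failure."""
    for dataset_type, root in dataset_roots.items():
        subprocess.run(build_command(dataset_type, root), check=True)
```

For example, run_all({"mulran": "<mulran_dataset_root_path>", "kitti": "<kitti_dataset_root_path>"}) would evaluate on the two datasets in turn, with the placeholders replaced by your dataset root folders.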

Results

EgoNN performance...

Visualizations

Visualizations of our keypoint detector results. On the left, we show 128 keypoints with the lowest saliency uncertainty (red dots). On the right, 128 keypoints with the highest uncertainty (yellow dots).

Successful registration of point cloud pairs from the KITTI dataset, captured when revisiting the same place from different directions. On the left, we show keypoint correspondences (RANSAC inliers) found during 6DoF pose estimation with RANSAC. On the right, we show point clouds aligned using the estimated poses.

License

Our code is released under the MIT License (see LICENSE file for details).
