A 2D Visual Localization Framework based on Essential Matrices [ICRA2020]

Overview

A 2D Visual Localization Framework based on Essential Matrices

This repository provides implementation of our paper accepted at ICRA: To Learn or Not to Learn: Visual Localization from Essential Matrices

Pipeline

To use our code, first download the repository:

git clone [email protected]:GrumpyZhou/visloc-relapose.git

Setup Running Environment

We have tested the code on Linux Ubuntu 16.04.6 under following environments:

Python 3.6 / 3.7
Pytorch 0.4.0 / 1.0 / 1.1 
CUDA 8.0 + CUDNN 8.0v5.1
CUDA 10.0 + CUDNN 10.0v7.5.1.10

The setting we used in the paper is:
Python 3.7 + Pytorch 1.1 + CUDA 10.0 + CUDNN 10.0v7.5.1.10

We recommend to use Anaconda to manage packages. Run following lines to automatically setup a ready environment for our code.

conda env create -f environment.yml  # Notice this one installs latest pytorch version.
conda activte relapose

Otherwise, one can try to download all required packages separately according to their offical documentation.

Prepare Datasets

Our code is flexible for evaluation on various localization datasets. We use Cambridge Landmarks dataset as an example to show how to prepare a dataset:

  1. Create data/ folder
  2. Download original Cambridge Landmarks Dataset and extract it to $CAMBRIDGE_DIR$.
  3. Construct the following folder structure in order to conveniently run all scripts in this repo:
    cd visloc-relapose/
    mkdir data
    mkdir data/datasets_original
    cd data/original_datasets
    ln -s $CAMBRIDGE_DIR$ CambridgeLandmarks
    
  4. Download our pairs for training, validation and testing. About the format of our pairs, check readme.
  5. Place the pairs to corresponding folder under data/datasets_original/CambridgeLandmarks.
  6. Pre-save resized 480 images to speed up data loading time for regression models (Optional, but Recommended)
    cd visloc-relapose/
    python -m utils.datasets.resize_dataset \
    	--base_dir data/datasets_original/CambridgeLandmarks \ 
    	--save_dir=data/datasets_480/CambridgeLandmarks \
    	--resize 480  --copy_txt True 
    
  7. Test your setup by visualizing the data using notebooks/data_loading.ipynb.

7Scenes Datasets

We follow the camera pose label convention of Cambridge Landmarks dataset. Similarly, you can download our pairs for 7Scenes. For other datasets, contact me for information about preprocessing and pair generation.

Feature-based: SIFT + 5-Point Solver

We use the SIFT feature extractor and feature matcher in colmap. One can follow the installation guide to install colmap. We save colmap outputs in database format, see explanation.

Preparing SIFT features

Execute following commands to run SIFT extraction and matching on CambridgeLandmarks:

cd visloc-relapose/
bash prepare_colmap_data.sh  CambridgeLandmarks

Here CambridgeLandmarks is the folder name that is consistent with the dataset folder. So you can also use other dataset names such as 7Scenes if you have prepared the dataset properly in advance.

Evaluate SIFT within our pipeline

Example to run sift+5pt on Cambridge Landmarks:

python -m pipeline.sift_5pt \
        --data_root 'data/datasets_original/' \
        --dataset 'CambridgeLandmarks' \
        --pair_txt 'test_pairs.5nn.300cm50m.vlad.minmax.txt' \
        --cv_ransac_thres 0.5\
        --loc_ransac_thres 5\
        -odir 'output/sift_5pt'\
        -log 'results.dvlad.minmax.txt'

More evaluation examples see: sift_5pt.sh. Check example outputs Visualize SIFT correspondences using notebooks/visualize_sift_matches.ipynb.

Learning-based: Direct Regression via EssNet

The pipeline.relapose_regressor module can be used for both training or testing our regression networks defined under networks/, e.g., EssNet, NCEssNet, RelaPoseNet... We provide training and testing examples in regression.sh. The module allows flexible variations of the setting. For more details about the module options, run python -m pipeline.relapose_regressor -h.

Training

Here we show an example how to train an EssNet model on ShopFacade scene.

python -m pipeline.relapose_regressor \
        --gpu 0 -b 16 --train -val 20 --epoch 200 \
        --data_root 'data/datasets_480' -ds 'CambridgeLandmarks' \
        --incl_sces 'ShopFacade' \
        -rs 480 --crop 448 --normalize \
        --ess_proj --network 'EssNet' --with_ess\
        --pair 'train_pairs.30nn.medium.txt' -vpair 'val_pairs.5nn.medium.txt' \
        -lr 0.0001 -wd 0.000001 \
        --odir  'output/regression_models/example' \
        -vp 9333 -vh 'localhost' -venv 'main' -vwin 'example.shopfacade' 

This command produces outputs are available online here.

Visdom (optional)

As you see in the example above, we use Visdom server to visualize the training process. One can adapt the meters to plot inside utils/common/visdom.py. If you DON'T want to use visdom, just remove the last line -vp 9333 -vh 'localhost' -venv 'main' -vwin 'example.shopfacade'.

Trained models and weights

We release all trained models that are used in our paper. One can download them from pretrained regression models. We also provide some pretrained weights on MegaDepth/ScanNet.

Testing

Here is a piece of code to test the example model above.

python -m pipeline.relapose_regressor \
        --gpu 2 -b 16  --test \
        --data_root 'data/datasets_480' -ds 'CambridgeLandmarks' \
        --incl_sces 'ShopFacade' \
        -rs 480 --crop 448 --normalize\
        --ess_proj --network 'EssNet'\
        --pair 'test_pairs.5nn.300cm50m.vlad.minmax.txt'\
        --resume 'output/regression_models/example/ckpt/checkpoint_140_0.36m_1.97deg.pth' \
        --odir 'output/regression_models/example'

This testing code outputs are shown in test_results.txt. For convenience, we also provide notebooks/eval_regression_models.ipynb to perform evaluation.

Hybrid: Learnable Matching + 5-Point Solver

In this method, the code of the NCNet is taken from the original implementation https://github.com/ignacio-rocco/ncnet. We use their pre-trained model but we only use the weights for neighbourhood consensus(NC-Matching), i.e., the 4d-conv layer weights. For convenience, you can download our parsed version nc_ivd_5ep.pth. The models for feature extractor initialization needs to be downloaded from pretrained regression models in advance, if you want to test them.

Testing example for NC-EssNet(7S)+NCM+5Pt (Paper.Tab2)

In this example, we use NCEssNet trained on 7Scenes for 60 epochs to extract features and use the pre-trained NC Matching layer to get the point matches. Finally the 5 point solver calculates the essential matrix. The model is evaluated on CambridgeLandmarks.

# 
python -m pipeline.ncmatch_5pt \
    --data_root 'data/datasets_original' \
    --dataset 'CambridgeLandmarks' \
    --pair_txt 'test_pairs.5nn.300cm50m.vlad.minmax.txt' \
    --cv_ransac_thres 4.0\
    --loc_ransac_thres 15\
    --feat 'output/regression_models/448_normalize/nc-essnet/7scenes/checkpoint_60_0.04m_1.62deg.pth'\
    --ncn 'output/pretrained_weights/nc_ivd_5ep.pth' \    
    --posfix 'essncn_7sc_60ep+ncn'\
    --match_save_root 'output/ncmatch_5pt/saved_matches'\
    --ncn_thres 0.9 \
    --gpu 2\
    -o 'output/ncmatch_5pt/loc_results/Cambridge/essncn_7sc_60ep+ncn.txt' 

Example outputs is available in essncn_7sc_60ep+ncn.txt. If you don't want to save THE intermediate matches extracted, remove THE option --match_save_root.

Owner
Qunjie Zhou
PhD Candidate at the Dynamic Vision and Learning Group.
Qunjie Zhou
Ros2-voiceroid2 - ROS2 wrapper package of VOICEROID2

ros2_voiceroid2 ROS2 wrapper package of VOICEROID2 Windows Only Installation Ins

Nkyoku 1 Jan 23, 2022
A production-ready, scalable Indexer for the Jina neural search framework, based on HNSW and PSQL

🌟 HNSW + PostgreSQL Indexer HNSWPostgreSQLIndexer Jina is a production-ready, scalable Indexer for the Jina neural search framework. It combines the

Jina AI 25 Oct 14, 2022
A more easy-to-use implementation of KPConv

A more easy-to-use implementation of KPConv This repo contains a more easy-to-use implementation of KPConv based on PyTorch. Introduction KPConv is a

Zheng Qin 35 Dec 14, 2022
Contains supplementary materials for reproduce results in HMC divergence time estimation manuscript

Scalable Bayesian divergence time estimation with ratio transformations This repository contains the instructions and files to reproduce the analyses

Suchard Research Group 1 Sep 21, 2022
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

eXtreme Gradient Boosting Community | Documentation | Resources | Contributors | Release Notes XGBoost is an optimized distributed gradient boosting l

Distributed (Deep) Machine Learning Community 23.6k Dec 31, 2022
Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning

Human-Level Control through Deep Reinforcement Learning Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning. This imp

Devsisters Corp. 2.4k Dec 26, 2022
[NeurIPS 2021] Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods

Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods Large Scale Learning on Non-Homophilous Graphs: New Benchmark

60 Jan 03, 2023
Human Pose Detection on EdgeTPU

Coral PoseNet Pose estimation refers to computer vision techniques that detect human figures in images and video, so that one could determine, for exa

google-coral 476 Dec 31, 2022
Official Code for ICML 2021 paper "Revisiting Point Cloud Shape Classification with a Simple and Effective Baseline"

Revisiting Point Cloud Shape Classification with a Simple and Effective Baseline Ankit Goyal, Hei Law, Bowei Liu, Alejandro Newell, Jia Deng Internati

Princeton Vision & Learning Lab 115 Jan 04, 2023
Code samples for my book "Neural Networks and Deep Learning"

Code samples for "Neural Networks and Deep Learning" This repository contains code samples for my book on "Neural Networks and Deep Learning". The cod

Michael Nielsen 13.9k Dec 26, 2022
MXNet implementation for: Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution

Octave Convolution MXNet implementation for: Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution Imag

Meta Research 549 Dec 28, 2022
[AAAI 2022] Sparse Structure Learning via Graph Neural Networks for Inductive Document Classification

Sparse Structure Learning via Graph Neural Networks for inductive document classification Make graph dataset create co-occurrence graph for datasets.

16 Dec 22, 2022
Official repository of IMPROVING DEEP IMAGE MATTING VIA LOCAL SMOOTHNESS ASSUMPTION.

IMPROVING DEEP IMAGE MATTING VIA LOCAL SMOOTHNESS ASSUMPTION This is the official repository of IMPROVING DEEP IMAGE MATTING VIA LOCAL SMOOTHNESS ASSU

电线杆 14 Dec 15, 2022
8-week curriculum for AI Builders

curriculum 8-week curriculum for AI Builders สารบัญ บทที่ 1 - Machine Learning คืออะไร บทที่ 2 - ชุดข้อมูลมหัศจรรย์และถิ่นที่อยู่ บทที่ 3 - Stochastic

AI Builders 134 Jan 03, 2023
Some experiments with tennis player aging curves using Hilbert space GPs in PyMC. Only experimental for now.

NOTE: This is still being developed! Setup notes This document uses Jeff Sackmann's tennis data. You can obtain it as follows: git clone https://githu

Martin Ingram 1 Jan 20, 2022
BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation

BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation This is a demo implementation of BYOL for Audio (BYOL-A), a self-sup

NTT Communication Science Laboratories 160 Jan 04, 2023
A faster pytorch implementation of faster r-cnn

A Faster Pytorch Implementation of Faster R-CNN Write at the beginning [05/29/2020] This repo was initaited about two years ago, developed as the firs

Jianwei Yang 7.1k Jan 01, 2023
Official PyTorch Implementation of "Self-supervised Auxiliary Learning with Meta-paths for Heterogeneous Graphs". NeurIPS 2020.

Self-supervised Auxiliary Learning with Meta-paths for Heterogeneous Graphs This repository is the implementation of SELAR. Dasol Hwang* , Jinyoung Pa

MLV Lab (Machine Learning and Vision Lab at Korea University) 48 Nov 09, 2022
Unsupervised Representation Learning via Neural Activation Coding

Neural Activation Coding This repository contains the code for the paper "Unsupervised Representation Learning via Neural Activation Coding" published

yookoon park 5 May 26, 2022
SMPL-X: A new joint 3D model of the human body, face and hands together

SMPL-X: A new joint 3D model of the human body, face and hands together [Paper Page] [Paper] [Supp. Mat.] Table of Contents License Description News I

Vassilis Choutas 1k Jan 09, 2023