A 2D Visual Localization Framework based on Essential Matrices [ICRA2020]

Overview

This repository provides the implementation of our ICRA 2020 paper: To Learn or Not to Learn: Visual Localization from Essential Matrices

Pipeline

To use our code, first download the repository:

git clone git@github.com:GrumpyZhou/visloc-relapose.git

Setup Running Environment

We have tested the code on Ubuntu 16.04.6 under the following environments:

Python 3.6 / 3.7
Pytorch 0.4.0 / 1.0 / 1.1 
CUDA 8.0 + CUDNN 8.0v5.1
CUDA 10.0 + CUDNN 10.0v7.5.1.10

The setting we used in the paper is:
Python 3.7 + Pytorch 1.1 + CUDA 10.0 + CUDNN 10.0v7.5.1.10

We recommend using Anaconda to manage packages. Run the following lines to automatically set up a ready-to-use environment for our code.

conda env create -f environment.yml  # Note: this installs the latest PyTorch version.
conda activate relapose

Otherwise, one can install all required packages separately according to their official documentation.

Prepare Datasets

Our code can be evaluated on various localization datasets. We use the Cambridge Landmarks dataset as an example to show how to prepare a dataset:

  1. Create data/ folder
  2. Download original Cambridge Landmarks Dataset and extract it to $CAMBRIDGE_DIR$.
  3. Construct the following folder structure in order to conveniently run all scripts in this repo:
    cd visloc-relapose/
    mkdir data
    mkdir data/datasets_original
    cd data/datasets_original
    ln -s $CAMBRIDGE_DIR$ CambridgeLandmarks
    
  4. Download our pairs for training, validation and testing. For the format of our pairs, check the readme.
  5. Place the pairs in the corresponding folders under data/datasets_original/CambridgeLandmarks.
  6. Pre-save images resized to 480 to speed up data loading for regression models (optional, but recommended)
    cd visloc-relapose/
    python -m utils.datasets.resize_dataset \
        --base_dir data/datasets_original/CambridgeLandmarks \
        --save_dir data/datasets_480/CambridgeLandmarks \
        --resize 480 --copy_txt True
    
  7. Test your setup by visualizing the data using notebooks/data_loading.ipynb.
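
Before opening the notebook, it can help to confirm that the folders from the steps above exist. The following is a minimal sketch, not part of the repository: the function name check_layout and the split into required and optional folders are assumptions made for illustration, and only the folder paths come from the steps above.

```python
from pathlib import Path

# Folder paths taken from the preparation steps above.
# The resized copy is only present if the optional step 6 was run.
REQUIRED = ["data/datasets_original/CambridgeLandmarks"]
OPTIONAL = ["data/datasets_480/CambridgeLandmarks"]


def check_layout(repo_root="."):
    """Return (missing_required, missing_optional) as lists of relative paths.

    Path.is_dir() follows symlinks, so a dataset linked with `ln -s` counts
    as present as long as the link target exists.
    """
    root = Path(repo_root)
    missing_required = [p for p in REQUIRED if not (root / p).is_dir()]
    missing_optional = [p for p in OPTIONAL if not (root / p).is_dir()]
    return missing_required, missing_optional


if __name__ == "__main__":
    required, optional = check_layout(".")
    for path in required:
        print("missing (required):", path)
    for path in optional:
        print("missing (optional):", path)
    if not required:
        print("required dataset folders found")
```

Run it from the visloc-relapose/ folder. A missing optional folder only means the resized images have not been generated yet.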

7Scenes Datasets

We follow the camera pose label convention of Cambridge Landmarks dataset. Similarly, you can download our pairs for 7Scenes. For other datasets, contact me for information about preprocessing and pair generation.
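
For readers preparing another dataset, here is a minimal sketch of reading pose labels in that convention. It assumes the layout used by the original Cambridge Landmarks label files, where each data line holds an image path followed by the camera position (X Y Z) and the orientation quaternion (W P Q R). The function names parse_pose_line and load_poses are hypothetical and not part of this repository, and the sample values are made up.

```python
def parse_pose_line(line):
    """Parse 'image_path X Y Z W P Q R' into (path, position, quaternion).

    Raises ValueError if the line does not have exactly 8 fields or if a
    pose field is not a number.
    """
    parts = line.split()
    if len(parts) != 8:
        raise ValueError("expected 8 fields, got %d: %r" % (len(parts), line))
    values = [float(v) for v in parts[1:]]
    return parts[0], tuple(values[:3]), tuple(values[3:])


def load_poses(lines):
    """Collect poses from an iterable of lines, skipping non-data lines.

    Header lines, blank lines and malformed lines are skipped silently,
    which is convenient for a sketch but hides corrupted labels.
    """
    poses = []
    for line in lines:
        try:
            poses.append(parse_pose_line(line))
        except ValueError:
            continue
    return poses


# Illustrative input only; these numbers are not taken from the dataset.
sample = [
    "Visual Landmark Dataset V1",
    "ImageFile, Camera Position [X Y Z W P Q R]",
    "",
    "seq1/frame00001.png 1.0 2.0 3.0 1.0 0.0 0.0 0.0",
]
for path, position, quaternion in load_poses(sample):
    print(path, position, quaternion)
```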

Feature-based: SIFT + 5-Point Solver

We use the SIFT feature extractor and feature matcher in colmap. One can follow the installation guide to install colmap. We save colmap outputs in database format, see explanation.

Preparing SIFT features

Execute the following commands to run SIFT extraction and matching on CambridgeLandmarks:

cd visloc-relapose/
bash prepare_colmap_data.sh  CambridgeLandmarks

Here CambridgeLandmarks is the folder name that is consistent with the dataset folder. So you can also use other dataset names such as 7Scenes if you have prepared the dataset properly in advance.

Evaluate SIFT within our pipeline

Example of running SIFT + 5-point solver on Cambridge Landmarks:

python -m pipeline.sift_5pt \
        --data_root 'data/datasets_original/' \
        --dataset 'CambridgeLandmarks' \
        --pair_txt 'test_pairs.5nn.300cm50m.vlad.minmax.txt' \
        --cv_ransac_thres 0.5\
        --loc_ransac_thres 5\
        -odir 'output/sift_5pt'\
        -log 'results.dvlad.minmax.txt'

For more evaluation examples, see sift_5pt.sh. Check the example outputs. Visualize SIFT correspondences using notebooks/visualize_sift_matches.ipynb.

Learning-based: Direct Regression via EssNet

The pipeline.relapose_regressor module can be used for both training and testing our regression networks defined under networks/, e.g., EssNet, NCEssNet, RelaPoseNet. We provide training and testing examples in regression.sh. The module supports many variations of the settings. For more details about the module options, run python -m pipeline.relapose_regressor -h.

Training

Here we show an example of how to train an EssNet model on the ShopFacade scene.

python -m pipeline.relapose_regressor \
        --gpu 0 -b 16 --train -val 20 --epoch 200 \
        --data_root 'data/datasets_480' -ds 'CambridgeLandmarks' \
        --incl_sces 'ShopFacade' \
        -rs 480 --crop 448 --normalize \
        --ess_proj --network 'EssNet' --with_ess\
        --pair 'train_pairs.30nn.medium.txt' -vpair 'val_pairs.5nn.medium.txt' \
        -lr 0.0001 -wd 0.000001 \
        --odir  'output/regression_models/example' \
        -vp 9333 -vh 'localhost' -venv 'main' -vwin 'example.shopfacade' 

The outputs produced by this command are available online here.

Visdom (optional)

As the example above shows, we use a Visdom server to visualize the training process. One can adapt the meters to plot inside utils/common/visdom.py. If you DON'T want to use visdom, just remove the last line -vp 9333 -vh 'localhost' -venv 'main' -vwin 'example.shopfacade' and the trailing backslash on the line before it.

Trained models and weights

We release all trained models that are used in our paper. One can download them from pretrained regression models. We also provide some pretrained weights on MegaDepth/ScanNet.

Testing

Here is a piece of code to test the example model above.

python -m pipeline.relapose_regressor \
        --gpu 2 -b 16  --test \
        --data_root 'data/datasets_480' -ds 'CambridgeLandmarks' \
        --incl_sces 'ShopFacade' \
        -rs 480 --crop 448 --normalize\
        --ess_proj --network 'EssNet'\
        --pair 'test_pairs.5nn.300cm50m.vlad.minmax.txt'\
        --resume 'output/regression_models/example/ckpt/checkpoint_140_0.36m_1.97deg.pth' \
        --odir 'output/regression_models/example'

The outputs of this test command are shown in test_results.txt. For convenience, we also provide notebooks/eval_regression_models.ipynb to perform evaluation.
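
The checkpoint name in the command above carries values in meters and degrees, which read like a position error and an orientation error. As a hedged illustration of how such errors are commonly computed in visual localization, and not necessarily how this repository computes them, the sketch below measures the Euclidean distance between camera positions and the angle between orientation quaternions. All function names are hypothetical.

```python
import math


def translation_error(t_pred, t_gt):
    """Euclidean distance between predicted and ground-truth positions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t_pred, t_gt)))


def rotation_error_deg(q_pred, q_gt):
    """Angle in degrees between two orientation quaternions.

    The quaternions need not be normalized. The absolute value of the dot
    product treats q and -q as the same rotation.
    """
    dot = sum(a * b for a, b in zip(q_pred, q_gt))
    norm_pred = math.sqrt(sum(a * a for a in q_pred))
    norm_gt = math.sqrt(sum(b * b for b in q_gt))
    cos_half = min(1.0, abs(dot) / (norm_pred * norm_gt))  # clamp rounding error
    return math.degrees(2.0 * math.acos(cos_half))


# Illustrative values only: identity versus a 90 degree turn about the z axis.
q_identity = (1.0, 0.0, 0.0, 0.0)
q_turn = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
print(translation_error((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
print(rotation_error_deg(q_identity, q_turn))
```

Per-scene results are then usually summarized by the median of these two errors over all test queries.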

Hybrid: Learnable Matching + 5-Point Solver

In this method, the NCNet code is taken from the original implementation https://github.com/ignacio-rocco/ncnet. We use their pre-trained model, but only the weights for neighbourhood consensus (NC-Matching), i.e., the 4D convolution layer weights. For convenience, you can download our parsed version nc_ivd_5ep.pth. If you want to test the models used for feature extractor initialization, download them in advance from pretrained regression models.

Testing example for NC-EssNet(7S)+NCM+5Pt (Paper.Tab2)

In this example, we use NCEssNet trained on 7Scenes for 60 epochs to extract features and use the pre-trained NC Matching layer to get the point matches. Finally the 5 point solver calculates the essential matrix. The model is evaluated on CambridgeLandmarks.
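
To make the last step concrete, the sketch below shows the relation that an essential matrix encodes. It does not implement the 5-point solver or the repository's pipeline. It assumes the convention X2 = R X1 + t for the relative pose between the two cameras, under which E = [t]x R and every correct match in normalized image coordinates satisfies x2^T E x1 = 0. All names and numeric values are illustrative.

```python
import math


def skew(t):
    """Return the 3x3 cross-product matrix [t]x of a 3-vector."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec3(a, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def essential_from_pose(R, t):
    """Essential matrix E = [t]x R for the convention X2 = R X1 + t."""
    return matmul3(skew(t), R)


def epipolar_residual(E, x1, x2):
    """Evaluate x2^T E x1 for homogeneous normalized image points."""
    Ex1 = matvec3(E, x1)
    return sum(x2[i] * Ex1[i] for i in range(3))


# Illustrative relative pose: 30 degree rotation about z plus a translation.
angle = math.radians(30.0)
R = [[math.cos(angle), -math.sin(angle), 0.0],
     [math.sin(angle), math.cos(angle), 0.0],
     [0.0, 0.0, 1.0]]
t = [1.0, 0.2, 0.0]

# Project one 3D point into both cameras (normalized coordinates).
X1 = [0.5, -0.3, 4.0]
X2 = [r + ti for r, ti in zip(matvec3(R, X1), t)]
x1 = [c / X1[2] for c in X1]
x2 = [c / X2[2] for c in X2]

E = essential_from_pose(R, t)
print(abs(epipolar_residual(E, x1, x2)))  # close to zero for a correct match
```

Because the constraint is unchanged when E is multiplied by any non-zero scalar, an essential matrix determines the translation direction between two cameras but not its length.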

# 
python -m pipeline.ncmatch_5pt \
    --data_root 'data/datasets_original' \
    --dataset 'CambridgeLandmarks' \
    --pair_txt 'test_pairs.5nn.300cm50m.vlad.minmax.txt' \
    --cv_ransac_thres 4.0\
    --loc_ransac_thres 15\
    --feat 'output/regression_models/448_normalize/nc-essnet/7scenes/checkpoint_60_0.04m_1.62deg.pth'\
    --ncn 'output/pretrained_weights/nc_ivd_5ep.pth' \
    --posfix 'essncn_7sc_60ep+ncn'\
    --match_save_root 'output/ncmatch_5pt/saved_matches'\
    --ncn_thres 0.9 \
    --gpu 2\
    -o 'output/ncmatch_5pt/loc_results/Cambridge/essncn_7sc_60ep+ncn.txt' 

Example outputs are available in essncn_7sc_60ep+ncn.txt. If you don't want to save the extracted intermediate matches, remove the option --match_save_root.

Owner
Qunjie Zhou
PhD Candidate at the Dynamic Vision and Learning Group.