Code for "PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation" CVPR 2019 oral

Related tags

Deep Learningpvnet
Overview

Good news! We release a clean version of PVNet: clean-pvnet, including

  1. how to train the PVNet on the custom dataset.
  2. Use PVNet with a detector.
  3. The training and testing on the tless dataset, where we detect multiple instances in an image.

PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation

introduction

PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation
Sida Peng, Yuan Liu, Qixing Huang, Xiaowei Zhou, Hujun Bao
CVPR 2019 oral
Project Page

Any questions or discussions are welcomed!

Truncation LINEMOD Dataset

Check TRUNCATION_LINEMOD.md for information about the Truncation LINEMOD dataset.

Installation

One way is to set up the environment with docker: How to install pvnet with docker.

Thanks Joe Dinius for providing the docker implementation.

Another way is to use the following commands.

  1. Set up python 3.6.7 environment
pip install -r requirements.txt

We need compile several files, which works fine with pytorch v0.4.1/v1.1 and gcc 5.4.0.

For users with a RTX GPU, you must use CUDA10 and pytorch v1.1 built from CUDA10.

  1. Compile the Ransac Voting Layer
ROOT=/path/to/pvnet
cd $ROOT/lib/ransac_voting_gpu_layer
python setup.py build_ext --inplace
  1. Compile some extension utils
cd $ROOT/lib/utils/extend_utils

Revise the cuda_include and dart in build_extend_utils_cffi.py to be compatible with the CUDA in your computer.

sudo apt-get install libgoogle-glog-dev=0.3.4-0.1
sudo apt-get install libsuitesparse-dev=1:4.4.6-1
sudo apt-get install libatlas-base-dev=3.10.2-9
python build_extend_utils_cffi.py

If you cannot install libsuitesparse-dev=1:4.4.6-1, please install libsuitesparse, run build_ceres.sh and move ceres/ceres-solver/build/lib/libceres.so* to lib/utils/extend_utils/lib.

Add the lib under extend_utils to the LD_LIBRARY_PATH

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/pvnet/lib/utils/extend_utils/lib

Dataset Configuration

Prepare the dataset

Download the LINEMOD, which can be found at here.

Download the LINEMOD_ORIG, which can be found at here.

Download the OCCLUSION_LINEMOD, which can be found at here.

Create the soft link

mkdir $ROOT/data
ln -s path/to/LINEMOD $ROOT/data/LINEMOD
ln -s path/to/LINEMOD_ORIG $ROOT/data/LINEMOD_ORIG
ln -s path/to/OCCLUSION_LINEMOD $ROOT/data/OCCLUSION_LINEMOD

Compute FPS keypoints

python lib/utils/data_utils.py

Synthesize images for each object

See pvnet-rendering for information about the image synthesis.

Demo

Download the pretrained model of cat from here and put it to $ROOT/data/model/cat_demo/199.pth.

Run the demo

python tools/demo.py

If setup correctly, the output will look like

cat

Visualization of the voting procedure

We add a jupyter notebook visualization.ipynb for the keypoint detection pipeline of PVNet, aiming to make it easier for readers to understand our paper. Thanks for Kudlur, M 's suggestion.

Training and testing

Training on the LINEMOD

Before training, remember to add the lib under extend_utils to the LD_LIDBRARY_PATH

export LD_LIDBRARY_PATH=$LD_LIDBRARY_PATH:/path/to/pvnet/lib/utils/extend_utils/lib

Training

python tools/train_linemod.py --cfg_file configs/linemod_train.json --linemod_cls cat

Testing

We provide the pretrained models of each object, which can be found at here.

Download the pretrained model and move it to $ROOT/data/model/{cls}_linemod_train/199.pth. For instance

mkdir $ROOT/data/model
mv ape_199.pth $ROOT/data/model/ape_linemod_train/199.pth

Testing

python tools/train_linemod.py --cfg_file configs/linemod_train.json --linemod_cls cat --test_model

Citation

If you find this code useful for your research, please use the following BibTeX entry.

@inproceedings{peng2019pvnet,
  title={PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation},
  author={Peng, Sida and Liu, Yuan and Huang, Qixing and Zhou, Xiaowei and Bao, Hujun},
  booktitle={CVPR},
  year={2019}
}

Acknowledgement

This work is affliated with ZJU-SenseTime Joint Lab of 3D Vision, and its intellectual property belongs to SenseTime Group Ltd.

Copyright (c) ZJU-SenseTime Joint Lab of 3D Vision. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Owner
ZJU3DV
ZJU3DV is a research group of State Key Lab of CAD&CG, Zhejiang University. We focus on the research of 3D computer vision, SLAM and AR.
ZJU3DV
Official PyTorch implementation of "Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics".

Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics This repository is the official PyTorch implementation of "Physics-aware Differ

USC-Melady 46 Nov 20, 2022
Intel® Neural Compressor is an open-source Python library running on Intel CPUs and GPUs

Intel® Neural Compressor targeting to provide unified APIs for network compression technologies, such as low precision quantization, sparsity, pruning, knowledge distillation, across different deep l

Intel Corporation 846 Jan 04, 2023
Generalized Data Weighting via Class-level Gradient Manipulation

Generalized Data Weighting via Class-level Gradient Manipulation This repository is the official implementation of Generalized Data Weighting via Clas

18 Nov 12, 2022
Code To Tune or Not To Tune? Zero-shot Models for Legal Case Entailment.

COLIEE 2021 - task 2: Legal Case Entailment This repository contains the code to reproduce NeuralMind's submissions to COLIEE 2021 presented in the pa

NeuralMind 13 Dec 16, 2022
[ICCV 2021] Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

MAED: Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation Getting Started Our codes are implemented and tested with pyth

ZiNiU WaN 176 Dec 15, 2022
Unbiased Learning To Rank Algorithms (ULTRA)

This is an Unbiased Learning To Rank Algorithms (ULTRA) toolbox, which provides a codebase for experiments and research on learning to rank with human annotated or noisy labels.

71 Dec 01, 2022
adversarial_multi_armed_bandit_variable_plays

Adversarial Multi-Armed Bandit with Variable Plays This code is for paper: Adversarial Online Learning with Variable Plays in the Evasion-and-Pursuit

Yiyang Wang 1 Oct 28, 2021
The official repository for paper ''Domain Generalization for Vision-based Driving Trajectory Generation'' submitted to ICRA 2022

DG-TrajGen The official repository for paper ''Domain Generalization for Vision-based Driving Trajectory Generation'' submitted to ICRA 2022. Our Meth

Wang 25 Sep 26, 2022
Tensorflow 2.x implementation of Panoramic BlitzNet for object detection and semantic segmentation on indoor panoramic images.

Deep neural network for object detection and semantic segmentation on indoor panoramic images. The implementation is based on the papers:

Alejandro de Nova Guerrero 9 Nov 24, 2022
Repository for MuSiQue: Multi-hop Questions via Single-hop Question Composition

🎵 MuSiQue: Multi-hop Questions via Single-hop Question Composition This is the repository for our paper "MuSiQue: Multi-hop Questions via Single-hop

21 Jan 02, 2023
Patches desktop steam to look like the new steamdeck ui.

steam_deck_ui_patch The Deck UI patch will patch the regular desktop steam to look like the brand new SteamDeck UI. This patch tool currently works on

The_IT_Dude 3 Aug 29, 2022
This is an example of a reproducible modelling project

An example of a reproducible modelling project What are we doing? This example was created for the 2021 fall lecture series of Stanford's Center for O

Armin Thomas 2 Oct 26, 2021
Bottom-up Human Pose Estimation

Introduction This is the official code of Rethinking the Heatmap Regression for Bottom-up Human Pose Estimation. This paper has been accepted to CVPR2

108 Dec 01, 2022
A disassembler for the RP2040 Programmable I/O State-machine!

piodisasm A disassembler for the RP2040 Programmable I/O State-machine! Usage Just run piodisasm.py on a file that contains the PIO code as hex! (Such

Ghidra Ninja 29 Dec 06, 2022
Learning to trade under the reinforcement learning framework

Trading Using Q-Learning In this project, I will present an adaptive learning model to trade a single stock under the reinforcement learning framework

Uirá Caiado 470 Nov 28, 2022
External Attention Network

Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks paper : https://arxiv.org/abs/2105.02358 Jittor code will come soon

MenghaoGuo 357 Dec 11, 2022
Code of the lileonardo team for the 2021 Emotion and Theme Recognition in Music task of MediaEval 2021

Emotion and Theme Recognition in Music The repository contains code for the submission of the lileonardo team to the 2021 Emotion and Theme Recognitio

Vincent Bour 8 Aug 02, 2022
Pytorch implementation for DFN: Distributed Feedback Network for Single-Image Deraining.

DFN:Distributed Feedback Network for Single-Image Deraining Abstract Recently, deep convolutional neural networks have achieved great success for sing

6 Nov 05, 2022
The repository for freeCodeCamp's YouTube course, Algorithmic Trading in Python

Algorithmic Trading in Python This repository Course Outline Section 1: Algorithmic Trading Fundamentals What is Algorithmic Trading? The Differences

Nick McCullum 1.8k Jan 02, 2023
Code base for reproducing results of I.Schubert, D.Driess, O.Oguz, and M.Toussaint: Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics. NeurIPS (2021)

Learning to Execute (L2E) Official code base for completely reproducing all results reported in I.Schubert, D.Driess, O.Oguz, and M.Toussaint: Learnin

3 May 18, 2022