Source code for ZePHyR: Zero-shot Pose Hypothesis Rating @ ICRA 2021

Overview

ZePHyR: Zero-shot Pose Hypothesis Rating

ZePHyR is a zero-shot 6D object pose estimation pipeline. The core is a learned scoring function that compares the sensor observation to a sparse object rendering of each candidate pose hypothesis. We used PointNet++ as the network structure and trained and tested on YCB-V and LM-O dataset.

[ArXiv] [Project Page] [Video] [BibTex]

ZePHyR pipeline animation

Get Started

First, checkout this repo by

git clone --recurse-submodules [email protected]:r-pad/zephyr.git

Set up environment

  1. We recommend building the environment and install all required packages using Anaconda.
conda env create -n zephyr --file zephyr_env.yml
conda activate zephyr
  1. Install the required packages for compiling the C++ module
sudo apt-get install build-essential cmake libopencv-dev python-numpy
  1. Compile the c++ library for python bindings in the conda virtual environment
mkdir build
cd build
cmake .. -DPYTHON_EXECUTABLE=$(python -c "import sys; print(sys.executable)") -DPYTHON_INCLUDE_DIR=$(python -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())")  -DPYTHON_LIBRARY=$(python -c "import distutils.sysconfig as sysconfig; print(sysconfig.get_config_var('LIBDIR'))")
make; make install
  1. Install the current python package
cd .. # move to the root folder of this repo
pip install -e .

Download pre-processed dataset

Download pre-processed training and testing data (ycbv_preprocessed.zip, lmo_preprocessed.zip and ppf_hypos.zip) from this Google Drive link and unzip it in the python/zephyr/data folder. The unzipped data takes around 66GB of storage in total.

The following commands need to be run in python/zephyr/ folder.

cd python/zephyr/

Example script to run the network

To use the network, an example is provided in notebooks/TestExample.ipynb. In the example script, a datapoint is loaded from LM-O dataset provided by the BOP Challenge. The pose hypotheses is provided by PPF algorithm (extracted from ppf_hypos.zip). Despite the complex dataloading code, only the following data of the observation and the model point clouds is needed to run the network:

  • img: RGB image, np.ndarray of size (H, W, 3) in np.uint8
  • depth: depth map, np.ndarray of size (H, W) in np.float, in meters
  • cam_K: camera intrinsic matrix, np.ndarray of size (3, 3) in np.float
  • model_colors: colors of model point cloud, np.ndarray of size (N, 3) in float, scaled in [0, 1]
  • model_points: xyz coordinates of model point cloud, np.ndarray of size (N, 3) in float, in meters
  • model_normals: normal vectors of mdoel point cloud, np.ndarray of size (N, 3) in float, each L2 normalized
  • pose_hypos: pose hypotheses in camera frame, np.ndarray of size (K, 4, 4) in float

Run PPF algorithm using HALCON software

The PPF algorithm we used is the surface matching function implmemented in MVTec HALCON software. HALCON provides a Python interface for programmers together with its newest versions. I wrote a simple wrapper which calls create_surface_model() and find_surface_model() to get the pose hypotheses. See notebooks/TestExample.ipynb for how to use it.

The wrapper requires the HALCON 21.05 to be installed, which is a commercial software but it provides free licenses for students.

If you don't have access to HALCON, sets of pre-estimated pose hypotheses are provided in the pre-processed dataset.

Test the network

Download the pretrained pytorch model checkpoint from this Google Drive link and unzip it in the python/zephyr/ckpts/ folder. We provide 3 checkpoints, two trained on YCB-V objects with odd ID (final_ycbv.ckpt) and even ID (final_ycbv_valodd.ckpt) respectively, and one trained on LM objects that are not in LM-O dataset (final_lmo.ckpt).

Test on YCB-V dataset

Test on the YCB-V dataset using the model trained on objects with odd ID

python test.py \
    --model_name pn2 \
    --dataset_root ./data/ycb/matches_data_test/ \
    --dataset_name ycbv \
    --dataset HSVD_diff_uv_norm \
    --no_valid_proj --no_valid_depth \
    --loss_cutoff log \
    --exp_name final \
    --resume_path ./ckpts/final_ycbv.ckpt

Test on the YCB-V dataset using the model trained on objects with even ID

python test.py \
    --model_name pn2 \
    --dataset_root ./data/ycb/matches_data_test/ \
    --dataset_name ycbv \
    --dataset HSVD_diff_uv_norm \
    --no_valid_proj --no_valid_depth \
    --loss_cutoff log \
    --exp_name final \
    --resume_path ./ckpts/final_ycbv_valodd.ckpt

Test on LM-O dataset

python test.py \
    --model_name pn2 \
    --dataset_root ./data/lmo/matches_data_test/ \
    --dataset_name lmo \
    --dataset HSVD_diff_uv_norm \
    --no_valid_proj --no_valid_depth \
    --loss_cutoff log \
    --exp_name final \
    --resume_path ./ckpts/final_lmo.ckpt

The testing results will be stored in test_logs and the results in BOP Challenge format will be in test_logs/bop_results. Please refer to bop_toolkit for converting the results to BOP Average Recall scores used in BOP challenge.

Train the network

Train on YCB-V dataset

These commands will train the network on the real-world images in the YCB-Video training set.

On object Set 1 (objects with odd ID)

python train.py \
    --model_name pn2 \
    --dataset_root ./data/ycb/matches_data_train/ \
    --dataset_name ycbv \
    --dataset HSVD_diff_uv_norm \
    --no_valid_proj --no_valid_depth \
    --loss_cutoff log \
    --exp_name final

On object Set 2 (objects with even ID)

python train.py \
    --model_name pn2 \
    --dataset_root ./data/ycb/matches_data_train/ \
    --dataset_name ycbv \
    --dataset HSVD_diff_uv_norm \
    --no_valid_proj --no_valid_depth \
    --loss_cutoff log \
    --val_obj odd \
    --exp_name final_valodd

Train on LM-O synthetic dataset

This command will train the network on the synthetic images provided by BlenderProc4BOP. We take the lm_train_pbr.zip as the training set but the network is only supervised on objects that is in Linemod but not in Linemod-Occluded (i.e. IDs for training objects are 2 3 4 7 13 14 15).

python train.py \
    --model_name pn2 \
    --dataset_root ./data/lmo/matches_data_train/ \
    --dataset_name lmo \
    --dataset HSVD_diff_uv_norm \
    --no_valid_proj --no_valid_depth \
    --loss_cutoff log \
    --exp_name final

Cite

If you find this codebase useful in your research, please consider citing:

@inproceedings{icra2021zephyr,
    title={ZePHyR: Zero-shot Pose Hypothesis Rating},
    author={Brian Okorn, Qiao Gu, Martial Hebert, David Held},
    booktitle={2021 International Conference on Robotics and Automation (ICRA)},
    year={2021}
}

Reference

Owner
R-Pad - Robots Perceiving and Doing
This is the repository for the R-Pad lab at CMU.
R-Pad - Robots Perceiving and Doing
Quickly and easily create / train a custom DeepDream model

Dream-Creator This project aims to simplify the process of creating a custom DeepDream model by using pretrained GoogleNet models and custom image dat

55 Dec 27, 2022
Heart Arrhythmia Classification

This program takes and input of an ECG in European Data Format (EDF) and outputs the classification for heartbeats into normal vs different types of arrhythmia . It uses a deep learning model for cla

4 Nov 02, 2022
Code and data for "TURL: Table Understanding through Representation Learning"

TURL This Repo contains code and data for "TURL: Table Understanding through Representation Learning". Environment and Setup Data Pretraining Finetuni

SunLab-OSU 63 Nov 23, 2022
Supervised domain-agnostic prediction framework for probabilistic modelling

A supervised domain-agnostic framework that allows for probabilistic modelling, namely the prediction of probability distributions for individual data

The Alan Turing Institute 112 Oct 23, 2022
This repository includes the official project for the paper: TransMix: Attend to Mix for Vision Transformers.

TransMix: Attend to Mix for Vision Transformers This repository includes the official project for the paper: TransMix: Attend to Mix for Vision Transf

Jie-Neng Chen 130 Jan 01, 2023
ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives

Status: Under development (expect bug fixes and huge updates) ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectiv

37 Dec 28, 2022
CondenseNet V2: Sparse Feature Reactivation for Deep Networks

CondenseNetV2 This repository is the official Pytorch implementation for "CondenseNet V2: Sparse Feature Reactivation for Deep Networks" paper by Le Y

Haojun Jiang 74 Dec 12, 2022
Official code repository for A Simple Long-Tailed Rocognition Baseline via Vision-Language Model.

This is the official code repository for A Simple Long-Tailed Rocognition Baseline via Vision-Language Model.

peng gao 42 Nov 26, 2022
Hcaptcha-challenger - Gracefully face hCaptcha challenge with Yolov5(ONNX) embedded solution

hCaptcha Challenger 🚀 Gracefully face hCaptcha challenge with Yolov5(ONNX) embe

593 Jan 03, 2023
Unofficial Implementation of Oboe (SIGCOMM'18').

Oboe-Reproduce This is the unofficial implementation of the paper "Oboe: Auto-tuning video ABR algorithms to network conditions, Zahaib Akhtar, Yun Se

Tianchi Huang 13 Nov 04, 2022
PyTorch implementation of "Efficient Neural Architecture Search via Parameters Sharing"

Efficient Neural Architecture Search (ENAS) in PyTorch PyTorch implementation of Efficient Neural Architecture Search via Parameters Sharing. ENAS red

Taehoon Kim 2.6k Dec 31, 2022
Creating Multi Task Models With Keras

Creating Multi Task Models With Keras About The Project! I used the keras and Tensorflow Library, To build a Deep Learning Neural Network to Creating

Srajan Chourasia 4 Nov 28, 2022
A web porting for NVlabs' StyleGAN2, to facilitate exploring all kinds characteristic of StyleGAN networks

This project is a web porting for NVlabs' StyleGAN2, to facilitate exploring all kinds characteristic of StyleGAN networks. Thanks for NVlabs' excelle

K.L. 150 Dec 15, 2022
This repo contains code to reproduce all experiments in Equivariant Neural Rendering

Equivariant Neural Rendering This repo contains code to reproduce all experiments in Equivariant Neural Rendering by E. Dupont, M. A. Bautista, A. Col

Apple 83 Nov 16, 2022
Machine Learning Privacy Meter: A tool to quantify the privacy risks of machine learning models with respect to inference attacks, notably membership inference attacks

ML Privacy Meter Machine learning is playing a central role in automated decision making in a wide range of organization and service providers. The da

Data Privacy and Trustworthy Machine Learning Research Lab 357 Jan 06, 2023
TorchGRL is the source code for our paper Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Mixed Traffic Environments for IV 2022.

TorchGRL TorchGRL is the source code for our paper Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Mixed Traffi

XXQQ 42 Dec 09, 2022
Exploiting a Zoo of Checkpoints for Unseen Tasks

Exploiting a Zoo of Checkpoints for Unseen Tasks This repo includes code to reproduce all results in the above Neurips paper, authored by Jiaji Huang,

Baidu Research 8 Sep 06, 2022
Employs neural networks to classify images into four categories: ship, automobile, dog or frog

Neural Net Image Classifier Employs neural networks to classify images into four categories: ship, automobile, dog or frog Viterbi_1.py uses a classic

Riley Baker 1 Jan 18, 2022
An improvement of FasterGICP: Acceptance-rejection Sampling based 3D Lidar Odometry

fasterGICP This package is an improvement of fast_gicp Please cite our paper if possible. W. Jikai, M. Xu, F. Farzin, D. Dai and Z. Chen, "FasterGICP:

79 Dec 31, 2022
A curated list of the top 10 computer vision papers in 2021 with video demos, articles, code and paper reference.

The Top 10 Computer Vision Papers of 2021 The top 10 computer vision papers in 2021 with video demos, articles, code, and paper reference. While the w

Louis-François Bouchard 118 Dec 21, 2022