[CVPR 2022 Oral] EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

Last update: Jan 04, 2023

Overview

EPro-PnP

EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation
In CVPR 2022 (Oral). [paper]
Hansheng Chen*^1,2, Pichao Wang†², Fan Wang², Wei Tian†¹, Lu Xiong¹, Hao Li²

¹Tongji University, ²Alibaba Group
*Part of work done during an internship at Alibaba Group.
†Corresponding Authors: Pichao Wang, Wei Tian.

Introduction

EPro-PnP is a probabilistic Perspective-n-Points (PnP) layer for end-to-end 6DoF pose estimation networks. Broadly speaking, it is essentially a continuous counterpart of the widely used categorical Softmax layer, and is theoretically generalizable to other learning models with nested optimization.

Given the layer input: an -point correspondence set consisting of 3D object coordinates , 2D image coordinates , and 2D weights , a conventional PnP solver searches for an optimal pose (rigid transformation in SE(3)) that minimizes the weighted reprojection error. Previous work tries to backpropagate through the PnP operation, yet is inherently non-differentiable due to the inner operation. This leads to convergence issue if all the components in must be learned by the network.

In contrast, our probabilistic PnP layer outputs a posterior distribution of pose, whose probability density can be derived for proper backpropagation. The distribution is approximated via Monte Carlo sampling. With EPro-PnP, the correspondences can be learned from scratch altogether by minimizing the KL divergence between the predicted and target pose distribution.

Models

We release two distinct networks trained with EPro-PnP:

EPro-PnP-6DoF for 6DoF pose estimation
EPro-PnP-Det for 3D object detection

Use EPro-PnP in Your Own Model

We provide a demo on the usage of the EPro-PnP layer.

Citation

If you find this project useful in your research, please consider citing:

@inproceedings{epropnp, 
  author = {Hansheng Chen and Pichao Wang and Fan Wang and Wei Tian and Lu Xiong and Hao Li, 
  title = {EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation}, 
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, 
  year = {2022}
}

[CVPR 2022 Oral] EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

Related tags

Overview

EPro-PnP

Introduction

Models

Use EPro-PnP in Your Own Model

Citation

Owner

同济大学智能汽车研究所综合感知研究组 ( Comprehensive Perception Research Group under Institute of Intelligent Vehicles, School of Automotive Studies, Tongji University)

Efficient 6-DoF Grasp Generation in Cluttered Scenes

PyTorch implementation of our ICCV 2019 paper: Liquid Warping GAN: A Unified Framework for Human Motion Imitation, Appearance Transfer and Novel View Synthesis

Danfeng Hong, Lianru Gao, Jing Yao, Bing Zhang, Antonio Plaza, Jocelyn Chanussot. Graph Convolutional Networks for Hyperspectral Image Classification, IEEE TGRS, 2021.

smc.covid is an R package related to the paper A sequential Monte Carlo approach to estimate a time varying reproduction number in infectious disease models: the COVID-19 case by Storvik et al

[AAAI 2022] Separate Contrastive Learning for Organs-at-Risk and Gross-Tumor-Volume Segmentation with Limited Annotation

Official PyTorch Implementation of "AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecasting".

Modular Gaussian Processes

hipCaffe: the HIP port of Caffe

Generating Anime Images by Implementing Deep Convolutional Generative Adversarial Networks paper

The source codes for TME-BNA: Temporal Motif-Preserving Network Embedding with Bicomponent Neighbor Aggregation.

This is a deep learning-based method to segment deep brain structures and a brain mask from T1 weighted MRI.

PyContinual (An Easy and Extendible Framework for Continual Learning)

Self-Supervised Document-to-Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference

Official repository for "Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring".

Chainer Implementation of Fully Convolutional Networks. (Training code to reproduce the original result is available.)

ManipNet: Neural Manipulation Synthesis with a Hand-Object Spatial Representation - SIGGRAPH 2021

Keeper for Ricochet Protocol, implemented with Apache Airflow

Attempt at implementation of a simple GAN using Keras

Pytorch implementation of Nueral Style transfer

ReGAN: Sequence GAN using RE[INFORCE|LAX|BAR] based PG estimators