Learning to Reconstruct 3D Manhattan Wireframes from a Single Image

Overview

Learning to Reconstruct 3D Manhattan Wireframes From a Single Image

This repository contains the PyTorch implementation of the paper: Yichao Zhou, Haozhi Qi, Yuexiang Zhai, Qi Sun, Zhili Chen, Li-Yi Wei, Yi Ma. "Learning to Reconstruct 3D Manhattan Wireframes From a Single Image", ICCV 2019.

Introduction

The goal of this project is to explore the idea of reconstructing high-quality compact CAD-like 3D models from images. We propose a method to create accurate 3D wireframe representation from a single image by exploiting global structural regularities. Our method uses a convolutional neural network to simultaneously detect salient junctions and straight lines, as well as predict their 3D depth and vanishing points.

Qualitative Results

Input Predicted Input Predicted

Code Structure

Below is a quick overview of the function of key files.

########################### Data ###########################
data/
    SU3/                        # default folder for the scenecity 3D dataset
logs/                           # default folder for storing the output during training
########################### Code ###########################
config/                         # neural network hyper-parameters and configurations
wireframe/                      # module so you can "import wireframe" in scripts
train.py                        # script for training and evaluating the neural network
vectorize_u3d.py                # script for turning the 2.5D results into 3D wireframe

Reproducing Results

Installation

You are suggested to install miniconda before following executing the following commands.

git clone https://github.com/zhou13/shapeunity
cd shapeunity
conda create -y -n shapeunity
source activate shapeunity
conda install -y pyyaml docopt matplotlib scikit-image opencv tqdm
# Replace cudatoolkit=10.2 with your CUDA version: https://pytorch.org/get-started/
conda install -y pytorch cudatoolkit=10.2 -c pytorch
python -m pip install --upgrade vispy cvxpy
mkdir data logs

Downloading the Processed Datasets

Make sure curl is installed on your system and execute

cd data
../misc/gdrive-download.sh 1-TABJjT4-_yzE-iRD-n_yIJ9Kwzzkm7X SU3.zip
unzip SU3.zip
rm *.zip
cd ..

Note: If your downloaded zip file is corrupted, it is likely due to the restriction on the amount of data that can be downloaded from my account per day. In that case, you can try to download the pre-processed dataset manually from our Google Drive and proceed accordingly.

Downloading the Pre-trained Models

Execute the following command to download and unzip the pre-trained models.

cd logs
../misc/gdrive-download.sh 1AuE3yje7jTRne2KjiVdxAWo1UT03i16a pretrained-wireframe.zip
../misc/gdrive-download.sh 1YwPMbAHnxSA3BgiM5Q26mKSTjd46OYRo pretrained-vanishing-points.zip
unzip pretrained-wireframe.zip
unzip pretrained-vanishing-points.zip
rm *.zip
cd ..

Alternatively, you can download them at this Google Drive link and this Google Drive link, respectively.

Training (Optional)

If you want to train the model yourself rather than using the pre-trained models, execute the following commands to train the neural networks from scratch with four GPUs (specified by -d 0,1,2,3):

python ./train.py -d 0,1,2,3 --identifier baseline config/hourglass.yaml

The checkpoints and logs will be written to logs/ accordingly.

We note that vanishing points are only supported by the neural network under the git branch vanishing-points. You need to visit that part of the code with git checkout vanishing-points for training the network with the vanishing point branch.

Predicting the 2.5D Wireframe (Optional)

Execute the following command to evaluate the neural network on the validation split:

python train.py --eval -d 0 -i default --from logs/pretrained-wireframe/checkpoint_latest.pth.tar logs/pretrained-wireframe/config.yaml

This command should generate a new folder under the logs directory with results in the npz folders.

Vectorization & Visualization

To visualize the working examples of ShapeUnity, execute the following commands:

python vectorize_u3d.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000 57
python vectorize_u3d.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000 100
python vectorize_u3d.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000 109
python vectorize_u3d.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000 141
python vectorize_u3d.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000 299

Evaluation (Optional)

To quantitatively evaluate the wireframe quality of ShapeUnity, execute the following command:

python eval_2d3d_metric.py logs/pretrained-wireframe/npz/003576000 --vpdir logs/pretrained-vanishing-points/npz/000096000

The details of the sAP-10 metric can be found in the paper LCNN.

Acknowledgement

This work is supported by a research grant from Sony Research. We thank Xili Dai for providing the sAP evaluation script for the project.

Citing ShapeUnity

If you find this project useful in your research, please consider citing:

@inproceedings{zhou2019learning,
  title={Learning to Reconstruct 3D Manhattan Wireframes From a Single Image},
  author={Zhou, Yichao and Qi, Haozhi and Zhai, Yuexiang and Sun, Qi and Chen, Zhili and Wei, Li-Yi and Ma, Yi},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2019}
}
Owner
Yichao Zhou
Apple Inc. | Ph.D. at UC Berkeley
Yichao Zhou
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning (ICLR 2021)

Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning (ICLR 2021) Citation Please cite as: @inproceedings{liu2020understan

Sunbow Liu 22 Nov 25, 2022
Official DGL implementation of "Rethinking High-order Graph Convolutional Networks"

SE Aggregation This is the implementation for Rethinking High-order Graph Convolutional Networks. Here we show the codes for citation networks as an e

Tianqi Zhang (张天启) 32 Jul 19, 2022
Causal Influence Detection for Improving Efficiency in Reinforcement Learning

Causal Influence Detection for Improving Efficiency in Reinforcement Learning This repository contains the code release for the paper "Causal Influenc

Autonomous Learning Group 21 Nov 29, 2022
Largest list of models for Core ML (for iOS 11+)

Since iOS 11, Apple released Core ML framework to help developers integrate machine learning models into applications. The official documentation We'v

Kedan Li 5.6k Jan 08, 2023
PyTorch code of "SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks"

SLAPS-GNN This repo contains the implementation of the model proposed in SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks

60 Dec 22, 2022
Code for paper "Context-self contrastive pretraining for crop type semantic segmentation"

Code for paper "Context-self contrastive pretraining for crop type semantic segmentation" Setting up a python environment Follow the instruction in ht

Michael Tarasiou 11 Oct 09, 2022
ICNet for Real-Time Semantic Segmentation on High-Resolution Images, ECCV2018

ICNet for Real-Time Semantic Segmentation on High-Resolution Images by Hengshuang Zhao, Xiaojuan Qi, Xiaoyong Shen, Jianping Shi, Jiaya Jia, details a

Hengshuang Zhao 594 Dec 31, 2022
Implementation of Convolutional enhanced image Transformer

CeiT : Convolutional enhanced image Transformer This is an unofficial PyTorch implementation of Incorporating Convolution Designs into Visual Transfor

Rishikesh (ऋषिकेश) 82 Dec 13, 2022
A Planar RGB-D SLAM which utilizes Manhattan World structure to provide optimal camera pose trajectory while also providing a sparse reconstruction containing points, lines and planes, and a dense surfel-based reconstruction.

ManhattanSLAM Authors: Raza Yunus, Yanyan Li and Federico Tombari ManhattanSLAM is a real-time SLAM library for RGB-D cameras that computes the camera

117 Dec 28, 2022
Creating multimodal multitask models

Fusion Brain Challenge The English version of the document can be found here. Обновления 01.11 Мы выкладываем пример данных, аналогичных private test

Sber AI 43 Nov 28, 2022
Repositorio oficial del curso IIC2233 Programación Avanzada 🚀✨

IIC2233 - Programación Avanzada Evaluación Las evaluaciones serán efectuadas por medio de actividades prácticas en clases y tareas. Se calculará la no

IIC2233 @ UC 47 Sep 06, 2022
Customer-Transaction-Analysis - This analysis is based on a synthesised transaction dataset containing 3 months worth of transactions for 100 hypothetical customers.

Customer-Transaction-Analysis - This analysis is based on a synthesised transaction dataset containing 3 months worth of transactions for 100 hypothetical customers. It contains purchases, recurring

Ayodeji Yekeen 1 Jan 01, 2022
Single-Stage Instance Shadow Detection with Bidirectional Relation Learning (CVPR 2021 Oral)

Single-Stage Instance Shadow Detection with Bidirectional Relation Learning (CVPR 2021 Oral) Tianyu Wang*, Xiaowei Hu*, Chi-Wing Fu, and Pheng-Ann Hen

Steve Wong 51 Oct 20, 2022
Canonical Capsules: Unsupervised Capsules in Canonical Pose (NeurIPS 2021)

Canonical Capsules: Unsupervised Capsules in Canonical Pose (NeurIPS 2021) Introduction This is the official repository for the PyTorch implementation

165 Dec 07, 2022
Underwater industrial application yolov5m6

This project wins the intelligent algorithm contest finalist award and stands out from over 2000teams in China Underwater Robot Professional Contest, entering the final of China Underwater Robot Prof

8 Nov 09, 2022
DiSECt: Differentiable Simulator for Robotic Cutting

DiSECt: Differentiable Simulator for Robotic Cutting Website | Paper | Dataset | Video | Blog post DiSECt is a simulator for the cutting of deformable

NVIDIA Research Projects 73 Oct 29, 2022
DCA - Official Python implementation of Delaunay Component Analysis algorithm

Delaunay Component Analysis (DCA) Official Python implementation of the Delaunay

Petra Poklukar 9 Sep 06, 2022
This repo holds code for TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

TransUNet This repo holds code for TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation Usage

1.4k Jan 04, 2023
This repo includes the supplementary of our paper "CEMENT: Incomplete Multi-View Weak-Label Learning with Long-Tailed Labels"

Supplementary Materials for CEMENT: Incomplete Multi-View Weak-Label Learning with Long-Tailed Labels This repository includes all supplementary mater

Zhiwei Li 0 Jan 05, 2022
PyTorch implementation of EGVSR: Efficcient & Generic Video Super-Resolution (VSR)

This is a PyTorch implementation of EGVSR: Efficcient & Generic Video Super-Resolution (VSR), using subpixel convolution to optimize the inference speed of TecoGAN VSR model. Please refer to the offi

789 Jan 04, 2023