Implementation of ICCV2021(Oral) paper - VMNet: Voxel-Mesh Network for Geodesic-aware 3D Semantic Segmentation

Related tags

Deep LearningVMNet
Overview

VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation

Framework Fig

Created by Zeyu HU

Introduction

This work is based on our paper VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation, which appears at the IEEE International Conference on Computer Vision (ICCV) 2021.

In recent years, sparse voxel-based methods have become the state-of-the-arts for 3D semantic segmentation of indoor scenes, thanks to the powerful 3D CNNs. Nevertheless, being oblivious to the underlying geometry, voxel-based methods suffer from ambiguous features on spatially close objects and struggle with handling complex and irregular geometries due to the lack of geodesic information. In view of this, we present Voxel-Mesh Network (VMNet), a novel 3D deep architecture that operates on the voxel and mesh representations leveraging both the Euclidean and geodesic information. Intuitively, the Euclidean information extracted from voxels can offer contextual cues representing interactions between nearby objects, while the geodesic information extracted from meshes can help separate objects that are spatially close but have disconnected surfaces. To incorporate such information from the two domains, we design an intra-domain attentive module for effective feature aggregation and an inter-domain attentive module for adaptive feature fusion. Experimental results validate the effectiveness of VMNet: specifically, on the challenging ScanNet dataset for large-scale segmentation of indoor scenes, it outperforms the state-of-the-art SparseConvNet and MinkowskiNet (74.6% vs 72.5% and 73.6% in mIoU) with a simpler network structure (17M vs 30M and 38M parameters).

Citation

If you find our work useful in your research, please consider citing:

@misc{hu2021vmnet,
      title={VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation}, 
      author={Zeyu Hu and Xuyang Bai and Jiaxiang Shang and Runze Zhang and Jiayu Dong and Xin Wang and Guangyuan Sun and Hongbo Fu and Chiew-Lan Tai},
      year={2021},
      eprint={2107.13824},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Installation

  • Our code is based on Pytorch. Please make sure CUDA and cuDNN are installed. One configuration has been tested:

    • Python 3.7
    • Pytorch 1.4.0
    • torchvision 0.5.0
    • CUDA 10.0
    • cudatoolkit 10.0.130
    • cuDNN 7.6.5
  • VMNet depends on the torch-geometric and torchsparse libraries. Please follow their installation instructions. One configuration has been tested, higher versions should work as well:

    • torch-geometric 1.6.3
    • torchsparse 1.1.0
  • We adapted VCGlib to generate pooling trace maps for vertex clustering and quadric error metrics.

    git clone https://github.com/cnr-isti-vclab/vcglib
    
    # QUADRIC ERROR METRICS
    cd vcglib/apps/tridecimator/
    qmake
    make
    
    # VERTEX CLUSTERING
    cd ../sample/trimesh_clustering
    qmake
    make
    

    Please add vcglib/apps/tridecimator and vcglib/apps/sample/trimesh_clustering to your environment path variable.

  • Other dependencies. One configuration has been tested:

    • open3d 0.9.0
    • plyfile 0.7.3
    • scikit-learn 0.24.0
    • scipy 1.6.0

Data Preparation

  • Please refer to https://github.com/ScanNet/ScanNet and https://github.com/niessner/Matterport to get access to the ScanNet and Matterport dataset. Our method relies on the .ply as well as the .labels.ply files. We take ScanNet dataset as example for the following instructions.

  • Create directories to store processed data.

    • 'path/to/processed_data/train/'
    • 'path/to/processed_data/val/'
    • 'path/to/processed_data/test/'
  • Prepare train data.

    python prepare_data.py --considered_rooms_path dataset/data_split/scannetv2_train.txt --in_path path/to/ScanNet/scans --out_path path/to/processed_data/train/
    
  • Prepare val data.

    python prepare_data.py --considered_rooms_path dataset/data_split/scannetv2_val.txt --in_path path/to/ScanNet/scans --out_path path/to/processed_data/val/
    
  • Prepare test data.

    python prepare_data.py --test_split --considered_rooms_path dataset/data_split/scannetv2_test.txt --in_path path/to/ScanNet/scans_test --out_path path/to/processed_data/test/
    

Train

  • On train/val/test setting.

    CUDA_VISIBLE_DEVICES=0 python run.py --train --exp_name name_you_want --data_path path/to/processed_data
    
  • On train+val/test setting (for ScanNet benchmark).

    CUDA_VISIBLE_DEVICES=0 python run.py --train_benchmark --exp_name name_you_want --data_path path/to/processed_data
    

Inference

  • Validation. Pretrained model (73.3% mIoU on ScanNet Val). Please download and put into directory check_points/val_split.

    CUDA_VISIBLE_DEVICES=0 python run.py --val --exp_name val_split --data_path path/to/processed_data
    
  • Test. Pretrained model (74.6% mIoU on ScanNet Test). Please download and put into directory check_points/test_split. TxT files for benchmark submission will be saved in directory test_results/.

    CUDA_VISIBLE_DEVICES=0 python run.py --test --exp_name test_split --data_path path/to/processed_data
    

Acknowledgements

Our code is built upon torch-geometric, torchsparse and dcm-net.

License

Our code is released under MIT License (see LICENSE file for details).

Owner
HU Zeyu
HU Zeyu
Multi-objective constrained optimization for energy applications via tree ensembles

Multi-objective constrained optimization for energy applications via tree ensembles

C⚙G - Imperial College London 1 Nov 19, 2021
Laplace Redux -- Effortless Bayesian Deep Learning

Laplace Redux - Effortless Bayesian Deep Learning This repository contains the code to run the experiments for the paper Laplace Redux - Effortless Ba

Runa Eschenhagen 28 Dec 07, 2022
A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)

A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)

Vikash Sehwag 65 Dec 19, 2022
Keras-1D-NN-Classifier

Keras-1D-NN-Classifier This code is based on the reference codes linked below. reference 1, reference 2 This code is for 1-D array data classification

Jae-Hoon Shim 6 May 18, 2021
Deep Learning Based EDM Subgenre Classification using Mel-Spectrogram and Tempogram Features"

EDM-subgenre-classifier This repository contains the code for "Deep Learning Based EDM Subgenre Classification using Mel-Spectrogram and Tempogram Fea

11 Dec 20, 2022
This repository contains the code and models necessary to replicate the results of paper: How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective

Black-Box-Defense This repository contains the code and models necessary to replicate the results of our recent paper: How to Robustify Black-Box ML M

OPTML Group 2 Oct 05, 2022
Code for the paper "Training GANs with Stronger Augmentations via Contrastive Discriminator" (ICLR 2021)

Training GANs with Stronger Augmentations via Contrastive Discriminator (ICLR 2021) This repository contains the code for reproducing the paper: Train

Jongheon Jeong 174 Dec 29, 2022
Unofficial TensorFlow implementation of the Keyword Spotting Transformer model

Keyword Spotting Transformer This is the unofficial TensorFlow implementation of the Keyword Spotting Transformer model. This model is used to train o

Intelligent Machines Limited 8 May 11, 2022
Imagededup - 😎 Finding duplicate images made easy

imagededup is a python package that simplifies the task of finding exact and near duplicates in an image collection.

idealo 4.3k Jan 07, 2023
The LaTeX and Python code for generating the paper, experiments' results and visualizations reported in each paper is available (whenever possible) in the paper's directory

This repository contains the software implementation of most algorithms used or developed in my research. The LaTeX and Python code for generating the

João Fonseca 3 Jan 03, 2023
Complete system for facial identity system

Complete system for facial identity system. Include one-shot model, database operation, features visualization, monitoring

4 May 02, 2022
An Active Automata Learning Library Written in Python

AALpy An Active Automata Learning Library AALpy is a light-weight active automata learning library written in pure Python. You can start learning auto

TU Graz - SAL Dependable Embedded Systems Lab (DES Lab) 78 Dec 30, 2022
CARLA: A Python Library to Benchmark Algorithmic Recourse and Counterfactual Explanation Algorithms

CARLA - Counterfactual And Recourse Library CARLA is a python library to benchmark counterfactual explanation and recourse models. It comes out-of-the

Carla Recourse 200 Dec 28, 2022
Measuring and Improving Consistency in Pretrained Language Models

ParaRel 🤘 This repository contains the code and data for the paper: Measuring and Improving Consistency in Pretrained Language Models as well as the

Yanai Elazar 26 Dec 02, 2022
Template repository for managing machine learning research projects built with PyTorch-Lightning

Tutorial Repository with a minimal example for showing how to deploy training across various compute infrastructure.

Sidd Karamcheti 3 Feb 11, 2022
This is the source code for generating the ASL-Skeleton3D and ASL-Phono datasets. Check out the README.md for more details.

ASL-Skeleton3D and ASL-Phono Datasets Generator The ASL-Skeleton3D contains a representation based on mapping into the three-dimensional space the coo

Cleison Amorim 5 Nov 20, 2022
Auto-Encoding Score Distribution Regression for Action Quality Assessment

DAE-AQA It is an open source program reference to paper Auto-Encoding Score Distribution Regression for Action Quality Assessment. 1.Introduction DAE

13 Nov 16, 2022
An energy estimator for eyeriss-like DNN hardware accelerator

Energy-Estimator-for-Eyeriss-like-Architecture- An energy estimator for eyeriss-like DNN hardware accelerator This is an energy estimator for eyeriss-

HEXIN BAO 2 Mar 26, 2022
Simulation-based performance analysis of server-less Blockchain-enabled Federated Learning

Blockchain-enabled Server-less Federated Learning Repository containing the files used to reproduce the results of the publication "Blockchain-enabled

Francesc Wilhelmi 9 Sep 27, 2022
Honours project, on creating a depth estimation map from two stereo images of featureless regions

image-processing This module generates depth maps for shape-blocked-out images Install If working with anaconda, then from the root directory: conda e

2 Oct 17, 2022