Mix3D: Out-of-Context Data Augmentation for 3D Scenes (3DV 2021)

Last update: Dec 26, 2022

Overview

Alexey Nekrasov*, Jonas Schult*, Or Litany, Bastian Leibe, Francis Engelmann

Mix3D is a data augmentation technique for 3D segmentation methods that improves generalization.

[Project Webpage] [arXiv]

News

12. October 2021: Code released.
6. October 2021: Mix3D accepted for oral presentation at 3DV 2021. Paper on [arXiv].
30. July 2021: Mix3D ranks 1st on the ScanNet semantic labeling benchmark.

Running the code

This repository contains the code for the analysis experiments of Section 4.2 (Motivation and Analysis Experiments) of the paper. For the ScanNet benchmark and Table 1 (main paper), we use the original SpatioTemporalSegmentation-Scannet code. To add Mix3D to the original MinkowskiNet codebase, we provide the patch file SpatioTemporalSegmentation-ScanNet.patch. Check the supplementary material for more details.

Code structure

├── mix3d
│   ├── __init__.py
│   ├── __main__.py     <- the main file
│   ├── conf            <- hydra configuration files
│   ├── datasets
│   │   ├── outdoor_semseg.py       <- outdoor dataset
│   │   ├── preprocessing       <- folder with preprocessing scripts
│   │   ├── semseg.py       <- indoor dataset
│   │   └── utils.py        <- code for mixing point clouds
│   ├── logger
│   ├── models      <- MinkowskiNet models
│   ├── trainer
│   │   ├── __init__.py
│   │   └── trainer.py      <- train loop
│   └── utils
├── data
│   ├── processed       <- folder for preprocessed datasets
│   └── raw     <- folder for raw datasets
├── scripts
│   ├── experiments
│   │   └── 1000_scene_merging.bash
│   ├── init.bash
│   ├── local_run.bash
│   ├── preprocess_matterport.bash
│   ├── preprocess_rio.bash
│   ├── preprocess_scannet.bash
│   └── preprocess_semantic_kitti.bash
├── docs
├── dvc.lock
├── dvc.yaml        <- dvc file to reproduce the data
├── poetry.lock
├── pyproject.toml      <- project dependencies
├── README.md
├── saved       <- folder that stores models and logs
└── SpatioTemporalSegmentation-ScanNet.patch        <- patch file for original repo

Dependencies

The main dependencies of the project are the following:

python: 3.7
cuda: 10.1

All other dependencies are managed with the poetry dependency management package and can be installed with the command:

poetry install

Check scripts/init.bash for more details.

Data preprocessing

After the dependencies are installed, run the preprocessing scripts. They convert the scannet, matterport, rio, and semantic_kitti datasets to a single format. By default, the scripts expect to find the datasets in the data/raw/ folder. Check scripts/preprocess_*.bash for more details.

dvc repro scannet # matterport, rio, semantic_kitti

This command will run the preprocessing for scannet and will save the result using the dvc data versioning system.
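
Before running the preprocessing, it can help to confirm that the raw data is where the scripts expect it. The following is a minimal sketch, not part of the repository: the helper name `missing_raw_datasets` is hypothetical, and it assumes each dataset lives in a subfolder of data/raw/ named after its dvc stage (scannet, matterport, rio, semantic_kitti). The actual layout expected by scripts/preprocess_*.bash may differ.

```python
from pathlib import Path

# Assumption: subfolder names mirror the dvc stage names listed above.
DATASETS = ("scannet", "matterport", "rio", "semantic_kitti")


def missing_raw_datasets(raw_root, datasets=DATASETS):
    """Return the dataset names that have no non-empty folder under raw_root."""
    root = Path(raw_root)
    missing = []
    for name in datasets:
        folder = root / name
        # A dataset counts as missing if its folder is absent or empty.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_raw_datasets("data/raw"):
        print(f"missing or empty: data/raw/{name}")
```

Only the datasets you intend to preprocess need to be present; pass a shorter tuple as `datasets` to check a subset.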

Training and testing

Train MinkowskiNet on the scannet dataset without Mix3D with a voxel size of 5 cm:

poetry run train

Train MinkowskiNet on the scannet dataset with Mix3D with a voxel size of 5 cm:

poetry run train data/collation_functions=voxelize_collate_merge
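
The voxelize_collate_merge override switches the collation step to one that mixes scenes. As a rough illustration of the idea, the sketch below assumes that mixing amounts to centering two point clouds at the origin and concatenating their points and labels. The function name `mix_scenes` and the NumPy array layout are assumptions made for this example; the repository's own implementation lives in mix3d/datasets/utils.py and may differ, for instance in how augmentations and voxelization are applied.

```python
import numpy as np


def mix_scenes(points_a, labels_a, points_b, labels_b):
    """Mix two point clouds into one out-of-context training sample.

    points_*: (N, 3) float arrays of xyz coordinates.
    labels_*: (N,) integer arrays of per-point semantic labels.
    """
    # Shift each scene so its centroid sits at the origin; this makes
    # the two scenes overlap instead of lying far apart.
    centered_a = points_a - points_a.mean(axis=0, keepdims=True)
    centered_b = points_b - points_b.mean(axis=0, keepdims=True)

    # The mixed sample is the union of both scenes' points and labels.
    mixed_points = np.concatenate([centered_a, centered_b], axis=0)
    mixed_labels = np.concatenate([labels_a, labels_b], axis=0)
    return mixed_points, mixed_labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    scene_a = rng.uniform(0.0, 4.0, size=(1000, 3))
    scene_b = rng.uniform(10.0, 12.0, size=(500, 3))
    points, labels = mix_scenes(
        scene_a, np.zeros(1000, dtype=np.int64),
        scene_b, np.ones(500, dtype=np.int64),
    )
    print(points.shape, labels.shape)
```

Because the labels are carried over unchanged, the segmentation loss can be computed on the mixed sample exactly as on a single scene.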

BibTeX

@inproceedings{Nekrasov213DV,
  title     = {{Mix3D: Out-of-Context Data Augmentation for 3D Scenes}},
  author    = {Nekrasov, Alexey and Schult, Jonas and Litany, Or and Leibe, Bastian and Engelmann, Francis},
  booktitle = {{International Conference on 3D Vision (3DV)}},
  year      = {2021}
}
