Pre-Training 3D Point Cloud Transformers with Masked Point Modeling

Overview

Point-BERT: Pre-Training 3D Point Cloud Transformers with Masked Point Modeling

Created by Xumin Yu*, Lulu Tang*, Yongming Rao*, Tiejun Huang, Jie Zhou, Jiwen Lu

[arXiv] [Project Page] [Models]

This repository contains PyTorch implementation for Point-BERT:Pre-Training 3D Point Cloud Transformers with Masked Point Modeling.

Point-BERT is a new paradigm for learning Transformers to generalize the concept of BERT onto 3D point cloud. Inspired by BERT, we devise a Masked Point Modeling (MPM) task to pre-train point cloud Transformers. Specifically, we first divide a point cloud into several local patches, and a point cloud Tokenizer is devised via a discrete Variational AutoEncoder (dVAE) to generate discrete point tokens containing meaningful local information. Then, we randomly mask some patches of input point clouds and feed them into the backbone Transformer. The pre-training objective is to recover the original point tokens at the masked locations under the supervision of point tokens obtained by the Tokenizer.

intro

Pretrained Models

model dataset config url
dVAE ShapeNet config Tsinghua Cloud / BaiDuYun(code:26d3)
Point-BERT ShapeNet config Tsinghua Cloud / BaiDuYun(code:jvtg)
model dataset Acc. Acc. (vote) config url
Transformer ModelNet 92.67 93.24 config Tsinghua Cloud / BaiDuYun(code:tqow)
Transformer ModelNet 92.91 93.48 config Tsinghua Cloud / BaiDuYun(code:tcin)
Transformer ModelNet 93.19 93.76 config Tsinghua Cloud / BaiDuYun(code:k343)
Transformer ScanObjectNN 88.12 -- config Tsinghua Cloud / BaiDuYun(code:f0km)
Transformer ScanObjectNN 87.43 -- config Tsinghua Cloud / BaiDuYun(code:k3cb)
Transformer ScanObjectNN 83.07 -- config Tsinghua Cloud / BaiDuYun(code:rxsw)

Usage

Requirements

  • PyTorch >= 1.7.0
  • python >= 3.7
  • CUDA >= 9.0
  • GCC >= 4.9
  • torchvision
  • timm
  • open3d
  • tensorboardX
pip install -r requirements.txt

Building Pytorch Extensions for Chamfer Distance, PointNet++ and kNN

NOTE: PyTorch >= 1.7 and GCC >= 4.9 are required.

# Chamfer Distance
bash install.sh
# PointNet++
pip install "git+git://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"
# GPU kNN
pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl

Dataset

We use ShapeNet for the training of dVAE and the pre-training of Point-BERT models. And finetuning the Point-BERT models on ModelNet, ScanObjectNN, ShapeNetPart

The details of used datasets can be found in DATASET.md.

dVAE

To train a dVAE by yourself, simply run:

bash scripts/train.sh <GPU_IDS>\
    --config cfgs/ShapeNet55_models/dvae.yaml \
    --exp_name <name>

Visualize the reconstruction results of a pre-trained dVAE, run: (default path: ./vis)

bash ./scripts/test.sh <GPU_IDS> \
    --ckpts <path>\
    --config cfgs/ShapeNet55_models/dvae.yaml\
    --exp_name <name>

Point-BERT pre-training

To pre-train the Point-BERT models on ShapeNet, simply run: (complete the ckpt in cfgs/Mixup_models/Point-BERT.yaml first )

bash ./scripts/dist_train_BERT.sh <NUM_GPU> <port>\
    --config cfgs/Mixup_models/Point-BERT.yaml \
    --exp_name pointBERT_pretrain 
    [--val_freq 10]

val_freq controls the frequence to evaluate the Transformer on ModelNet40 with LinearSVM.

Fine-tuning on downstream tasks

We finetune our Point-BERT on 4 downstream tasks: Classfication on ModelNet40, Few-shot learning on ModelNet40, Transfer learning on ScanObjectNN and Part segmentation on ShapeNetPart.

ModelNet40

To finetune a pre-trained Point-BERT model on ModelNet40, simply run:

# 1024 points
bash ./scripts/train_BERT.sh <GPU_IDS> \
    --config cfgs/ModelNet_models/PointTransformer.yaml\
    --finetune_model\
    --ckpts <path>\
    --exp_name <name>
# 4096 points
bash ./scripts/train_BERT.sh <GPU_IDS>\
    --config cfgs/ModelNet_models/PointTransformer_4096point.yaml\ 
    --finetune_model\ 
    --ckpts <path>\
    --exp_name <name>
# 8192 points
bash ./scripts/train_BERT.sh <GPU_IDS>\
    --config cfgs/ModelNet_models/PointTransformer_8192point.yaml\ 
    --finetune_model\ 
    --ckpts <path>\
    --exp_name <name>

To evaluate a model finetuned on ModelNet40, simply run:

bash ./scripts/test_BERT.sh <GPU_IDS>\
    --config cfgs/ModelNet_models/PointTransformer.yaml \
    --ckpts <path> \
    --exp_name <name>

Few-shot Learning on ModelNet40

We follow the few-shot setting in the previous work.

First, generate your own few-shot learning split or use the same split as us (see DATASET.md).

# generate few-shot learning split
cd datasets/
python generate_few_shot_data.py
# train and evaluate the Point-BERT
bash ./scripts/train_BERT.sh <GPU_IDS> \
    --config cfgs/Fewshot_models/PointTransformer.yaml \
    --finetune_model \
    --ckpts <path> \
    --exp_name <name> \
    --way <int> \
    --shot <int> \
    --fold <int>

ScanObjectNN

To finetune a pre-trained Point-BERT model on ScanObjectNN, simply run:

bash ./scripts/train_BERT.sh <GPU_IDS>  \
    --config cfgs/ScanObjectNN_models/PointTransformer_hardest.yaml \
    --finetune_model \
    --ckpts <path> \
    --exp_name <name>

To evaluate a model on ScanObjectNN, simply run:

bash ./scripts/test_BERT.sh <GPU_IDS>\
    --config cfgs/ScanObjectNN_models/PointTransformer_hardest.yaml \
    --ckpts <path> \
    --exp_name <name>

Part Segmentation

Code coming soon

Visualization

Masked point clouds reconstruction using our Point-BERT model trained on ShapeNet

results

License

MIT License

Citation

If you find our work useful in your research, please consider citing:

@article{yu2021pointbert,
  title={Point-BERT: Pre-Training 3D Point Cloud Transformers with Masked Point Modeling},
  author={Yu, Xumin and Tang, Lulu and Rao, Yongming and Huang, Tiejun and Zhou, Jie and Lu, Jiwen},
  journal={arXiv preprint},
  year={2021}
}
Owner
Lulu Tang
Lulu Tang
Easy-to-use micro-wrappers for Gym and PettingZoo based RL Environments

SuperSuit introduces a collection of small functions which can wrap reinforcement learning environments to do preprocessing ('microwrappers'). We supp

Farama Foundation 357 Jan 06, 2023
Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation

Unseen Object Clustering: Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation Introduction In this work, we propose a new method

NVIDIA Research Projects 132 Dec 13, 2022
Adversarial Attacks on Probabilistic Autoregressive Forecasting Models.

Attack-Probabilistic-Models This is the source code for Adversarial Attacks on Probabilistic Autoregressive Forecasting Models. This repository contai

SRI Lab, ETH Zurich 25 Sep 14, 2022
A Collection of LiDAR-Camera-Calibration Papers, Toolboxes and Notes

A Collection of LiDAR-Camera-Calibration Papers, Toolboxes and Notes

443 Jan 06, 2023
The Self-Supervised Learner can be used to train a classifier with fewer labeled examples needed using self-supervised learning.

Published by SpaceML • About SpaceML • Quick Colab Example Self-Supervised Learner The Self-Supervised Learner can be used to train a classifier with

SpaceML 92 Nov 30, 2022
Optimizing synthesizer parameters using gradient approximation

Optimizing synthesizer parameters using gradient approximation NASH 2021 Hackathon! These are some experiments I conducted during NASH 2021, the Neura

Jordie Shier 10 Feb 10, 2022
QA-GNN: Question Answering using Language Models and Knowledge Graphs

QA-GNN: Question Answering using Language Models and Knowledge Graphs This repo provides the source code & data of our paper: QA-GNN: Reasoning with L

Michihiro Yasunaga 434 Jan 04, 2023
Pytorch implementation of the popular Improv RNN model originally proposed by the Magenta team.

Pytorch Implementation of Improv RNN Overview This code is a pytorch implementation of the popular Improv RNN model originally implemented by the Mage

Sebastian Murgul 3 Nov 11, 2022
CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields

CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields Paper | Supplementary | Video | Poster If you find our code or paper useful, please

26 Nov 29, 2022
Simple tutorials on Pytorch DDP training

pytorch-distributed-training Distribute Dataparallel (DDP) Training on Pytorch Features Easy to study DDP training You can directly copy this code for

Ren Tianhe 188 Jan 06, 2023
Autoregressive Predictive Coding: An unsupervised autoregressive model for speech representation learning

Autoregressive Predictive Coding This repository contains the official implementation (in PyTorch) of Autoregressive Predictive Coding (APC) proposed

iamyuanchung 173 Dec 18, 2022
基于Flask开发后端、VUE开发前端框架,在WEB端部署YOLOv5目标检测模型

基于Flask开发后端、VUE开发前端框架,在WEB端部署YOLOv5目标检测模型

37 Jan 01, 2023
EvoJAX is a scalable, general purpose, hardware-accelerated neuroevolution toolkit

EvoJAX: Hardware-Accelerated Neuroevolution EvoJAX is a scalable, general purpose, hardware-accelerated neuroevolution toolkit. Built on top of the JA

Google 598 Jan 07, 2023
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning

VisualGPT Our Paper VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning Main Architecture of Our VisualGPT Downloa

Vision CAIR Research Group, KAUST 140 Dec 28, 2022
This repo implements several applications of the proposed generalized Bures-Wasserstein (GBW) geometry on symmetric positive definite matrices.

GBW This repo implements several applications of the proposed generalized Bures-Wasserstein (GBW) geometry on symmetric positive definite matrices. Ap

Andi Han 0 Oct 22, 2021
This repository contains the code for the CVPR 2020 paper "Differentiable Volumetric Rendering: Learning Implicit 3D Representations without 3D Supervision"

Differentiable Volumetric Rendering Paper | Supplementary | Spotlight Video | Blog Entry | Presentation | Interactive Slides | Project Page This repos

697 Jan 06, 2023
RaceBERT -- A transformer based model to predict race and ethnicty from names

RaceBERT -- A transformer based model to predict race and ethnicty from names Installation pip install racebert Using a virtual environment is highly

Prasanna Parasurama 3 Nov 02, 2022
A light and fast one class detection framework for edge devices. We provide face detector, head detector, pedestrian detector, vehicle detector......

A Light and Fast Face Detector for Edge Devices Big News: LFD, which is a big update of LFFD, now is released (2021.03.09). It is strongly recommended

YonghaoHe 1.3k Dec 25, 2022
An official PyTorch Implementation of Boundary-aware Self-supervised Learning for Video Scene Segmentation (BaSSL)

An official PyTorch Implementation of Boundary-aware Self-supervised Learning for Video Scene Segmentation (BaSSL)

Kakao Brain 72 Dec 28, 2022
Studying Python release adoptions by looking at PyPI downloads

Analysis of version adoptions on PyPI We get PyPI download statistics via Google's BigQuery using the pypinfo tool. Usage First you need to get an acc

Julien Palard 9 Nov 04, 2022