Implementation of the "Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos" paper.

Last update: Dec 29, 2022

Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos

Introduction

Point cloud videos are irregular and unordered along the spatial dimension, with points emerging inconsistently across frames. Point tracking is usually employed to capture the dynamics in such videos. However, because points may flow in and out across frames, computing accurate point trajectories is extremely difficult. Moreover, tracking usually relies on point colors and may therefore fail on colorless point clouds. In this paper, to avoid point tracking, we propose a novel Point 4D Transformer (P4Transformer) network to model raw point cloud videos. Specifically, P4Transformer consists of (i) a point 4D convolution that embeds the spatio-temporal local structures present in a point cloud video and (ii) a transformer that captures appearance and motion information across the entire video by performing self-attention on the embedded local features. In this fashion, related or similar local areas are merged with attention weights rather than by explicit tracking.
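The idea of merging similar local areas through attention rather than tracking can be illustrated with a minimal, pure-Python sketch of single-head scaled dot-product self-attention. This is not the repository's code: the function names, the toy 2-D features, and the use of identity query/key/value projections are assumptions made for illustration only (a real transformer layer would use learned projections and multiple heads).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the features themselves
    (identity projections), which keeps the sketch short.
    Returns the attended features and the attention weight matrix.
    """
    dim = len(features[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, all_weights = [], []
    for query in features:
        # Similarity of this local feature to every local feature in the video.
        scores = [scale * sum(a * b for a, b in zip(query, key)) for key in features]
        weights = softmax(scores)
        # Weighted sum: similar local areas contribute more to the output.
        merged = [sum(w * value[d] for w, value in zip(weights, features))
                  for d in range(dim)]
        outputs.append(merged)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical embedded local features: the first two are similar,
# the third is different (e.g. an unrelated region).
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(features)
```

In this toy input, the first feature attends more strongly to the second (similar) feature than to the third, without any explicit correspondence between points being computed.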

Installation

The code is tested with Red Hat Enterprise Linux Workstation release 7.7 (Maipo), g++ (GCC) 8.3.1, PyTorch (both v1.4.0 and v1.8.1 are supported), CUDA 10.2 and cuDNN v7.6.

Compile the CUDA layers for PointNet++, which are used for furthest point sampling (FPS) and radius neighbour search:

mv modules-pytorch-1.8.1 modules  # or modules-pytorch-1.4.0, to match your installed PyTorch version
cd modules
python setup.py install
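For readers unfamiliar with the two operations that the compiled CUDA layers provide, the following is a slow, CPU-only sketch in plain Python. It is not the repository's implementation or API: the function names `farthest_point_sampling` and `ball_query`, their parameters, and the toy point set are hypothetical and only illustrate what the operations compute.

```python
import math


def farthest_point_sampling(points, num_samples, start_index=0):
    """Greedy FPS: repeatedly pick the point farthest from those already chosen.

    Assumes num_samples <= len(points). Returns indices into `points`.
    """
    selected = [start_index]
    # Distance from every point to its nearest selected point so far.
    min_dist = [math.dist(p, points[start_index]) for p in points]
    while len(selected) < num_samples:
        next_index = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(next_index)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[next_index]))
    return selected


def ball_query(points, center, radius):
    """Radius neighbour search: indices of all points within `radius` of `center`."""
    return [i for i, p in enumerate(points) if math.dist(p, center) <= radius]


# Toy point cloud (hypothetical data).
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (1.0, 0.0, 0.0),
          (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
anchors = farthest_point_sampling(points, 3)
neighbours = ball_query(points, points[anchors[0]], radius=0.2)
```

The sketch runs in O(N·K) time for K samples; the compiled layers exist because this is far too slow in Python for real point cloud videos.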

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{fan21p4transformer,
  author    = {Hehe Fan and
               Yi Yang and
               Mohan Kankanhalli},
  title     = {Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos},
  booktitle = {{IEEE/CVF} Conference on Computer Vision and Pattern Recognition, {CVPR}},
  year      = {2021}
}

Related Repos

PointNet++ PyTorch implementation: https://github.com/facebookresearch/votenet/tree/master/pointnet2
MeteorNet: https://github.com/xingyul/meteornet
3DV: https://github.com/3huo/3DV-Action
PSTNet: https://github.com/hehefan/Point-Spatio-Temporal-Convolution
Transformer: https://github.com/lucidrains/vit-pytorch
PointRNN (TensorFlow implementation): https://github.com/hehefan/PointRNN
PointRNN (PyTorch implementation): https://github.com/hehefan/PointRNN-PyTorch

Owner

Hehe Fan
