PyTorch implementation of the R2Plus1D convolution based ResNet architecture described in the paper "A Closer Look at Spatiotemporal Convolutions for Action Recognition"

Last update: Dec 16, 2022

Overview

R2Plus1D-PyTorch

Link to original: paper and code

NOTE: This repository has been archived, although forks and other work that build on it remain welcome.

Requirements

R2Plus1D-PyTorch has the following requirements:

PyTorch 0.4 and dependencies
OpenCV (tested on 3.4.0.12)
tqdm (for progress bars)

About this repository

This repository consists of four Python files:

module.py - Implements the factored R2Plus1D convolution on which the rest of the code is built. It is designed as a replacement for nn.Conv3d where a factored convolution is appropriate.
network.py - Uses module.py to build the residual network described in the paper.
dataset.py - Implements a PyTorch dataset that loads videos with their labels from a given directory.
trainer.py - A lightly modified version of the training script from the PyTorch tutorials. It also supports saving and restoring the model.
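
To illustrate the idea behind module.py, here is a minimal sketch of a factored (2+1)D convolution: a 1 x d x d spatial convolution followed by a t x 1 x 1 temporal convolution, with the intermediate channel count chosen by the formula in the paper so that the parameter count roughly matches a full 3D convolution. The class name, argument names, and defaults below are assumptions made for illustration and are not taken from this repository's code.

```python
import torch
import torch.nn as nn


class FactoredConv3d(nn.Module):
    """Hypothetical (2+1)D convolution block, not the repository's module.py."""

    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, padding=1):
        super().__init__()
        t = d = kernel_size  # assumption: same extent in time and space
        # Intermediate width M = floor(t*d^2*N_in*N_out / (d^2*N_in + t*N_out)),
        # which keeps the parameter count close to a full t x d x d convolution.
        mid_channels = (t * d * d * in_channels * out_channels) // (
            d * d * in_channels + t * out_channels
        )
        # Spatial step: a 1 x d x d kernel applied to every frame independently.
        self.spatial = nn.Conv3d(
            in_channels, mid_channels,
            kernel_size=(1, d, d),
            stride=(1, stride, stride),
            padding=(0, padding, padding),
            bias=False,
        )
        self.bn = nn.BatchNorm3d(mid_channels)
        self.relu = nn.ReLU(inplace=True)
        # Temporal step: a t x 1 x 1 kernel that mixes information across frames.
        self.temporal = nn.Conv3d(
            mid_channels, out_channels,
            kernel_size=(t, 1, 1),
            stride=(stride, 1, 1),
            padding=(padding, 0, 0),
            bias=False,
        )

    def forward(self, x):
        # x has shape (batch, channels, frames, height, width)
        return self.temporal(self.relu(self.bn(self.spatial(x))))
```

With the default kernel size, stride, and padding, the block preserves the temporal and spatial dimensions of its input, just as nn.Conv3d(in_channels, out_channels, 3, padding=1) would, so it can be swapped in wherever that layer is used.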

Training on Kinetics-400/600

This repository does not include a crawler or downloader for the Kinetics-400/600 dataset; however, one can be found here. It is strongly recommended to downsample the videos before training (rather than on the fly), using a tool such as ffmpeg. If using the crawler, this can be done by adding "-vf", "scale=172:128" to the ffmpeg command list in the download clip function.
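
For videos that are already on disk, the same downsampling can be done in a separate pass. The sketch below only builds the ffmpeg argument list with the "-vf", "scale=172:128" filter mentioned above; the function names and file paths are hypothetical, and it assumes ffmpeg is installed and on the PATH.

```python
import subprocess


def build_downsample_command(src, dst, width=172, height=128):
    """Return an ffmpeg argument list that rescales src and writes it to dst."""
    return [
        "ffmpeg",
        "-y",                               # overwrite dst if it already exists
        "-i", str(src),                     # input video
        "-vf", f"scale={width}:{height}",   # video filter: resize every frame
        str(dst),
    ]


def downsample(src, dst):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_downsample_command(src, dst), check=True)
```

Calling downsample("clip.mp4", "clip_small.mp4") would write a 172x128 copy of the (hypothetical) file clip.mp4.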

Training in general

This repository is designed so that the ResNet can be trained on any dataset of videos, using the VideoDataloader class from dataset.py. It expects the videos to be arranged as: root directory -> [train/val] folders -> [class_label] folders (one for each class) -> videos (the files themselves).
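
As a rough illustration of the expected layout, the sketch below walks a directory arranged as described above and returns (video path, label index) pairs. It is not the code from dataset.py; the function name and the choice to assign label indices in alphabetical order of the class folders are assumptions.

```python
from pathlib import Path


def list_videos(root, split="train"):
    """Collect (path, label_index) pairs from root/<split>/<class_label>/<video>."""
    split_dir = Path(root) / split
    # One sub-folder per class; sort so label indices are stable across runs.
    class_names = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
    class_to_idx = {name: i for i, name in enumerate(class_names)}

    samples = []
    for name in class_names:
        for video in sorted((split_dir / name).iterdir()):
            if video.is_file():
                samples.append((video, class_to_idx[name]))
    return samples, class_to_idx
```

A dataset class could call list_videos(root, "train") and list_videos(root, "val") once at construction time and then decode each video lazily when it is requested.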

Forks and fixes of this repo are highly welcome!

Owner

Irhum Shafkat
