Synthetic Humans for Action Recognition, IJCV 2021


SURREACT: Synthetic Humans for Action Recognition from Unseen Viewpoints

Gül Varol, Ivan Laptev, Cordelia Schmid, and Andrew Zisserman, Synthetic Humans for Action Recognition from Unseen Viewpoints, IJCV 2021.

[Project page] [arXiv]

Contents

1. Synthetic data generation from motion estimation

Please follow the instructions at datageneration/README.md for setting up the Blender environment and downloading required assets.

Once ready, you can generate one clip by running:

# set `BLENDER_PATH` and `CODE_PATH` variables in this script
bash datageneration/exe/run.sh

Note that the -t 1 option in run.sh can be removed to run faster on multiple cores. We used submit_multi_job*.sh to generate clips for the whole datasets in parallel on the cluster; you can adapt this for your infrastructure. This script also has sample argument-value pairs. See utils/argutils.py for a list of arguments and their explanations. You can enable or disable the output of certain modalities by setting output_types.
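
If no cluster scheduler is available, one way to adapt the parallel generation to a single machine is to launch several generation processes from a small Python driver. The sketch below is only an illustration under stated assumptions: run_jobs is a hypothetical helper, not part of this repository, and each command in the list stands for whatever invocation you would otherwise submit as one cluster job (the per-job arguments are not shown here; see utils/argutils.py and submit_multi_job*.sh for those).

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def run_jobs(commands: Sequence[Sequence[str]], max_workers: int = 4) -> List[int]:
    """Run each command as its own process, at most `max_workers` at a time.

    Hypothetical helper (not part of the SURREACT code). Returns the exit
    codes in the same order as `commands`, so failed jobs can be re-run.
    """

    def _run(cmd: Sequence[str]) -> int:
        # check=False: a failing job should not abort the remaining ones.
        return subprocess.run(list(cmd), check=False).returncode

    # Threads are sufficient here because the heavy work happens in the
    # child processes, not in this Python driver.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(_run, commands))
```

For example, run_jobs([["bash", "datageneration/exe/run.sh"]], max_workers=1) would run the single-clip script once; a real job list would contain one command per clip, each with its own arguments.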

2. Training action recognition models

Please follow the instructions at training/README.md for setting up the PyTorch environment and preparing the datasets.

Once ready, you can launch training by running:

cd training/
bash exp/surreact_train.sh

3. Download SURREACT datasets

In order to download SURREACT datasets, you need to accept the license terms from SURREAL. The links to license terms and download procedure are available here:

https://www.di.ens.fr/willow/research/surreal/data/

Once you receive the credentials to download the dataset, you will have a personal username and password. Use these to download the synthetic videos from the following links. Note that due to storage constraints, we only provide .mp4 video files and metadata, but not the other modalities such as flow and segmentation. You are encouraged to run the data generation code to obtain those. We provide videos corresponding to the NTU and UESTC datasets.

The folders are structured as follows:

surreact/
------- uestc/  # using motion estimates from the UESTC dataset
------------ hmmr/
------------ vibe/
------- ntu/  # using motion estimates from the NTU dataset
------------ hmmr/
------------ vibe/
---------------- train/
---------------- test/
--------------------- <sequenceName>/ # e.g. S001C002P003R002A001 for NTU, a25_d1_p048_c1_color.avi for UESTC
------------------------------ <sequenceName>_v%03d_r%02d.mp4       # RGB - 240x320 resolution video
------------------------------ <sequenceName>_v%03d_r%02d_info.mat  # metadata
# bg         [char]          - name of the background image file
# cam_dist   [1 single]      - camera distance
# cam_height [1 single]      - camera height
# cloth      [char]          - name of the texture image file
# gender     [1 uint8]       - gender (0: 'female', 1: 'male')
# joints2D   [2x24xT single] - 2D coordinates of 24 SMPL body joints on the image pixels
# joints3D   [3x24xT single] - 3D coordinates of 24 SMPL body joints in world meters
# light      [9 single]      - spherical harmonics lighting coefficients
# pose       [72xT single]   - SMPL parameters (axis-angle)
# sequence   [char]          - <sequenceName>
# shape      [10 single]     - body shape parameters
# source     [char]          - 'ntu' | 'hri40'
# zrot_euler [1 single]      - rotation in Z (euler angle), zero

# *** v%03d stands for the viewpoint in Euler angles (degrees); we render 8 views: 000, 045, 090, 135, 180, 225, 270, 315.
# *** r%02d stands for the repetition, used when the same video is rendered multiple times (this is always 00 for the released files).
# *** T is the number of frames; note that this can be smaller than the real source video length because motion estimation can drop frames.
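
As an illustration of the naming scheme above, here is a minimal Python sketch that splits a released file name into its sequence name, viewpoint and repetition, and lists the 8 video/metadata pairs expected for one sequence. The helper names (parse_clip_name, view_paths, load_info) are hypothetical and not part of the released code. load_info additionally assumes that SciPy is installed and that the _info.mat files are stored in a format scipy.io.loadmat can read; neither is stated in this README.

```python
import re
from pathlib import Path
from typing import List, NamedTuple, Tuple

# The 8 rendered viewpoints listed above, in degrees.
VIEWS = (0, 45, 90, 135, 180, 225, 270, 315)

# Matches <sequenceName>_v%03d_r%02d.mp4 and <sequenceName>_v%03d_r%02d_info.mat
_CLIP_RE = re.compile(
    r"^(?P<seq>.+)_v(?P<view>\d{3})_r(?P<rep>\d{2})(?P<suffix>\.mp4|_info\.mat)$"
)


class ClipName(NamedTuple):
    sequence: str      # <sequenceName>, e.g. S001C002P003R002A001
    view: int          # viewpoint in degrees (v%03d)
    repetition: int    # r%02d, always 0 for the released files
    is_metadata: bool  # True for *_info.mat, False for *.mp4


def parse_clip_name(filename: str) -> ClipName:
    """Split a video or metadata file name into its components."""
    match = _CLIP_RE.match(Path(filename).name)
    if match is None:
        raise ValueError(f"not a SURREACT clip file name: {filename!r}")
    return ClipName(
        sequence=match["seq"],
        view=int(match["view"]),
        repetition=int(match["rep"]),
        is_metadata=match["suffix"] == "_info.mat",
    )


def view_paths(sequence_dir: str, sequence: str, repetition: int = 0) -> List[Tuple[Path, Path]]:
    """Return the (video, metadata) paths expected for the 8 views of one sequence."""
    folder = Path(sequence_dir)
    pairs = []
    for view in VIEWS:
        stem = f"{sequence}_v{view:03d}_r{repetition:02d}"
        pairs.append((folder / f"{stem}.mp4", folder / f"{stem}_info.mat"))
    return pairs


def load_info(mat_path: str) -> dict:
    """Load one metadata file as a dict keyed by the field names listed above.

    Assumption: the file is readable by scipy.io.loadmat (SciPy required).
    """
    from scipy.io import loadmat  # imported lazily so parsing works without SciPy

    return loadmat(str(mat_path))
```

For instance, parse_clip_name("S001C002P003R002A001_v045_r00.mp4") yields the sequence name S001C002P003R002A001 with view 45 and repetition 0.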

Citation

If you use this code or data, please cite the following:

@ARTICLE{varol21_surreact,
  title     = {Synthetic Humans for Action Recognition from Unseen Viewpoints},
  author    = {Varol, G{\"u}l and Laptev, Ivan and Schmid, Cordelia and Zisserman, Andrew},
  journal   = {IJCV},
  year      = {2021}
}

License

Please check the SURREAL license terms before downloading and/or using the SURREACT data and data generation code.

Acknowledgements

The data generation code was extended from gulvarol/surreal. The training code was extended from bearpaw/pytorch-pose. The sources of assets include the action recognition datasets NTU and UESTC, and the SMPL and SURREAL projects. The motion estimation was possible thanks to the mkocabas/VIBE and akanazawa/human_dynamics (HMMR) repositories. Please cite the respective papers if you use these.

Special thanks to Inria clusters sequoia and rioc.
