Synthetic Humans for Action Recognition, IJCV 2021

Related tags

Deep Learningsurreact
Overview

SURREACT: Synthetic Humans for Action Recognition from Unseen Viewpoints

Gül Varol, Ivan Laptev and Cordelia Schmid, Andrew Zisserman, Synthetic Humans for Action Recognition from Unseen Viewpoints, IJCV 2021.

[Project page] [arXiv]

Contents

1. Synthetic data generation from motion estimation

Please follow the instructions at datageneration/README.md for setting up the Blender environment and downloading required assets.

Once ready, you can generate one clip by running:

# set `BLENDER_PATH` and `CODE_PATH` variables in this script
bash datageneration/exe/run.sh

Note that -t 1 option in run.sh can be removed to run faster on multi cores. We used submit_multi_job*.sh to generate clips for the whole datasets in parallel on the cluster, you can adapt this for your infrastructure. This script also has sample argument-value pairs. Find in utils/argutils.py a list of arguments and their explanations. You can enable/disable outputting certain modalities by setting output_types here.

2. Training action recognition models

Please follow the instructions at training/README.md for setting up the Pytorch environment and preparing the datasets.

Once ready, you can launch training by running:

cd training/
bash exp/surreact_train.sh

3. Download SURREACT datasets

In order to download SURREACT datasets, you need to accept the license terms from SURREAL. The links to license terms and download procedure are available here:

https://www.di.ens.fr/willow/research/surreal/data/

Once you receive the credentials to download the dataset, you will have a personal username and password. Use these to download the synthetic videos from the following links. Note that due to storage complexity, we only provide .mp4 video files and metadata, but not the other modalities such as flow and segmentation. You are encouraged to run the data generation code to obtain those. We provide videos corresponding to NTU and UESTC datasets.

The structure of the folders can be as follows:

surreact/
------- uestc/  # using motion estimates from the UESTC dataset
------------ hmmr/
------------ vibe/
------- ntu/  # using motion estimates from the NTU dataset
------------ hmmr/
------------ vibe/
---------------- train/
---------------- test/
--------------------- <sequenceName>/ # e.g. S001C002P003R002A001 for NTU, a25_d1_p048_c1_color.avi for UESTC
------------------------------ <sequenceName>_v%03d_r%02d.mp4       # RGB - 240x320 resolution video
------------------------------ <sequenceName>_v%03d_r%02d_info.mat  # metadata
# bg         [char]          - name of the background image file
# cam_dist   [1 single]      - camera distance
# cam_height [1 single]      - camera height
# cloth      [chat]          - name of the texture image file
# gender     [1 uint8]       - gender (0: 'female', 1: 'male')
# joints2D   [2x24xT single] - 2D coordinates of 24 SMPL body joints on the image pixels
# joints3D   [3x24xT single] - 3D coordinates of 24 SMPL body joints in world meters
# light      [9 single]      - spherical harmonics lighting coefficients
# pose       [72xT single]   - SMPL parameters (axis-angle)
# sequence   [char]          - <sequenceName>
# shape      [10 single]     - body shape parameters
# source     [char]          - 'ntu' | 'hri40'
# zrot_euler [1 single]      - rotation in Z (euler angle), zero

# *** v%03d stands for the viewpoint in euler angles, we render 8 views: 000, 045, 090, 135, 180, 225, 270, 315.
# *** r%02d stands for the repetition, when the same video is rendered multiple times (this is always 00 for the released files)
# *** T is the number of frames, note that this can be smaller than the real source video length due to motion estimation dropping frames

Citation

If you use this code or data, please cite the following:

@INPROCEEDINGS{varol21_surreact,  
  title     = {Synthetic Humans for Action Recognition from Unseen Viewpoints},  
  author    = {Varol, G{\"u}l and Laptev, Ivan and Schmid, Cordelia and Zisserman, Andrew},  
  booktitle = {IJCV},  
  year      = {2021}  
}

License

Please check the SURREAL license terms before downloading and/or using the SURREACT data and data generation code.

Acknowledgements

The data generation code was extended from gulvarol/surreal. The training code was extended from bearpaw/pytorch-pose. The source of assets include action recognition datasets NTU and UESTC, SMPL and SURREAL projects. The motion estimation was possible thanks to mkocabas/VIBE or akanazawa/human_dynamics (HMMR) repositories. Please cite the respective papers if you use these.

Special thanks to Inria clusters sequoia and rioc.

Owner
Gul Varol
Computer Vision Researcher
Gul Varol
MVP Benchmark for Multi-View Partial Point Cloud Completion and Registration

MVP Benchmark: Multi-View Partial Point Clouds for Completion and Registration [NEWS] 2021-07-12 [NEW 🎉 ] The submission on Codalab starts! 2021-07-1

PL 93 Dec 21, 2022
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR2021)

NExT-QA We reproduce some SOTA VideoQA methods to provide benchmark results for our NExT-QA dataset accepted to CVPR2021 (with 1 'Strong Accept' and 2

Junbin Xiao 50 Nov 24, 2022
Automatic 2D-to-3D Video Conversion with CNNs

Deep3D: Automatic 2D-to-3D Video Conversion with CNNs How To Run To run this code. Please install MXNet following the official document. Deep3D requir

Eric Junyuan Xie 1.2k Dec 30, 2022
STARCH compuets regional extreme storm physical characteristics and moisture balance based on spatiotemporal precipitation data from reanalysis or climate model data.

STARCH (Storm Tracking And Regional CHaracterization) STARCH computes regional extreme storm physical and moisture balance characteristics based on sp

Onosama 7 Oct 20, 2022
A simple Neural Network that predicts the label for a series of handwritten digits

Neural_Network A simple Neural Network that predicts the label for a series of handwritten numbers This program tries to predict the label (1,2,3 etc.

Ty 1 Dec 18, 2021
masscan + nmap + Finger

说明 个人根据使用习惯修改masnmap而来的一个小工具。调用masscan做全端口扫描,再调用nmap做服务识别,最后调用Finger做Web指纹识别。工具使用场景适合风险探测排查、众测等。 使用方法 安装依赖 pip3 install -r requirements.txt -i https:/

Ryan 3 Mar 25, 2022
High-Fidelity Pluralistic Image Completion with Transformers (ICCV 2021)

Image Completion Transformer (ICT) Project Page | Paper (ArXiv) | Pre-trained Models | Supplemental Material This repository is the official pytorch i

Ziyu Wan 243 Jan 03, 2023
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome

bottom-up-attention This code implements a bottom-up attention model, based on multi-gpu training of Faster R-CNN with ResNet-101, using object and at

Peter Anderson 1.3k Jan 09, 2023
Auditing Black-Box Prediction Models for Data Minimization Compliance

Data-Minimization-Auditor An auditing tool for model-instability based data minimization that is introduced in "Auditing Black-Box Prediction Models f

Bashir Rastegarpanah 2 Mar 24, 2022
Tensorflow 2.x implementation of Panoramic BlitzNet for object detection and semantic segmentation on indoor panoramic images.

Deep neural network for object detection and semantic segmentation on indoor panoramic images. The implementation is based on the papers:

Alejandro de Nova Guerrero 9 Nov 24, 2022
Cross-view Transformers for real-time Map-view Semantic Segmentation (CVPR 2022 Oral)

Cross View Transformers This repository contains the source code and data for our paper: Cross-view Transformers for real-time Map-view Semantic Segme

Brady Zhou 363 Dec 25, 2022
A framework for Quantification written in Python

QuaPy QuaPy is an open source framework for quantification (a.k.a. supervised prevalence estimation, or learning to quantify) written in Python. QuaPy

41 Dec 14, 2022
Pytorch implementation of the popular Improv RNN model originally proposed by the Magenta team.

Pytorch Implementation of Improv RNN Overview This code is a pytorch implementation of the popular Improv RNN model originally implemented by the Mage

Sebastian Murgul 3 Nov 11, 2022
Aligning Latent and Image Spaces to Connect the Unconnectable

About This repo contains the official implementation of the Aligning Latent and Image Spaces to Connect the Unconnectable paper. It is a GAN model whi

Ivan Skorokhodov 203 Jan 03, 2023
An imperfect information game is a type of game with asymmetric information

DecisionHoldem An imperfect information game is a type of game with asymmetric information. Compared with perfect information game, imperfect informat

Decision AI 25 Dec 23, 2022
Official implementation for "Low-light Image Enhancement via Breaking Down the Darkness"

Low-light Image Enhancement via Breaking Down the Darkness by Qiming Hu, Xiaojie Guo. 1. Dependencies Python3 PyTorch=1.0 OpenCV-Python, TensorboardX

Qiming Hu 30 Jan 01, 2023
Fuzzing the Kernel Using Unicornafl and AFL++

Unicorefuzz Fuzzing the Kernel using UnicornAFL and AFL++. For details, skim through the WOOT paper or watch this talk at CCCamp19. Is it any good? ye

Security in Telecommunications 283 Dec 26, 2022
A project that uses optical flow and machine learning to detect aimhacking in video clips.

waldo-anticheat A project that aims to use optical flow and machine learning to visually detect cheating or hacking in video clips from fps games. Che

waldo.vision 542 Dec 03, 2022
Large-Scale Pre-training for Person Re-identification with Noisy Labels (LUPerson-NL)

LUPerson-NL Large-Scale Pre-training for Person Re-identification with Noisy Labels (LUPerson-NL) The repository is for our CVPR2022 paper Large-Scale

43 Dec 26, 2022
This repository implements and evaluates convolutional networks on the Möbius strip as toy model instantiations of Coordinate Independent Convolutional Networks.

Orientation independent Möbius CNNs This repository implements and evaluates convolutional networks on the Möbius strip as toy model instantiations of

Maurice Weiler 59 Dec 09, 2022