STAR

Official implementation of Sparse Transformer-based Action Recognition

Dataset

download NTU RGB+D 60 action recognition of 2D/3D skeleton from http://rose1.ntu.edu.sg/datasets/actionRecognition.asp

or use google drive

NTU60 NTU120

uzip data as the following file structure: $(project_folder)/raw/.\*skeleton or $(project_folder)/dataset/raw/.\*skeleton (create "raw" folder under $(project_folder) or $(project_folder)/dataset then put raw skeleton files under "raw" folder)

run the code below to generate dataset:

python datagen.py

Training

git fetch and checkout to "distributed" branch

python train_dist.py -#distributed training

Configuration

parser.set_defaults(gpu=True,
                    batch_size=128,
                    dataset_name='NTU',
                    dataset_root=osp.join(os.getcwd()),  # or dataset_root=osp.join(os.getcwd(), 'dataset')
                    load_model=False,
                    in_channels=9,
                    num_enc_layers=5,
                    num_conv_layers=2,
                    weight_decay=4e-5,
                    drop_rate=[0.4, 0.4, 0.4, 0.4],  # linear_attention, sparse_attention, add_norm, ffn
                    hid_channels=64,
                    out_channels=64,
                    heads=8,
                    data_parallel=False,
                    cross_k=5,
                    mlp_head_hidden=128)

parser.set_defaults(gpu=True,
                    batch_size=128,
                    dataset_name='NTU',
                    dataset_root=osp.join(os.getcwd()),
                    load_model=False,
                    in_channels=9,
                    num_enc_layers=5,
                    num_conv_layers=2,
                    weight_decay=4e-5,
                    drop_rate=[0.4, 0.4, 0.4, 0.4],  # linear_attention, sparse_attention, add_norm, ffn
                    hid_channels=128,
                    out_channels=128,
                    heads=8,
                    data_parallel=False,
                    cross_k=5,
                    mlp_head_hidden=128)

Official implementation of Sparse Transformer-based Action Recognition

Related tags

Overview

STAR

Dataset

Training

Configuration

Owner

Chonghan_Lee

Implementation of EMNLP 2017 Paper "Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog" using PyTorch and ParlAI

This is the official PyTorch implementation for "Mesa: A Memory-saving Training Framework for Transformers".

Pairwise model for commonlit competition

A coin flip game in which you can put the amount of money below or equal to 1000 and then choose heads or tail

Official repository for HOTR: End-to-End Human-Object Interaction Detection with Transformers (CVPR'21, Oral Presentation)

Utilizes Pose Estimation to offer sprinters cues based on an image of their running form.

Galaxy images labelled by morphology (shape). Aimed at ML development and teaching

Rethinking Transformer-based Set Prediction for Object Detection

Official implementation for the paper "Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection"

SynNet - synthetic tree generation using neural networks

XViT - Space-time Mixing Attention for Video Transformer

Spatiotemporal resampling methods for mlr3

PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

A3C LSTM Atari with Pytorch plus A3G design

LQM - Improving Object Detection by Estimating Bounding Box Quality Accurately

Code for the AAAI-2022 paper: Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification

JAX bindings to the Flatiron Institute Non-uniform Fast Fourier Transform (FINUFFT) library

SciPy fixes and extensions

Learning Skeletal Articulations with Neural Blend Shapes

An ever-growing playground of notebooks showcasing CLIP's impressive zero-shot capabilities.