ASFormer: Transformer for Action Segmentation

This repo provides training and inference code for the BMVC 2021 paper: ASFormer: Transformer for Action Segmentation.

Environment

PyTorch == 1.1.0, torchvision == 0.3.0, Python == 3.6, CUDA == 10.1
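A quick way to see whether a local setup matches these pins is to compare the installed versions against them. The snippet below is only a sketch: the helper names (find_mismatches, collect_versions) are hypothetical and not part of this repo, and treating a pin as a version prefix is an assumption made for illustration.

```python
import importlib
import sys

# Version pins copied from the list above.
# Matching by prefix (e.g. "1.1.0" also accepts "1.1.0.post2") is an assumption of this sketch.
EXPECTED = {"python": "3.6", "torch": "1.1.0", "torchvision": "0.3.0", "cuda": "10.1"}


def find_mismatches(expected, actual):
    """Return {name: (expected, found)} for every version that does not start with its pin."""
    mismatches = {}
    for name, want in expected.items():
        have = actual.get(name)
        if have is None or not have.startswith(want):
            mismatches[name] = (want, have)
    return mismatches


def collect_versions():
    """Gather installed versions; anything that cannot be imported is reported as None."""
    actual = {"python": "{}.{}".format(*sys.version_info[:2])}
    try:
        torch = importlib.import_module("torch")
        actual["torch"] = torch.__version__
        actual["cuda"] = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        actual["torch"] = None
        actual["cuda"] = None
    try:
        actual["torchvision"] = importlib.import_module("torchvision").__version__
    except ImportError:
        actual["torchvision"] = None
    return actual


if __name__ == "__main__":
    for name, (want, have) in find_mismatches(EXPECTED, collect_versions()).items():
        print("{}: expected {}, found {}".format(name, want, have))
```

The script prints one line per mismatch and nothing when every pin is satisfied. Newer versions may still work, but they are not the ones listed above.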

Reproduce our results

1. Download the dataset data.zip from (https://mega.nz/#!O6wXlSTS!wcEoDT4Ctq5HRq_hV-aWeVF1_JB3cacQBQqOLjCIbc8) or (https://zenodo.org/record/3625992#.Xiv9jGhKhPY).
2. Unzip data.zip into the current folder. The ./data folder contains three datasets: ./data/breakfast, ./data/50salads and ./data/gtea.
3. Download the pre-trained models from (https://pan.baidu.com/s/1zf-d-7eYqK-IxroBKTxDfg). Pre-trained models are provided for all three datasets: ./models/50salads, ./models/breakfast and ./models/gtea.
4. Run python main.py --action=predict --dataset=50salads/gtea/breakfast --split=1/2/3/4/5 to generate predictions for each split (pick one dataset and one split per run).
5. Run python eval.py --dataset=50salads/gtea/breakfast --split=0/1/2/3/4/5 to evaluate the performance. **NOTE**: split=0 evaluates the average results over all splits, so run it only after you have generated predictions for every split.
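Steps 4 and 5 have an ordering constraint: every split must be predicted before the split=0 evaluation. The sketch below builds the commands in that order using only the flags shown above. The helper names (build_commands, run_all) are hypothetical, and the split list passed in is an assumption; use the splits that actually exist for your dataset.

```python
import subprocess


def build_commands(dataset, splits):
    """Return argv lists: one predict run per split, then one averaged evaluation."""
    commands = [
        ["python", "main.py", "--action=predict",
         "--dataset={}".format(dataset), "--split={}".format(split)]
        for split in splits
    ]
    # split=0 averages over all splits, so it must come after every prediction.
    commands.append(["python", "eval.py",
                     "--dataset={}".format(dataset), "--split=0"])
    return commands


def run_all(dataset, splits):
    """Run the commands in order and stop at the first failure."""
    for argv in build_commands(dataset, splits):
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    # The split list here is an assumption made for illustration.
    for argv in build_commands("gtea", [1, 2, 3, 4]):
        print(" ".join(argv))
```

Calling run_all from the repo root executes the same sequence; check=True aborts the loop if any prediction fails, so the averaged evaluation never runs on incomplete results.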

Train your own model

You can also retrain the model yourself with the following command.

python main.py --action=train --dataset=50salads/gtea/breakfast --split=1/2/3/4/5

The training process is very stable in our experiments. It converges very quickly and is not sensitive to the number of training epochs.

Demo for using ASFormer as your backbone

In our paper, we replace the original TCN-based backbone model MS-TCN in ASRF with our ASFormer. The new model achieves higher results on the 50salads dataset than the original ASRF. Code is here.

If you find our repo useful, please give us a star and cite:

@inproceedings{chinayi_ASformer,  
	author={Fangqiu Yi and Hongyu Wen and Tingting Jiang}, 
	booktitle={The British Machine Vision Conference (BMVC)},   
	title={ASFormer: Transformer for Action Segmentation},
	year={2021},  
}

Feel free to open an issue if you run into trouble with our code.
