BMVC 2021: This is the github repository for "Few Shot Temporal Action Localization using Query Adaptive Transformers" accepted in British Machine Vision Conference (BMVC) 2021, Virtual

Overview

FS-QAT: Few Shot Temporal Action Localization using Query Adaptive Transformer

Accepted as Poster in BMVC 2021

This is an official implementation in PyTorch of FS-QAT. Our paper is available at Arxiv

Updates

  • (October, 2021) We released FS-QAT training and inference code for ActivityNet dataset.
  • (October, 2021) FS-QAT is accepted in BMVC2021.

Abstract

Existing temporal action localization (TAL) works rely on a large number of training videos with exhaustive segment-level annotation, preventing them from scaling to new classes. As a solution to this problem, few-shot TAL (FS-TAL) aims to adapt a model to a new class represented by as few as a single video. Exiting FS-TAL methods assume trimmed training videos for new classes. However, this setting is not only unnatural – actions are typically captured in untrimmed videos, but also ignores background video segments containing vital contextual cues for foreground action segmentation. In this work, we first propose a new FS-TAL setting by proposing to use untrimmed training videos. Further, a novel FS-TAL model is proposed which maximizes the knowledge transfer from training classes whilst enabling the model to be dynamically adapted to both the new class and each video of that class simultaneously. This is achieved by introducing a query adaptive Transformer in the model. Extensive experiments on two action localization benchmarks demonstrate that our method can outperform all the stateof-the-art alternatives significantly in both single-domain and cross-domain scenarios.

Summary

  • First Few-Shot TAL setting to use Untrimmed Videos for both Support and Query
  • Unified Model can accomodate both Untrimmed and Trimmed Video without design change
  • Instead of meta-learning the entire network, only Transformer is meta-learned hence faster adaptation.
  • Intra-Class Variance is handled using this adaptation
  • Promising performance in Cross-Domain/Dataset settings.

Qualitative Performance

Training and Evaluation

Appologize for the messed up Code

Refactoring will be done soon ( delay due to CVPR workload )

To Train

python gtad_train_fs.py 

To Test

sh test_fs.sh

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@misc{nag2021fewshot,
      title={Few-Shot Temporal Action Localization with Query Adaptive Transformer}, 
      author={Sauradip Nag and Xiatian Zhu and Tao Xiang},
      year={2021},
      eprint={2110.10552},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Owner
Sauradip Nag
PhD Student, ReID Lab, CVSSP, University of Surrey , United Kingdom | Ex- IIT Madras | Ex - ISI, Kolkata. Website : https://sauradip.github.io
Sauradip Nag
Code for the paper "Jukebox: A Generative Model for Music"

Status: Archive (code is provided as-is, no updates expected) Jukebox Code for "Jukebox: A Generative Model for Music" Paper Blog Explorer Colab Insta

OpenAI 6k Jan 02, 2023
A collection of resources and papers on Diffusion Models, a darkhorse in the field of Generative Models

This repository contains a collection of resources and papers on Diffusion Models and Score-based Models. If there are any missing valuable resources

5.1k Jan 08, 2023
Fashion Recommender System With Python

Fashion-Recommender-System Thr growing e-commerce industry presents us with a la

Omkar Gawade 2 Feb 02, 2022
Duke Machine Learning Winter School: Computer Vision 2022

mlwscv2002 Welcome to the Duke Machine Learning Winter School: Computer Vision 2022! The MLWS-CV includes 3 hands-on training sessions on implementing

Duke + Data Science (+DS) 9 May 25, 2022
Sample Prior Guided Robust Model Learning to Suppress Noisy Labels

PGDF This repo is the official implementation of our paper "Sample Prior Guided Robust Model Learning to Suppress Noisy Labels ". Citation If you use

CVSM Group - email: <a href=[email protected]"> 22 Dec 23, 2022
Code for the paper Task Agnostic Morphology Evolution.

Task-Agnostic Morphology Optimization This repository contains code for the paper Task-Agnostic Morphology Evolution by Donald (Joey) Hejna, Pieter Ab

Joey Hejna 18 Aug 04, 2022
Make a surveillance camera from your raspberry pi!

rpi-surveillance Make a surveillance camera from your Raspberry Pi 4! The surveillance is built as following: the camera records 10 seconds video and

Vladyslav 62 Feb 03, 2022
CCNet: Criss-Cross Attention for Semantic Segmentation (TPAMI 2020 & ICCV 2019).

CCNet: Criss-Cross Attention for Semantic Segmentation Paper Links: Our most recent TPAMI version with improvements and extensions (Earlier ICCV versi

Zilong Huang 1.3k Dec 27, 2022
This is project is the implementation of the DeepShift: Towards Multiplication-Less Neural Networks paper

DeepShift This is project is the implementation of the DeepShift: Towards Multiplication-Less Neural Networks paper, that aims to replace multiplicati

Mostafa Elhoushi 88 Dec 23, 2022
Leibniz is a python package which provide facilities to express learnable partial differential equations with PyTorch

Leibniz is a python package which provide facilities to express learnable partial differential equations with PyTorch

Beijing ColorfulClouds Technology Co.,Ltd. 16 Aug 07, 2022
Make differentially private training of transformers easy for everyone

private-transformers This codebase facilitates fast experimentation of differentially private training of Hugging Face transformers. What is this? Why

Xuechen Li 73 Dec 28, 2022
Le dataset des images du projet d'IA de 2021

face-mask-dataset-ilc-2021 Le dataset des images du projet d'IA de 2021, Indiquez vos id git dans la issue pour les droits TL;DR: Choisir 200 images J

7 Nov 15, 2021
Author: Wenhao Yu ([email protected]). ACL 2022. Commonsense Reasoning on Knowledge Graph for Text Generation

Diversifying Commonsense Reasoning Generation on Knowledge Graph Introduction -- This is the pytorch implementation of our ACL 2022 paper "Diversifyin

DM2 Lab @ ND 61 Dec 30, 2022
Code for "3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop"

PyMAF This repository contains the code for the following paper: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop Hongwe

Hongwen Zhang 450 Dec 28, 2022
Office source code of paper UniFuse: Unidirectional Fusion for 360$^\circ$ Panorama Depth Estimation

UniFuse (RAL+ICRA2021) Office source code of paper UniFuse: Unidirectional Fusion for 360$^\circ$ Panorama Depth Estimation, arXiv, Demo Preparation I

Alibaba 47 Dec 26, 2022
The world's simplest facial recognition api for Python and the command line

Face Recognition You can also read a translated version of this file in Chinese 简体中文版 or in Korean 한국어 or in Japanese 日本語. Recognize and manipulate fa

Adam Geitgey 46.9k Jan 03, 2023
A PyTorch-centric hybrid classical-quantum machine learning framework

torchquantum A PyTorch-centric hybrid classical-quantum dynamic neural networks framework. News Add a simple example script using quantum gates to do

MIT HAN Lab 400 Jan 02, 2023
The codes and related files to reproduce the results for Image Similarity Challenge Track 1.

ISC-Track1-Submission The codes and related files to reproduce the results for Image Similarity Challenge Track 1. Required dependencies To begin with

Wenhao Wang 115 Jan 02, 2023
SHIFT15M: multiobjective large-scale fashion dataset with distributional shifts

[arXiv] The main motivation of the SHIFT15M project is to provide a dataset that contains natural dataset shifts collected from a web service IQON, wh

ZOZO, Inc. 138 Nov 24, 2022