[AAAI 2022] Source code for our paper "Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning"

Last update: Oct 26, 2022

Overview

SSVC

Source code for the paper "Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning" (AAAI 2022).

Figure: samples of the generated motion-preserved videos with threshold $\alpha=0.5$.

Requirements

Python 3
PyTorch 1.1+
PIL (Pillow)
FrEIA==0.2 (flow-based models)
lintel==1.0 (decodes mp4 videos on the fly)
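
Before launching a run, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the repository; it assumes the import names are `torch`, `PIL`, `FrEIA`, and `lintel`, and it checks presence only, not versions.

```python
import importlib.util

# Assumed import names for the requirements listed above.
REQUIRED_MODULES = ["torch", "PIL", "FrEIA", "lintel"]


def find_missing(modules):
    """Return the names in `modules` that cannot be located for import."""
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or oddly registered modules.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

Any name printed as missing still needs to be installed; the pinned versions (FrEIA 0.2, lintel 1.0) have to be verified separately.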

Structure

backbone
data
- lists: train/val lists (.txt)
- augmentation.py: train/val data augmentation used during SSL pre-training
- vDataLoader.py: data loader; set the paths to your data lists here
model
- advflow: flow-based model
- classifier.py: linear classifier for downstream tasks
- infonce.py: combines S$^2$VC with MoCo
flow
- pre-trained flow-based model weights
utils
main_pretrain.py: entry point for self-supervised pre-training
main_eval.py: entry point for supervised fine-tuning, retrieval, and testing
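
The structure above lists infonce.py as the place where S$^2$VC is combined with MoCo. MoCo is trained with the InfoNCE contrastive loss, so a framework-free sketch of that loss for a single query may help when reading the file. This is an illustration only: the function names are hypothetical, the temperature default of 0.07 is the value used in the MoCo paper rather than anything confirmed for this repository, and MoCo's momentum encoder and key queue are omitted.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce(query, positive_key, negative_keys, temperature=0.07):
    """InfoNCE loss for one query: cross-entropy with the positive at index 0."""
    q = l2_normalize(query)
    keys = [l2_normalize(positive_key)]
    keys += [l2_normalize(k) for k in negative_keys]
    logits = [dot(q, k) / temperature for k in keys]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]
```

The loss approaches zero when the query matches its positive key and is dissimilar to every negative, and grows when a negative is more similar than the positive.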

Self-supervised Pretrain

DDP

python -m torch.distributed.launch --nproc_per_node=1 --master_port 1234 main_pretrain.py --net r3d18 --img_dim 112 --seq_len 16 --aug_type 1 -t 0.5 -bsz 64 --gpu 0,1 --dataset XX

Single GPU

python main_pretrain.py --net r3d18 --img_dim 112 --seq_len 16 --aug_type 1 -t 0.5 -bsz 64 --gpu 0 --dataset XX

Evaluation

NN-Retrieval

python main_eval.py --retrieval --test SSL_Pt_Model_PTH --dataset XX --gpu X
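
For orientation, nearest-neighbour retrieval is usually scored as top-k recall: each test clip's feature is used to query the training-set features, and the query counts as a hit if any of its k nearest neighbours shares its class. The sketch below assumes that protocol with cosine similarity; the function names are hypothetical and the actual logic in main_eval.py may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def recall_at_k(query_feats, query_labels, gallery_feats, gallery_labels, k):
    """Fraction of queries with a same-class item among their k nearest gallery items."""
    hits = 0
    for feat, label in zip(query_feats, query_labels):
        # Rank gallery indices from most to least similar to this query.
        ranked = sorted(
            range(len(gallery_feats)),
            key=lambda i: cosine(feat, gallery_feats[i]),
            reverse=True,
        )
        if label in [gallery_labels[i] for i in ranked[:k]]:
            hits += 1
    return hits / len(query_feats)
```

In practice the features would come from the pre-trained backbone, and the ranking would be done with batched matrix operations rather than Python loops.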

Finetune

# fine-tune the entire model
python main_eval.py --train_what ft --pretrain SSL_Pt_Model_PTH --dataset XX --gpu XX \
--net r3d18 --img_dim 224 --seq_len 32

# freeze backbone, finetune last layer
python main_eval.py --train_what last --pretrain SSL_Pt_Model_PTH --dataset XX --gpu XX \
--net r3d18 --img_dim 224 --seq_len 32

Test

python main_eval.py --train_what XX --ten_crop --test Sup_Ft_Model_PTH --gpu X \
--dataset XX --net r3d18 --img_dim 224 --seq_len 32
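
The --ten_crop flag presumably refers to standard ten-crop testing: the four corner crops and the center crop of each frame, plus the horizontal mirror of each. The sketch below computes those crop boxes under that assumption; the function name is hypothetical, and how main_eval.py applies the crops to a clip and aggregates the ten predictions is not shown in this README.

```python
def ten_crop_boxes(width, height, crop_w, crop_h):
    """Return ten ((left, top, right, bottom), flipped) entries.

    The first five are the corner and center crops; the last five are the
    same boxes with flipped=True, meaning the cropped patch is mirrored
    horizontally.
    """
    if crop_w > width or crop_h > height:
        raise ValueError("crop size exceeds frame size")
    center_x = (width - crop_w) // 2
    center_y = (height - crop_h) // 2
    origins = [
        (0, 0),                             # top-left
        (width - crop_w, 0),                # top-right
        (0, height - crop_h),               # bottom-left
        (width - crop_w, height - crop_h),  # bottom-right
        (center_x, center_y),               # center
    ]
    boxes = [(x, y, x + crop_w, y + crop_h) for x, y in origins]
    return [(box, False) for box in boxes] + [(box, True) for box in boxes]
```

For video input, the same box would typically be applied to every frame of a clip so that the crop stays spatially consistent over time.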
