Video Contrastive Learning with Global Context

Overview

Video Contrastive Learning with Global Context (VCLR)

This is the official PyTorch implementation of our VCLR paper.

Install dependencies

  • environments
    conda create --name vclr python=3.7
    conda activate vclr
    conda install numpy scipy scikit-learn matplotlib scikit-image
    pip install torch==1.7.1 torchvision==0.8.2
    pip install opencv-python tqdm termcolor gcc7 ffmpeg tensorflow==1.15.2
    pip install mmcv-full==1.2.7

Prepare datasets

Please refer to PREPARE_DATA to prepare the datasets.

Prepare pretrained MoCo weights

Following SeCo, we initialize the model with the pretrained weights of MoCo v2.

cd ~
git clone https://github.com/amazon-research/video-contrastive-learning.git
cd video-contrastive-learning
mkdir pretrain && cd pretrain
wget https://dl.fbaipublicfiles.com/moco/moco_checkpoints/moco_v2_200ep/moco_v2_200ep_pretrain.pth.tar
cd ..
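
For readers who want to inspect the downloaded checkpoint, the sketch below shows one common way to keep only the backbone weights. It assumes the key layout of the public MoCo release (`module.encoder_q.<name>`, with `fc` holding the projection head). Whether the VCLR training script performs the same filtering internally is an assumption, and `extract_backbone_weights` is a hypothetical helper, not part of this repository.

```python
# Sketch: keep only the query-encoder backbone weights from a MoCo v2 checkpoint.
# Assumption: keys look like "module.encoder_q.<name>", as in the public MoCo release.

def extract_backbone_weights(state_dict, prefix="module.encoder_q."):
    """Return a new dict with the prefix stripped and the projection head dropped."""
    backbone = {}
    for key, value in state_dict.items():
        if not key.startswith(prefix):
            continue  # skips key-encoder weights and the negative queue
        name = key[len(prefix):]
        if name.startswith("fc."):
            continue  # the MLP projection head is not part of the backbone
        backbone[name] = value
    return backbone


# Toy stand-in for torch.load(path, map_location="cpu")["state_dict"].
toy_state_dict = {
    "module.encoder_q.conv1.weight": "w0",
    "module.encoder_q.layer1.0.conv1.weight": "w1",
    "module.encoder_q.fc.0.weight": "head",
    "module.encoder_k.conv1.weight": "k0",
    "module.queue": "q",
}
backbone_weights = extract_backbone_weights(toy_state_dict)
print(sorted(backbone_weights))
```

With a real checkpoint, the values would be tensors and the resulting dict could be passed to a ResNet-50 backbone with `load_state_dict(..., strict=False)`.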

Self-supervised pretraining

bash shell/main_train.sh

Checkpoints will be saved to ./results

Downstream tasks

Linear evaluation

To evaluate the effectiveness of self-supervised learning, we conduct a linear evaluation (probing) on the Kinetics400 dataset. We first extract features with the pretrained weights and then train an SVM classifier on them to see how well the learned features perform.

bash shell/eval_svm.sh
  • Results

    Arch      Pretrained dataset  Epoch  Pretrained model  Acc. on K400
    ResNet50  Kinetics400         400    Download link     64.1
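
The idea behind linear evaluation (frozen features, linear classifier on top) can be illustrated with a minimal, dependency-free sketch. Everything here is an assumption made for illustration: the 2-D vectors stand in for extracted clip features, the problem is binary rather than 400-way, and the SVM is trained with plain hinge-loss subgradient descent. The actual `shell/eval_svm.sh` pipeline may use a different SVM implementation and settings.

```python
# Sketch: linear probing = frozen features + a linear SVM on top.
# The features below are made-up 2-D vectors standing in for clip embeddings.

def train_linear_svm(features, labels, lr=0.1, reg=0.01, epochs=20):
    """Binary linear SVM via hinge-loss subgradient descent; labels are +1/-1."""
    dim = len(features[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Inside the margin: hinge-loss step plus L2 shrinkage.
                w = [wi + lr * (y * xi - reg * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Outside the margin: only L2 shrinkage.
                w = [wi * (1 - lr * reg) for wi in w]
    return w, b


def predict(w, b, x):
    """Sign of the decision function, as +1 or -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


train_x = [[2, 2], [3, 2], [2, 3], [-2, -2], [-3, -2], [-2, -3]]
train_y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(train_x, train_y)

test_x, test_y = [[2.5, 2.5], [-2.5, -2.5]], [1, -1]
accuracy = sum(predict(w, b, x) == y for x, y in zip(test_x, test_y)) / len(test_y)
print(f"toy accuracy: {accuracy:.2f}")
```

Because the backbone stays frozen, the accuracy of such a classifier reflects the quality of the features rather than the capacity of the classifier. A multi-class setting such as Kinetics400 would need a one-vs-rest or similar extension.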

Video retrieval

bash shell/eval_retrieval.sh

Action recognition & action localization

Here, we use mmaction2 for both tasks. If you are not familiar with mmaction2, you can read the official documentation.

Installation

  • Step 1: Install mmaction2

    To make sure the results can be reproduced, please use our forked version of mmaction2 (version: 0.11.0):

    conda activate vclr
    cd ~
    git clone https://github.com/KuangHaofei/mmaction2
    
    cd mmaction2
    pip install -v -e .
  • Step 2: Prepare the pretrained weights

    Our pretrained backbone uses a different format from the mmaction2 backbone, so it must be converted to the mmaction2 format. We provide converted versions of our K400 pretrained weights for TSN and TSM. We also provide the conversion script; you can find it here.

    Download the pretrained weights into the checkpoints directory:

    cd ~/mmaction2
    mkdir -p checkpoints
    wget -P checkpoints https://haofeik-data.s3.amazonaws.com/VCLR/pretrained/vclr_mm.pth
    wget -P checkpoints https://haofeik-data.s3.amazonaws.com/VCLR/pretrained/vclr_mm_tsm.pth
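
To give a rough idea of what converting a backbone to another framework's format involves, here is a minimal sketch of key renaming on a state dict. The rename rules and the `backbone.` prefix are assumptions chosen for illustration (torchvision-style names mapped to names where each convolution and its batch norm are grouped under one module). They are not taken from the authors' conversion script, which remains the reference.

```python
import re

# Illustrative rename rules (assumptions, not the authors' script):
# torchvision-style names on the left, grouped conv/bn names on the right.
RENAME_RULES = [
    (re.compile(r"downsample\.0\."), "downsample.conv."),
    (re.compile(r"downsample\.1\."), "downsample.bn."),
    (re.compile(r"\bconv(\d)\."), r"conv\1.conv."),
    (re.compile(r"\bbn(\d)\."), r"conv\1.bn."),
]


def to_mmaction_keys(state_dict, prefix="backbone."):
    """Rename every key with the rules above and prepend the module prefix."""
    converted = {}
    for key, value in state_dict.items():
        for pattern, replacement in RENAME_RULES:
            key = pattern.sub(replacement, key)
        converted[prefix + key] = value
    return converted


# Toy state dict; real values would be tensors.
toy_weights = {
    "conv1.weight": 0,
    "bn1.running_mean": 1,
    "layer1.0.conv2.weight": 2,
    "layer1.0.downsample.1.bias": 3,
}
converted_weights = to_mmaction_keys(toy_weights)
for name in converted_weights:
    print(name)
```

Comparing the converted key names against `model.state_dict().keys()` of the target model is a quick way to check that a conversion covers every parameter.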

Action recognition

Make sure you have prepared the datasets and environment as described in the previous steps. Then, from the root directory of mmaction2, follow the steps below to fine-tune the TSN or TSM models for action recognition.

For each dataset, the training and testing settings can be found in the configuration files.

  • UCF101

    • config file: tsn_ucf101.py
    • train command:
      ./tools/dist_train.sh configs/recognition/tsn/vclr/tsn_ucf101.py 8 \
        --validate --seed 0 --deterministic
    • test command:
      python tools/test.py configs/recognition/tsn/vclr/tsn_ucf101.py \
        work_dirs/vclr/ucf101/latest.pth \
        --eval top_k_accuracy mean_class_accuracy --out result.json
  • HMDB51

    • config file: tsn_hmdb51.py
    • train command:
      ./tools/dist_train.sh configs/recognition/tsn/vclr/tsn_hmdb51.py 8 \
        --validate --seed 0 --deterministic
    • test command:
      python tools/test.py configs/recognition/tsn/vclr/tsn_hmdb51.py \
        work_dirs/vclr/hmdb51/latest.pth \
        --eval top_k_accuracy mean_class_accuracy --out result.json
  • SomethingSomethingV2: TSN

    • config file: tsn_sthv2.py
    • train command:
      ./tools/dist_train.sh configs/recognition/tsn/vclr/tsn_sthv2.py 8 \
        --validate --seed 0 --deterministic
    • test command:
      python tools/test.py configs/recognition/tsn/vclr/tsn_sthv2.py \
        work_dirs/vclr/tsn_sthv2/latest.pth \
        --eval top_k_accuracy mean_class_accuracy --out result.json
  • SomethingSomethingV2: TSM

    • config file: tsm_sthv2.py
    • train command:
      ./tools/dist_train.sh configs/recognition/tsm/vclr/tsm_sthv2.py 8 \
        --validate --seed 0 --deterministic
    • test command:
      python tools/test.py configs/recognition/tsm/vclr/tsm_sthv2.py \
        work_dirs/vclr/tsm_sthv2/latest.pth \
        --eval top_k_accuracy mean_class_accuracy --out result.json
  • ActivityNet

    • config file: tsn_activitynet.py
    • train command:
      ./tools/dist_train.sh configs/recognition/tsn/vclr/tsn_activitynet.py 8 \
        --validate --seed 0 --deterministic
    • test command:
      python tools/test.py configs/recognition/tsn/vclr/tsn_activitynet.py \
        work_dirs/vclr/tsn_activitynet/latest.pth \
        --eval top_k_accuracy mean_class_accuracy --out result.json
  • Results

    Arch  Dataset               Finetuned model  Acc.
    TSN   UCF101                Download link    85.6
    TSN   HMDB51                Download link    54.1
    TSN   SomethingSomethingV2  Download link    33.3
    TSM   SomethingSomethingV2  Download link    52.0
    TSN   ActivityNet           Download link    71.9
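
The test commands above request `top_k_accuracy` and `mean_class_accuracy`. As a reading aid, the sketch below computes both on a toy example using their usual definitions. It is written from those definitions, not copied from mmaction2, and it averages only over classes that appear in the labels, which is an assumption about how absent classes are handled.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


def mean_class_accuracy(scores, labels):
    """Average of per-class top-1 accuracy over the classes present in labels."""
    correct, total = {}, {}
    for row, label in zip(scores, labels):
        pred = max(range(len(row)), key=lambda c: row[c])
        total[label] = total.get(label, 0) + 1
        correct[label] = correct.get(label, 0) + (pred == label)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Five clips, three classes (made-up scores).
scores = [
    [0.7, 0.2, 0.1],  # label 0: correct at top-1
    [0.3, 0.6, 0.1],  # label 0: wrong at top-1, correct within top-2
    [0.1, 0.8, 0.1],  # label 1: correct at top-1
    [0.2, 0.3, 0.5],  # label 2: correct at top-1
    [0.5, 0.3, 0.2],  # label 2: wrong even within top-2
]
labels = [0, 0, 1, 2, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
mca = mean_class_accuracy(scores, labels)
print(top1, top2, round(mca, 4))
```

Top-1 accuracy weights every clip equally, while mean class accuracy weights every class equally, so the two differ whenever classes are imbalanced or unevenly hard.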

Action localization

  • Step 1: Fine-tune TSN on ActivityNet as described in the previous section. The commands below assume the finetuned model is saved at work_dirs/vclr/tsn_activitynet/latest.pth

  • Step 2: Extract ActivityNet features

    cd ~/mmaction2/tools/data/activitynet/
    
    python tsn_feature_extraction.py --data-prefix /home/ubuntu/data/ActivityNet/rawframes \
      --data-list /home/ubuntu/data/ActivityNet/anet_train_video.txt \
      --output-prefix /home/ubuntu/data/ActivityNet/rgb_feat \
      --modality RGB --ckpt /home/ubuntu/mmaction2/work_dirs/vclr/tsn_activitynet/latest.pth
    
    python tsn_feature_extraction.py --data-prefix /home/ubuntu/data/ActivityNet/rawframes \
      --data-list /home/ubuntu/data/ActivityNet/anet_val_video.txt \
      --output-prefix /home/ubuntu/data/ActivityNet/rgb_feat \
      --modality RGB --ckpt /home/ubuntu/mmaction2/work_dirs/vclr/tsn_activitynet/latest.pth
    
    python activitynet_feature_postprocessing.py \
      --rgb /home/ubuntu/data/ActivityNet/rgb_feat \
      --dest /home/ubuntu/data/ActivityNet/mmaction_feat

    Note: the root directory of ActivityNet is /home/ubuntu/data/ActivityNet/ in our case. Replace it with the path on your machine.

  • Step 3: Train and test the BMN model

    • train
      cd ~/mmaction2
      ./tools/dist_train.sh configs/localization/bmn/bmn_acitivitynet_feature_vclr.py 2 \
        --work-dir work_dirs/vclr/bmn_activitynet --validate --seed 0 --deterministic --bmn
    • test
      python tools/test.py configs/localization/bmn/bmn_acitivitynet_feature_vclr.py \
        work_dirs/vclr/bmn_activitynet/latest.pth \
        --bmn --eval AR@AN --out result.json
  • Results

    Arch  Dataset      Finetuned model  AUC   AR@100
    BMN   ActivityNet  Download link    65.5  73.8
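
The feature-extraction commands in Step 2 hard-code paths under /home/ubuntu. One way to adapt them is to generate the commands from two root directories, as in the sketch below. The helper `extraction_command` and the example locations `/data/anet` and `/opt/mmaction2` are hypothetical; the script name and flags are the ones used in Step 2.

```python
import shlex
from pathlib import PurePosixPath


def extraction_command(anet_root, mmaction_root, split):
    """Build the tsn_feature_extraction.py call for one split ("train" or "val")."""
    anet_root = PurePosixPath(anet_root)
    ckpt = PurePosixPath(mmaction_root) / "work_dirs/vclr/tsn_activitynet/latest.pth"
    args = [
        "python", "tsn_feature_extraction.py",
        "--data-prefix", str(anet_root / "rawframes"),
        "--data-list", str(anet_root / f"anet_{split}_video.txt"),
        "--output-prefix", str(anet_root / "rgb_feat"),
        "--modality", "RGB",
        "--ckpt", str(ckpt),
    ]
    # shlex.quote keeps paths containing spaces safe to paste into a shell.
    return " ".join(shlex.quote(a) for a in args)


# /data/anet and /opt/mmaction2 are placeholder locations, not paths from the repository.
for split in ("train", "val"):
    print(extraction_command("/data/anet", "/opt/mmaction2", split))
```

The printed commands are meant to be run from tools/data/activitynet/ inside mmaction2, as in Step 2.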

Feature visualization

Our feature visualization code is available here.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.
