Codes for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing

Last update: Dec 10, 2022

Overview

Contrast and Mix (CoMix)

This repository contains the code for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing, published in Advances in Neural Information Processing Systems (NeurIPS) 2021.

Aadarsh Sahoo¹, Rutav Shah¹, Rameswar Panda², Kate Saenko²,³, Abir Das¹

¹ IIT Kharagpur, ² MIT-IBM Watson AI Lab, ³ Boston University

[Paper] [Project Page]

Fig. Temporal Contrastive Learning with Background Mixing and Target Pseudo-labels. Temporal contrastive loss (left) contrasts a single temporally augmented positive (same video, different speed) per anchor against the rest of the videos in a mini-batch as negatives. Incorporating background mixing (middle) provides additional positives per anchor that share the same action semantics but have a different background, alleviating background shift across domains. Incorporating target pseudo-labels (right) further enhances discriminability by contrasting target videos with the same pseudo-label as positives against the rest of the videos as negatives.
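To make the caption concrete, here is a minimal sketch of a generic InfoNCE-style contrastive loss for one anchor with one or more positives. It is not taken from this repository: the function names, the cosine similarity, the temperature of 0.5, the averaging over positives, and the toy embeddings are all assumptions for illustration, and the exact loss used by CoMix may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor, positives, negatives, temperature=0.5):
    """Generic InfoNCE-style loss for one anchor (assumed form).

    For each positive p the term is
        -log( exp(s(a, p) / t) / (exp(s(a, p) / t) + sum_n exp(s(a, n) / t)) )
    and the terms are averaged over all positives.
    """
    neg_sum = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    terms = []
    for p in positives:
        pos = math.exp(cosine(anchor, p) / temperature)
        terms.append(-math.log(pos / (pos + neg_sum)))
    return sum(terms) / len(terms)


# Toy 2-D embeddings (hypothetical values, for illustration only).
anchor = [1.0, 0.0]
fast_clip = [0.9, 0.1]        # same video at a different speed
bg_mixed_clip = [0.8, 0.2]    # same action, different background
other_videos = [[0.0, 1.0], [-1.0, 0.0]]

# One temporal positive per anchor (left panel of the figure).
loss_single = contrastive_loss(anchor, [fast_clip], other_videos)
# An extra background-mixed positive for the same anchor (middle panel).
loss_multi = contrastive_loss(anchor, [fast_clip, bg_mixed_clip], other_videos)
```

The loss is small when the anchor is close to its positives and far from the negatives, which is the behaviour the figure describes.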

Preparing the Environment

Conda

Please use the comix_environment.yml file to create the conda environment comix as:

conda env create -f comix_environment.yml

Pip

Please use the requirements.txt file to install all the required dependencies as:

pip install -r requirements.txt

Data Directory Structure

All the datasets should be stored in the folder ./data, one sub-folder per dataset (for example ./data/ucf_hmdb), and that sub-folder must be passed as the base_dir argument (for example base_dir=./data/ucf_hmdb).

UCF - HMDB

For the ucf_hmdb dataset with base_dir=./data/ucf_hmdb, the structure would be as follows:

.
├── ...
├── data
│   ├── ucf_hmdb
│   │   ├── ucf_videos
|   |   |   ├──
|   |   |   |   ├──
|   |   |   |   ├──
|   |   |   |   ├── ...
|   |   |   ├──
|   |   |   ├── ...
│   │   ├── hmdb_videos
|   |   ├── ucf_BG
|   |   └── hmdb_BG
│   └──
└──
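Before training, it can help to confirm that the four top-level folders from the tree above exist. The following is a small sketch rather than part of the repository; the helper name missing_subdirs is hypothetical, and it only checks the folder names listed in the tree, not their contents.

```python
from pathlib import Path

# Sub-folders shown in the tree above for base_dir=./data/ucf_hmdb.
EXPECTED_SUBDIRS = ("ucf_videos", "hmdb_videos", "ucf_BG", "hmdb_BG")


def missing_subdirs(base_dir, expected=EXPECTED_SUBDIRS):
    """Return the expected sub-folders that are absent under base_dir."""
    base = Path(base_dir)
    return [name for name in expected if not (base / name).is_dir()]


if __name__ == "__main__":
    missing = missing_subdirs("./data/ucf_hmdb")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Layout looks complete.")
```

The same helper can be reused for the other datasets by passing a different base_dir and a different tuple of expected folder names.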

Jester

For the Jester dataset with base_dir=./data/jester, the structure would be as follows:

.
├── ...
├── data
│   ├── jester
|   |   ├── jester_videos
|   |   |   ├──
|   |   |   |   ├──
|   |   |   |   ├──
|   |   |   |   ├── ...
|   |   |   ├──
|   |   |   ├── ...
|   |   ├── jester_BG
|   |   |   ├──
|   |   |   |   ├──
|   |   |   ├── ...
└── └── └──

Epic-Kitchens

For the Epic Kitchens dataset with base_dir=./data/epic_kitchens, the structure would be as follows (we follow the same structure as in the original dataset):

.
├── ...
├── data
│   ├── epic_kitchens
|   |   ├── epic_kitchens_videos
|   |   |   ├── train
|   |   |   |   ├── D1
|   |   |   |   |   ├──
|   |   |   |   |   |   ├──
|   |   |   |   |   |   ├──
|   |   |   |   |   |   ├── ...
|   |   |   |   |   ├──
|   |   |   |   |   ├── ...
|   |   |   |   ├── D2
|   |   |   |   └── D3
|   |   |   └── test
└── └── └── epic_kitchens_BG

For using datasets stored in some other directories, please pass the parameter base_dir accordingly.

Background Extraction using Temporal Median Filtering

Please refer to the folder ./background_extraction for the codes to extract backgrounds using temporal median filtering.
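For readers unfamiliar with the technique, the idea behind temporal median filtering can be sketched in a few lines: the background at each pixel is estimated as the median of that pixel's values over time, so objects that cover a pixel only briefly are filtered out. This is a pure-Python illustration on a toy grayscale video, not the code in ./background_extraction, which may differ in details; the function name is hypothetical.

```python
from statistics import median


def temporal_median_background(frames):
    """Estimate a background image as the per-pixel median over time.

    frames: list of T frames, each a list of rows, each row a list of
    grayscale pixel values. All frames must share the same height and width.
    """
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


# Toy 1x3 video: the middle pixel is briefly covered by a moving object (255).
frames = [
    [[10, 20, 30]],
    [[10, 255, 30]],
    [[10, 20, 30]],
]
background = temporal_median_background(frames)
print(background)  # [[10, 20, 30]]
```

For real videos an array library would be far faster, but the per-pixel logic is the same.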

Data

All the required split files are provided inside the directory ./video_splits.

The official download links for the datasets used for this paper are: [UCF-101] [HMDB-51] [Jester] [Epic Kitchens]

Training CoMix

Here are some sample, recommended commands to train CoMix for the following transfer tasks:

UCF -> HMDB from UCF-HMDB dataset:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --manual_seed 1 --dataset_name UCF-HMDB --src_dataset UCF --tgt_dataset HMDB --batch_size 8 --model_root ./checkpoints_ucf_hmdb --save_in_steps 500 --log_in_steps 50 --eval_in_steps 50 --pseudo_threshold 0.7 --warmstart_models True --num_iter_warmstart 4000 --num_iter_adapt 10000 --learning_rate 0.01 --learning_rate_ws 0.01 --lambda_bgm 0.1 --lambda_tpl 0.01 --base_dir ./data/ucf_hmdb

S -> T from Jester dataset:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --manual_seed 1 --dataset_name Jester --src_dataset S --tgt_dataset T --batch_size 8 --model_root ./checkpoints_jester --save_in_steps 500 --log_in_steps 50 --eval_in_steps 50 --pseudo_threshold 0.7 --warmstart_models True --num_iter_warmstart 4000 --num_iter_adapt 10000 --learning_rate 0.01 --learning_rate_ws 0.01 --lambda_bgm 0.1 --lambda_tpl 0.1 --base_dir ./data/jester

D1 -> D2 from Epic-Kitchens dataset:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --manual_seed 1 --dataset_name Epic-Kitchens --src_dataset D1 --tgt_dataset D2 --batch_size 8 --model_root ./checkpoints_epic_d1_d2 --save_in_steps 500 --log_in_steps 50 --eval_in_steps 50 --pseudo_threshold 0.7 --warmstart_models True --num_iter_warmstart 4000 --num_iter_adapt 10000 --learning_rate 0.01 --learning_rate_ws 0.01 --lambda_bgm 0.01 --lambda_tpl 0.01 --base_dir ./data/epic_kitchens
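The three commands above differ only in the dataset names, the paths, and the two loss weights. As a convenience, the sketch below assembles the same argument list from Python. It is not part of the repository: build_command and LAMBDAS are hypothetical names, and every flag and value is copied from the sample commands above.

```python
import shlex

# Loss weights taken from the three sample commands above.
LAMBDAS = {
    "UCF-HMDB": {"lambda_bgm": 0.1, "lambda_tpl": 0.01},
    "Jester": {"lambda_bgm": 0.1, "lambda_tpl": 0.1},
    "Epic-Kitchens": {"lambda_bgm": 0.01, "lambda_tpl": 0.01},
}


def build_command(dataset_name, src, tgt, base_dir, model_root):
    """Assemble the main.py argument list for one transfer task."""
    args = {
        "manual_seed": 1,
        "dataset_name": dataset_name,
        "src_dataset": src,
        "tgt_dataset": tgt,
        "batch_size": 8,
        "model_root": model_root,
        "save_in_steps": 500,
        "log_in_steps": 50,
        "eval_in_steps": 50,
        "pseudo_threshold": 0.7,
        "warmstart_models": True,
        "num_iter_warmstart": 4000,
        "num_iter_adapt": 10000,
        "learning_rate": 0.01,
        "learning_rate_ws": 0.01,
        **LAMBDAS[dataset_name],
        "base_dir": base_dir,
    }
    cmd = ["python", "main.py"]
    for flag, value in args.items():
        cmd += [f"--{flag}", str(value)]
    return cmd


cmd = build_command(
    "UCF-HMDB", "UCF", "HMDB", "./data/ucf_hmdb", "./checkpoints_ucf_hmdb"
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run, with CUDA_VISIBLE_DEVICES set in the environment as in the commands above.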

For a detailed description of the arguments, use:

python main.py --help

Citing CoMix

If you use the code in this repository, please consider citing CoMix. Thanks!

@article{sahoo2021contrast,
  title={Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing},
  author={Sahoo, Aadarsh and Shah, Rutav and Panda, Rameswar and Saenko, Kate and Das, Abir},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}

Owner

Computer Vision and Intelligence Research (CVIR)
