Code for Domain Adaptive Video Segmentation via Temporal Consistency Regularization in ICCV 2021

Last update: Dec 12, 2022

Overview

Domain Adaptive Video Segmentation via Temporal Consistency Regularization

Updates

08/2021: check out our domain adaptation for semantic segmentation paper RDA: Robust Domain Adaptation via Fourier Adversarial Attacking (accepted to ICCV 2021). This paper presents RDA, a robust domain adaptation technique that introduces adversarial attacking to mitigate overfitting in UDA. Code available.
06/2021: check out our domain adaptation for panoptic segmentation paper Cross-View Regularization for Domain Adaptive Panoptic Segmentation (accepted to CVPR 2021). We design a domain adaptive panoptic segmentation network that exploits inter-style consistency and inter-task regularization for optimal domain adaptation in panoptic segmentation. Code available.
06/2021: check out our domain generalization paper FSDR: Frequency Space Domain Randomization for Domain Generalization (accepted to CVPR 2021). Inspired by the idea of JPEG that converts spatial images into multiple frequency components (FCs), we propose Frequency Space Domain Randomization (FSDR) that randomizes images in frequency space by keeping domain-invariant FCs and randomizing domain-variant FCs only. Code available.
06/2021: check out our domain adaptation for semantic segmentation paper Scale variance minimization for unsupervised domain adaptation in image segmentation (accepted to Pattern Recognition 2021). We design a scale variance minimization (SVMin) method by enforcing the intra-image semantic structure consistency in the target domain. Code available.
06/2021: check out our domain adaptation for object detection paper Uncertainty-Aware Unsupervised Domain Adaptation in Object Detection (accepted to IEEE TMM 2021). We design an uncertainty-aware domain adaptation network (UaDAN) that introduces conditional adversarial learning to align well-aligned and poorly-aligned samples separately in different manners. Code available.

Paper

Domain Adaptive Video Segmentation via Temporal Consistency Regularization

Dayan Guan, Jiaxing Huang, Aoran Xiao, Shijian Lu
School of Computer Science and Engineering, Nanyang Technological University, Singapore
International Conference on Computer Vision, 2021.

If you find this code useful for your research, please cite our paper:

@inproceedings{guan2021domain,
  title={Domain adaptive video segmentation via temporal consistency regularization},
  author={Guan, Dayan and Huang, Jiaxing and Xiao, Aoran and Lu, Shijian},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={8053--8064},
  year={2021}
}

Abstract

Video semantic segmentation is an essential task for the analysis and understanding of videos. Recent efforts largely focus on supervised video segmentation by learning from fully annotated data, but the learnt models often experience a clear performance drop when applied to videos of a different domain. This paper presents DA-VSN, a domain adaptive video segmentation network that addresses domain gaps in videos by temporal consistency regularization (TCR) for consecutive frames of target-domain videos. DA-VSN consists of two novel and complementary designs. The first is cross-domain TCR that guides the prediction of target frames to have similar temporal consistency as that of source frames (learnt from annotated source data) via adversarial learning. The second is intra-domain TCR that guides unconfident predictions of target frames to have similar temporal consistency as confident predictions of target frames. Extensive experiments demonstrate the superiority of our proposed domain adaptive video segmentation network which outperforms multiple baselines consistently by large margins.

Installation

Conda environment:

conda create -n DA-VSN python=3.6
conda activate DA-VSN
conda install -c menpo opencv
pip install torch==1.2.0 torchvision==0.4.0

Clone ADVENT:

git clone https://github.com/valeoai/ADVENT.git
pip install -e ./ADVENT

Clone the repo:

git clone https://github.com/Dayan-Guan/DA-VSN.git
pip install -e ./DA-VSN
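
After the steps above, it can help to confirm that the installed packages are importable. The snippet below is a minimal sketch using only the standard library. The helper name `missing_modules` is made up for this example, and the module names `advent` and `davsn` are assumptions about what the two editable installs expose; `cv2` is the usual import name for OpenCV.

```python
import importlib.util
import sys

# Import names assumed for the packages installed above; `advent` and `davsn`
# are guesses at what the two `pip install -e` commands expose.
REQUIRED_MODULES = ["torch", "torchvision", "cv2", "advent", "davsn"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_modules(REQUIRED_MODULES)
    print("missing modules:", ", ".join(missing) if missing else "none")
```

Run it inside the activated DA-VSN conda environment; an empty "missing" list suggests the installation steps completed.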

Preparation

Dataset:

Cityscapes-Seq:

DA-VSN/data/Cityscapes/                       % Cityscapes dataset root
DA-VSN/data/Cityscapes/leftImg8bit_sequence   % leftImg8bit_sequence_trainvaltest
DA-VSN/data/Cityscapes/gtFine                 % gtFine_trainvaltest

VIPER:

DA-VSN/data/Viper/                            % VIPER dataset root
DA-VSN/data/Viper/train/img                   % Modality: Images; Frames: *[0-9]; Sequences: 00-77; Format: jpg
DA-VSN/data/Viper/train/cls                   % Modality: Semantic class labels; Frames: *0; Sequences: 00-77; Format: png

SYNTHIA-Seq:

DA-VSN/data/SynthiaSeq/                       % SYNTHIA-Seq dataset root
DA-VSN/data/SynthiaSeq/SEQS-04-DAWN           % SYNTHIA-SEQS-04-DAWN
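
The directory layout listed above can be checked with a short script before training. This is a minimal sketch, not part of the repository: the helper name `missing_dataset_dirs` is hypothetical, and the paths are copied from the listings above, relative to the DA-VSN root.

```python
from pathlib import Path

# Directories listed in the Preparation section, relative to the DA-VSN root.
EXPECTED_DIRS = [
    "data/Cityscapes/leftImg8bit_sequence",
    "data/Cityscapes/gtFine",
    "data/Viper/train/img",
    "data/Viper/train/cls",
    "data/SynthiaSeq/SEQS-04-DAWN",
]


def missing_dataset_dirs(repo_root):
    """Return the expected dataset directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    for d in missing_dataset_dirs("DA-VSN"):
        print("missing:", d)
```

No output means every listed directory was found; it does not verify the contents of those directories.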

Pre-trained models: Download the pre-trained models and put them in DA-VSN/pretrained_models

Optical Flow Estimation

For quick preparation: Download the optical flow estimated from the Cityscapes-Seq validation set from the release files (Estimated_optical_flow_Cityscapes-Seq_val.zip) and unzip it in DA-VSN/data

DA-VSN/data/Cityscapes_val_optical_flow_scale512/  % unzip Cityscapes_val_optical_flow_scale512.zip

Clone flownet2-pytorch:

git clone https://github.com/NVIDIA/flownet2-pytorch.git

Download the pre-trained FlowNet2 model and put it in flownet2-pytorch/pretrained_models

Use flownet2-pytorch to estimate optical flow

Evaluation on Pretrained Models

VIPER → Cityscapes-Seq:

cd DA-VSN/davsn/scripts
python test.py --cfg configs/davsn_viper2city_pretrained.yml

SYNTHIA-Seq → Cityscapes-Seq:

python test.py --cfg configs/davsn_syn2city_pretrained.yml

Training and Testing

VIPER → Cityscapes-Seq:

cd DA-VSN/davsn/scripts
python train.py --cfg configs/davsn_viper2city.yml
python test.py --cfg configs/davsn_viper2city.yml

SYNTHIA-Seq → Cityscapes-Seq:

python train.py --cfg configs/davsn_syn2city.yml
python test.py --cfg configs/davsn_syn2city.yml
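
The train-then-test sequence above can also be driven from a small script, which makes it explicit that both settings run from DA-VSN/davsn/scripts. This is only a convenience sketch: `build_commands` and `run_setting` are hypothetical helpers, and the config paths are the ones shown in the commands above.

```python
import subprocess

# Config names taken from the commands above.
CONFIGS = {
    "viper2city": "configs/davsn_viper2city.yml",
    "syn2city": "configs/davsn_syn2city.yml",
}


def build_commands(setting):
    """Return the train and test command lines for one adaptation setting."""
    cfg = CONFIGS[setting]
    return [["python", script, "--cfg", cfg] for script in ("train.py", "test.py")]


def run_setting(setting, scripts_dir="DA-VSN/davsn/scripts"):
    """Run training, then testing, from the scripts directory; stop on the first failure."""
    for cmd in build_commands(setting):
        subprocess.run(cmd, cwd=scripts_dir, check=True)
```

For example, `run_setting("viper2city")` would run the two VIPER → Cityscapes-Seq commands in order.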

Acknowledgements

This codebase borrows heavily from ADVENT and flownet2-pytorch.

Contact

If you have any questions, please contact: [email protected]

Comments

Optical flow is not used for propagating

Hi, author. I have two questions. First, I find that you don't use the flow to propagate the previous frame to the current frame. You only use it as a constraint, so that pixels appearing in both cf and kf are retained. This seems unreasonable. I refined the code using resample2D to warp kf to cf, but the result improved only a little.

Second, I tried training DAVSN 3 times on a 1080Ti and a 2080Ti following the settings you gave, but I only get 46 mIoU, which is 2 points lower than yours.
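
To make the first question concrete, here is a simplified illustration of what "propagating" a key-frame prediction with optical flow means. It is a pure-Python nearest-neighbour sketch written for this note; it is not the repository's warp_bilinear or flownet2's resample2D. The function name `warp_nearest` and the flow convention (offsets pointing from each current-frame pixel to its position in the key frame) are assumptions.

```python
def warp_nearest(src, flow, fill=None):
    """Backward-warp a 2D grid: out[y][x] = src[y + dy][x + dx], rounded to the nearest pixel.

    src  : H x W nested list (for example, class ids predicted on the key frame)
    flow : H x W nested list of (dx, dy) offsets pointing from each current-frame
           pixel to its location in the key frame (an assumed convention)
    fill : value for pixels whose source location falls outside the image
    """
    height, width = len(src), len(src[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            sx, sy = int(round(x + dx)), int(round(y + dy))
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = src[sy][sx]
    return out


kf_pred = [[1, 2, 3],
           [4, 5, 6]]
flow = [[(1, 0)] * 3 for _ in range(2)]  # every pixel samples its right-hand neighbour
print(warp_nearest(kf_pred, flow))  # [[2, 3, None], [5, 6, None]]
```

Pixels left at `fill` are those with no counterpart in the key frame, so the same pass yields both the propagated prediction and a visibility mask.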

opened by EDENpraseHAZARD 5

Question on Synthia-seq dataset

Dear authors,

Thank you for your great work. I have several questions about the synthia-seq->cityscape-seq adaptation. The first is about the scale of the training data. It seems that, compared with the VIPER dataset, synthia-seq contains only one labeled video with 850 frames in total. Is that true? The second is that 11 classes are reported in Table 4, but the synthia-seq dataloader uses 12 classes, so I'm not sure whether the fence class is considered during adaptation. https://github.com/Dayan-Guan/DA-VSN/blob/d110ff70dacec4156a3787eb49e7f2448dfb91a5/davsn/dataset/SynthiaSeq.py#L11

Thanks in advance for your help!

opened by xyIsHere 3

Details of SYNTHIA-Seq dataset

Hi author, I have downloaded SYNTHIA-Seq, but I found there are 'Stereo_Left' and 'Stereo_Right' folders. And each contains 'Omni_B', 'Omni_F', 'Omni_L' and 'Omni_R'. I wonder which one is used for training.

opened by EDENpraseHAZARD 2

Could you please provide 'estimated_optical_flow' for training DA-VSN

Hi @Dayan-Guan , thank you for open-sourcing your work!

I am trying to follow this work. For training DA-VSN from scratch, the optical flows (for the 3 datasets used in your paper) estimated by FlowNet2 are needed. However, the instruction in your README only includes the evaluation part. I also see from the recent issues that you have provided the code and more instructions for the training part. But the code is not a complete one I guess so I cannot generate the optical flows with it.

Could you please provide your generated optical flows for all 3 datasets used in your paper? It would save us time. Or could you please have a look again at the provided 'Code_for_optical_flow_estimation'? So that it is runnable for generating optical flows on our own.

Thanks in advance!

Regards

opened by ldkong1205 1

In train_video_UDA.py, line 251, trg_prob_warp = warp_bilinear(trg_prob, trg_flow_warp): the image may be flipped, but the optical flow is not

Hello! I really enjoyed reading your work! I encountered a problem while running train_video_UDA.py.

In line 251, trg_prob_warp = warp_bilinear(trg_prob, trg_flow_warp), the variable trg_prob is the prediction for trg_img_b_wk, and trg_img_b_wk is obtained from trg_img_b by flipping with a certain probability, but trg_flow_warp does not seem to be flipped. Consider this situation: if trg_img_b_wk is flipped and trg_flow_warp is not, then trg_prob_warp and trg_img_d_st do not seem to be consistent, because the image is flipped but the optical flow is not, even though trg_pl is flipped in lines 256~258.
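
The general point behind this question can be shown on a toy example: when an image is mirrored left to right, a flow field used to warp it has to be mirrored as well, and its horizontal component negated. The sketch below is pure Python written for illustration; `warp_nearest`, `hflip` and `hflip_flow` are hypothetical helpers, not functions from the repository, and whether train_video_UDA.py compensates for the flip elsewhere is not confirmed here.

```python
def warp_nearest(src, flow, fill=None):
    """Backward warp: out[y][x] = src[y + dy][x + dx], nearest pixel, fill if out of bounds."""
    height, width = len(src), len(src[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            sx, sy = int(round(x + dx)), int(round(y + dy))
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = src[sy][sx]
    return out


def hflip(grid):
    """Mirror an H x W nested list left to right."""
    return [list(reversed(row)) for row in grid]


def hflip_flow(flow):
    """Mirror a flow field left to right and negate its horizontal component.

    After mirroring, a motion to the right becomes a motion to the left,
    so dx changes sign while dy is unchanged.
    """
    return [[(-dx, dy) for dx, dy in reversed(row)] for row in flow]


img = [[1, 2, 3, 4]]
flow = [[(1, 0)] * 4]  # every pixel samples its right-hand neighbour

reference = hflip(warp_nearest(img, flow))               # warp first, then flip
consistent = warp_nearest(hflip(img), hflip_flow(flow))  # flip both, then warp
inconsistent = warp_nearest(hflip(img), flow)            # image flipped, flow left as is

print(consistent == reference)    # True
print(inconsistent == reference)  # False
```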

opened by zhe-juanz 0

Some questions about data loading

Hi, this is a very enlightening work! @xing0047 @Dayan-Guan I would like to ask a question.

When I use ./TPS/tps/scripts/train.py to read SynthiaSeq or ViperSeq data, I debugged the code and found the following behavior.

I printed some variables in __getitem__().

With shuffle set to False in source_loader = data.DataLoader() and batch_size=cfg.TRAIN.BATCH_SIZE_SOURCE set to 1, I found that although batch_size=1, 4 images and their corresponding previous frames are loaded at once, instead of 1 image and its previous frame.

I also found that the 4 loaded images are out of order, such as 2-1-3-4 rather than 1-2-3-4, which seems to violate the shuffle setting.

Could you please explain this? Thank you very much!

The print code is as follows:

The printed results are as follows; the order differs on each run:

---index--- 1
---index--- 0
---index--- 2
img_file tps/data/SynthiaSeq/SEQS-04-DAWN/rgb/000002.png
label_file tps/data/SynthiaSeq/SEQS-04-DAWN/label/000002.png
---index--- 3
img_file tps/data/SynthiaSeq/SEQS-04-DAWN/rgb/000001.png
label_file tps/data/SynthiaSeq/SEQS-04-DAWN/label/000001.png
img_file tps/data/SynthiaSeq/SEQS-04-DAWN/rgb/000003.png
label_file tps/data/SynthiaSeq/SEQS-04-DAWN/label/000003.png
img_file tps/data/SynthiaSeq/SEQS-04-DAWN/rgb/000004.png
label_file tps/data/SynthiaSeq/SEQS-04-DAWN/label/000004.png
image_kf tps/data/SynthiaSeq/SEQS-04-DAWN/rgb/000003.png
image_kf tps/data/SynthiaSeq/SEQS-04-DAWN/rgb/000002.png
image_kf tps/data/SynthiaSeq/SEQS-04-DAWN/rgb/000001.png
image_kf tps/data/SynthiaSeq/SEQS-04-DAWN/rgb/000000.png
label_kf tps/data/SynthiaSeq/SEQS-04-DAWN/label/000003.png
label_kf tps/data/SynthiaSeq/SEQS-04-DAWN/label/000002.png
label_kf tps/data/SynthiaSeq/SEQS-04-DAWN/label/000001.png
label_kf tps/data/SynthiaSeq/SEQS-04-DAWN/label/000000.png
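
One possible explanation for the output above, offered as an assumption rather than a confirmed diagnosis: if the loader is created with several worker processes (num_workers > 0 in torch.utils.data.DataLoader), the workers fetch upcoming samples ahead of time and in parallel. Prints inside __getitem__ can then appear for several indices at once and in no fixed order, even though the batches themselves are still delivered in index order when shuffle is False. The sketch below mimics that behavior with a standard-library thread pool; it is an analogy, not PyTorch code, and `load_sample` and `iterate` are hypothetical stand-ins.

```python
import random
import threading
import time
from concurrent.futures import ThreadPoolExecutor

call_log = []  # order in which individual loads finished
log_lock = threading.Lock()


def load_sample(index):
    """Hypothetical stand-in for Dataset.__getitem__ with a random load time."""
    time.sleep(random.uniform(0.0, 0.01))
    with log_lock:
        call_log.append(index)
    return index


def iterate(num_samples, num_workers=4):
    """Load samples concurrently, but return the results in index order."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(load_sample, range(num_samples)))


if __name__ == "__main__":
    print("returned order:", iterate(8))
    print("finish order:  ", call_log)
```

The returned order is always 0 to 7, while the finish order usually varies from run to run. If the same mechanism is at work in the training script, setting the number of workers to 0 while debugging should make the prints sequential.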
opened by zhe-juanz 0

Regarding Synthia-Seq Dataset

I really enjoyed reading your work. I have a question regarding the synthia-seq dataset. In the paper you mention that you used 8000 synthesized video frames, but on GitHub the Synthia-Seq Dawn sequence contains only 850 images. Can you please clarify this discrepancy? Thank you.

opened by Ihsan149 0

Optical flow for training

Thanks for your great job! I want to train DA-VSN, but I don't know how to get Estimated_optical_flow_Viper_train, Estimated_optical_flow_Cityscapes-Seq_train. I didn't find the detail about optical flow from readme or paper.

opened by EDENpraseHAZARD 11

Releases(Latest)

Latest(Jul 24, 2021)

Source code(tar.gz)
Source code(zip)
davsn_syn2city_pretrained.pth(167.70 MB)
davsn_viper2city_pretrained.pth(168.96 MB)
DeepLab_resnet_pretrained_imagenet.pth(168.49 MB)
Estimated_optical_flow_Cityscapes-Seq_val.zip(417.61 MB)
