Self-supervised Augmentation Consistency for Adapting Semantic Segmentation (CVPR 2021)

This repository contains the official implementation of our paper:

Self-supervised Augmentation Consistency for Adapting Semantic Segmentation
Nikita Araslanov and Stefan Roth
To appear at CVPR 2021. [arXiv preprint]

We obtain state-of-the-art accuracy in adapting semantic segmentation by enforcing consistency across photometric and similarity transformations. We use neither style transfer nor adversarial training.
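
For intuition, here is a minimal sketch of this consistency idea, not the repository's exact training code: pseudo-labels obtained from a clean view supervise the prediction on a photometrically perturbed view. The momentum network, the fixed confidence threshold and all names are illustrative simplifications.

```python
import torch
import torch.nn.functional as F

def consistency_loss(student, momentum_net, image, photo_aug, conf_thresh=0.9):
    # Pseudo-labels come from the slowly updated momentum network;
    # no gradients flow through this branch.
    with torch.no_grad():
        probs = torch.softmax(momentum_net(image), dim=1)  # (B, C, H, W)
        conf, pseudo = probs.max(dim=1)                    # (B, H, W)
        pseudo[conf < conf_thresh] = 255                   # drop unreliable pixels

    # The student sees a photometrically perturbed view of the same image.
    logits = student(photo_aug(image))

    return F.cross_entropy(logits, pseudo, ignore_index=255)
```

In the full method, consistency is additionally enforced under similarity transformations (flips and multi-scale crops), and the confidence thresholds are maintained per class; the sketch keeps only the photometric part.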

Contact: Nikita Araslanov fname.lname (at) visinf.tu-darmstadt.de


Installation

Requirements. To reproduce our results, we recommend Python >=3.6, PyTorch >=1.4 and CUDA >=10.0. At least two Titan X GPUs (12GB) or equivalent are required for VGG-16; ResNet-101 and VGG-16/FCN need four.

  1. Create a conda environment:
conda create --name da-sac
source activate da-sac
  2. Install PyTorch >=1.4 (see PyTorch instructions). For example,
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
  3. Install the dependencies:
pip install -r requirements.txt
  4. Download the data (Cityscapes, GTA5, SYNTHIA) and create symlinks in the ./data folder, as follows:
./data/cityscapes -> <symlink to Cityscapes>
./data/cityscapes/gtFine2/
./data/cityscapes/leftImg8bit/

./data/game -> <symlink to GTA>
./data/game/labels_cs
./data/game/images

./data/synthia  -> <symlink to SYNTHIA>
./data/synthia/labels_cs
./data/synthia/RGB
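
The symlinks can be created with ln -s, or with a short Python snippet such as the one below; the right-hand paths are placeholders for your local dataset roots.

```python
import os

# Placeholder roots -- point these at your local copies of the datasets.
roots = {
    "data/cityscapes": "/path/to/Cityscapes",
    "data/game":       "/path/to/GTA5",
    "data/synthia":    "/path/to/SYNTHIA",
}

os.makedirs("data", exist_ok=True)
for link, target in roots.items():
    if not os.path.exists(link):
        os.symlink(target, link)
```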

Note that all ground-truth label IDs (Cityscapes, GTA5 and SYNTHIA) should be converted to Cityscapes train IDs. The label directories in the above example (gtFine2, labels_cs) therefore refer not to the original labels, but to these converted semantic maps.
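
For Cityscapes ground truth, this conversion is the standard labelId-to-trainId mapping defined in cityscapesscripts; a minimal sketch is below (GTA5 and SYNTHIA need analogous mappings into the same 19 train IDs). The paths and function name are illustrative.

```python
import numpy as np
from PIL import Image

# Cityscapes label IDs -> train IDs (0-18; 255 = ignore),
# following the standard 19-class mapping from cityscapesscripts.
ID_TO_TRAIN_ID = 255 * np.ones(256, dtype=np.uint8)
for label_id, train_id in [(7, 0), (8, 1), (11, 2), (12, 3), (13, 4),
                           (17, 5), (19, 6), (20, 7), (21, 8), (22, 9),
                           (23, 10), (24, 11), (25, 12), (26, 13), (27, 14),
                           (28, 15), (31, 16), (32, 17), (33, 18)]:
    ID_TO_TRAIN_ID[label_id] = train_id

def convert(src_png, dst_png):
    label = np.asarray(Image.open(src_png))                # labelId map
    Image.fromarray(ID_TO_TRAIN_ID[label]).save(dst_png)   # trainId map
```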

Training

Training from ImageNet initialisation proceeds in three steps:

  1. Training the baseline (ABN)
  2. Generating the weights for importance sampling
  3. Training with augmentation consistency from the ABN baseline

1. Training the baseline (ABN)

Here the inputs are the ImageNet models available from the official PyTorch repository. We provide links to these models for convenience.

| Backbone   | Link                          |
|------------|-------------------------------|
| ResNet-101 | resnet101-5d3b4d8f.pth (171M) |
| VGG-16     | vgg16_bn-6c64b313.pth (528M)  |

By default, these models should be placed in ./models/pretrained/ (the location is configurable via MODEL.INIT_MODEL).
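
As a quick sanity check, the downloaded checkpoint loads directly into the matching torchvision backbone (illustrative only; the repository attaches its own DeepLab/FCN heads on top of the backbone):

```python
import torch
import torchvision

backbone = torchvision.models.resnet101()
state = torch.load("models/pretrained/resnet101-5d3b4d8f.pth", map_location="cpu")
backbone.load_state_dict(state)  # raises if any keys are missing or unexpected
```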

To run the training:

bash ./launch/train.sh [gta|synthia] [resnet101|vgg16|vgg16fcn] base

where the first argument specifies the source domain and the second the network architecture. The third argument, base, instructs the script to train the baseline.

If you would like to skip this step, you can use our pre-trained models:

Source domain: GTA5

| Backbone   | Arch.     | IoU (val) | Link                         | MD5             |
|------------|-----------|-----------|------------------------------|-----------------|
| ResNet-101 | DeepLabv2 | 40.8      | baseline_abn_e040.pth (336M) | 9fe17[...]c11fc |
| VGG-16     | DeepLabv2 | 37.1      | baseline_abn_e115.pth (226M) | d4ffc[...]ef755 |
| VGG-16     | FCN       | 36.7      | baseline_abn_e040.pth (1.1G) | aa2e9[...]bae53 |

Source domain: SYNTHIA

| Backbone   | Arch.     | IoU (val) | Link                         | MD5             |
|------------|-----------|-----------|------------------------------|-----------------|
| ResNet-101 | DeepLabv2 | 36.3      | baseline_abn_e090.pth (336M) | b3431[...]d1a83 |
| VGG-16     | DeepLabv2 | 34.4      | baseline_abn_e070.pth (226M) | 3af24[...]5b24e |
| VGG-16     | FCN       | 31.6      | baseline_abn_e040.pth (1.1G) | 5f457[...]e4b3a |

Tip: You can download these files (as well as the final models below) with tools/download_baselines.sh:

cp tools/download_baselines.sh snapshots/cityscapes/baselines/
cd snapshots/cityscapes/baselines/
bash ./download_baselines.sh

2. Generating weights for importance sampling

To generate the weights, you need to

  1. generate mask predictions with your baseline (see inference below);
  2. run tools/compute_image_weights.py, which reads in those predictions and counts the predictions per class.
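
Conceptually, the statistic behind these weights is the inverse class frequency of the baseline's predictions; a rough sketch of that computation follows (the prediction directory is a placeholder, and the actual script derives per-image sampling weights from such counts).

```python
import glob
import numpy as np
from PIL import Image

NUM_CLASSES = 19  # Cityscapes train IDs

# Count predicted pixels per class over all baseline mask predictions.
counts = np.zeros(NUM_CLASSES, dtype=np.int64)
for path in glob.glob("masks/*.png"):  # placeholder path
    pred = np.asarray(Image.open(path))
    counts += np.bincount(pred[pred < NUM_CLASSES], minlength=NUM_CLASSES)

# Invert the observed frequencies: rare classes receive large weights.
freq = counts / max(counts.sum(), 1)
weights = 1.0 / np.maximum(freq, 1e-8)
weights /= weights.sum()
```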

If you would like to skip this step, you can use the weights we computed for the ABN baselines above:

| Backbone   | Arch.     | Source: GTA5                  | Source: SYNTHIA                   |
|------------|-----------|-------------------------------|-----------------------------------|
| ResNet-101 | DeepLabv2 | cs_weights_resnet101_gta.data | cs_weights_resnet101_synthia.data |
| VGG-16     | DeepLabv2 | cs_weights_vgg16_gta.data     | cs_weights_vgg16_synthia.data     |
| VGG-16     | FCN       | cs_weights_vgg16fcn_gta.data  | cs_weights_vgg16fcn_synthia.data  |

Tip: The bash script data/download_weights.sh downloads all these importance sampling weights into the current directory.
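
These weights drive importance sampling of source images during training. Whether the training code wires them in exactly this way is not shown here, but a self-contained sketch with PyTorch's WeightedRandomSampler conveys the idea (the dataset and weights below are random stand-ins):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Stand-ins: three random "images" and their per-image weights.
dataset = TensorDataset(torch.randn(3, 3, 512, 1024))
weights = torch.rand(3)  # replace with the downloaded per-image weights

sampler = WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)
loader = DataLoader(dataset, batch_size=1, sampler=sampler)

for (batch,) in loader:
    print(batch.shape)  # torch.Size([1, 3, 512, 1024])
```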

3. Training with augmentation consistency

To train the model with augmentation consistency, we use the same shell script as in step 1, but without the argument base:

bash ./launch/train.sh [gta|synthia] [resnet101|vgg16|vgg16fcn]

Make sure to specify your baseline snapshot via the RESUME bash variable, either set in the environment (export RESUME=...) or directly in the shell script (commented out by default).

We provide our final models for download.

Source domain: GTA5

| Backbone   | Arch.     | IoU (val) | IoU (test) | Link                  | MD5             |
|------------|-----------|-----------|------------|-----------------------|-----------------|
| ResNet-101 | DeepLabv2 | 53.8      | 55.7       | final_e136.pth (504M) | 59c16[...]5a32f |
| VGG-16     | DeepLabv2 | 49.8      | 51.0       | final_e184.pth (339M) | 0accb[...]d5881 |
| VGG-16     | FCN       | 49.9      | 50.4       | final_e112.pth (1.6G) | e69f8[...]f729b |

Source domain: SYNTHIA

| Backbone   | Arch.     | IoU (val) | IoU (test) | Link                  | MD5             |
|------------|-----------|-----------|------------|-----------------------|-----------------|
| ResNet-101 | DeepLabv2 | 52.6      | 52.7       | final_e164.pth (504M) | a7682[...]db742 |
| VGG-16     | DeepLabv2 | 49.1      | 48.3       | final_e164.pth (339M) | c5b31[...]5fdb7 |
| VGG-16     | FCN       | 46.8      | 45.8       | final_e098.pth (1.6G) | efb74[...]845cc |

Inference and evaluation

Inference

To run single-scale inference from your snapshot, use infer_val.py. The bash script launch/infer_val.sh provides an easy way to run the inference by specifying a few variables:

# validation/training set
FILELIST=[val_cityscapes|train_cityscapes] 
# configuration used for training
CONFIG=configs/[deeplabv2_vgg16|deeplab_resnet101|fcn_vgg16]_train.yaml
# the following 3 variables effectively specify the path to the snapshot
EXP=...
RUN_ID=...
SNAPSHOT=...
# the snapshot path is defined as
# SNAPSHOT_PATH=snapshots/cityscapes/${EXP}/${RUN_ID}/${SNAPSHOT}.pth
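
Internally, single-scale inference reduces to a forward pass, upsampling of the logits to the input resolution, and an argmax. A stripped-down sketch, with model construction and preprocessing omitted:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def single_scale_inference(model, image):
    # image: (1, 3, H, W) tensor, normalised as during training.
    model.eval()
    logits = model(image)                                   # (1, C, h, w)
    logits = F.interpolate(logits, size=image.shape[-2:],
                           mode="bilinear", align_corners=False)
    return logits.argmax(dim=1).squeeze(0).cpu().numpy()    # (H, W) train IDs
```

Note that such predictions are in Cityscapes train IDs; the official evaluation script expects label IDs, so convert back before evaluating.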

Evaluation

Please use Cityscapes' official evaluation tool evalPixelLevelSemanticLabeling from the Cityscapes scripts to evaluate your results.

Citation

We hope you find our work useful. If you would like to acknowledge it in your project, please use the following citation:

@inproceedings{Araslanov:2021:DASAC,
  title     = {Self-supervised Augmentation Consistency for Adapting Semantic Segmentation},
  author    = {Araslanov, Nikita and Roth, Stefan},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2021}
}