[CVPR 2022] Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels

Overview

Official PyTorch implementation of Semi-Supervised Semantic Segmentation Using Unreliable Pseudo Labels, CVPR 2022.

Please refer to our project page for qualitative results.

Abstract. The crux of semi-supervised semantic segmentation is to assign adequate pseudo-labels to the pixels of unlabeled images. A common practice is to select highly confident predictions as pseudo ground-truth, but this leaves most pixels unused due to their unreliability. We argue that every pixel matters to model training, even if its prediction is ambiguous. Intuitively, an unreliable prediction may get confused among the top classes (i.e., those with the highest probabilities); however, it should be confident that the pixel does not belong to the remaining classes. Hence, such a pixel can be convincingly treated as a negative sample for those most unlikely categories. Based on this insight, we develop an effective pipeline to make sufficient use of unlabeled data. Concretely, we separate reliable and unreliable pixels via the entropy of predictions, push each unreliable pixel into a category-wise queue of negative samples, and manage to train the model with all candidate pixels. Considering the training evolution, where predictions become more and more accurate, we adaptively adjust the threshold for the reliable-unreliable partition. Experimental results on various benchmarks and training settings demonstrate the superiority of our approach over the state-of-the-art alternatives.
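The partition can be illustrated with a short PyTorch sketch. This is a minimal illustration of the entropy-based reliable/unreliable split described above, not the repository's actual code; the function name, the alpha_t schedule, and the tensor shapes are assumptions.

import torch

def split_reliable_unreliable(prob, alpha_t):
    # prob: softmax probabilities, shape (B, C, H, W)
    # alpha_t: fraction of pixels regarded as unreliable at step t
    #          (the paper decays this proportion as training progresses)
    entropy = -(prob * torch.log(prob + 1e-10)).sum(dim=1)        # (B, H, W)
    threshold = torch.quantile(entropy.flatten(), 1.0 - alpha_t)  # adaptive cut
    reliable = entropy <= threshold   # low entropy: kept as pseudo ground-truth
    unreliable = ~reliable            # high entropy: negatives for unlikely classes
    return reliable, unreliable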

Results

PASCAL VOC 2012

Labeled images are selected from the train set of the original VOC dataset, 1,464 images in total. The remaining 9,118 images are all treated as unlabeled.

For instance, 1/2 (732) means 732 labeled images; the remaining 9,850 (9,118 + 732) are unlabeled.

Method 1/16 (92) 1/8 (183) 1/4 (366) 1/2 (732) Full (1464)
SupOnly 45.77 54.92 65.88 71.69 72.50
U2PL (w/ CutMix) 67.98 69.15 73.66 76.16 79.49

Labeled images are selected from the train set of augmented VOC, 10,582 images in total.

Method 1/16 (662) 1/8 (1323) 1/4 (2646) 1/2 (5291)
SupOnly 67.87 71.55 75.80 77.13
U2PL (w/ CutMix) 77.21 79.01 79.30 80.50

Cityscapes

Labeled images are selected from the train set, 2,975 images in total.

Method 1/16 (186) 1/8 (372) 1/4 (744) 1/2 (1488)
SupOnly 65.74 72.53 74.43 77.83
U2PL (w/ CutMix) 70.30 74.37 76.47 79.05
U2PL (w/ AEL) 74.90 76.48 78.51 79.12

Checkpoints

  • Models on Cityscapes with AEL (ResNet101-DeepLabv3+)
1/16 (186): Google Drive | Baidu Drive (Fetch Code: rrpd)
1/8 (372): Google Drive | Baidu Drive (Fetch Code: welw)
1/4 (744): Google Drive | Baidu Drive (Fetch Code: qwcd)
1/2 (1488): Google Drive | Baidu Drive (Fetch Code: 4p8r)

Installation

git clone https://github.com/Haochen-Wang409/U2PL.git && cd U2PL
conda create -n u2pl python=3.6.9
conda activate u2pl
pip install -r requirements.txt
pip install torch==1.8.1+cu102 torchvision==0.9.1+cu102 -f https://download.pytorch.org/whl/torch_stable.html
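
Optionally, a quick sanity check (run in Python) that the pinned build is installed and sees the GPU:

import torch
print(torch.__version__)          # expected: 1.8.1+cu102
print(torch.cuda.is_available())  # expected: True on a CUDA-capable machine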

Usage

U2PL is evaluated on both the Cityscapes and PASCAL VOC 2012 datasets.

Prepare Data

For Cityscapes

Download "leftImg8bit_trainvaltest.zip" and "gtFine_trainvaltest.zip" from: https://www.cityscapes-dataset.com/downloads/.

Next, unzip the files into the folder data and arrange the directory structure as follows:

data/cityscapes
├── gtFine
│   ├── test
│   ├── train
│   └── val
└── leftImg8bit
    ├── test
    ├── train
    └── val

For PASCAL VOC 2012

Download "VOCtrainval_11-May-2012.tar" from: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar.

Then unzip the files into the folder data and arrange the directory structure as follows:

data/VOC2012
├── Annotations
├── ImageSets
├── JPEGImages
├── SegmentationClass
├── SegmentationClassAug
└── SegmentationObject

Finally, the structure of the directory data should be as follows:

data
├── cityscapes
│   ├── gtFine
│   └── leftImg8bit
├── splits
│   ├── cityscapes
│   └── pascal
└── VOC2012
    ├── Annotations
    ├── ImageSets
    ├── JPEGImages
    ├── SegmentationClass
    ├── SegmentationClassAug
    └── SegmentationObject

Prepare Pretrained Backbone

Before training, please download ResNet101 pretrained on ImageNet-1K from one of the following:

After that, modify model_urls in semseg/models/resnet.py to point to </path/to/resnet101.pth>.
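
For reference, the edit might look like the following. This is a hypothetical sketch: the exact keys of model_urls depend on the code in semseg/models/resnet.py.

# semseg/models/resnet.py (hypothetical sketch; only the resnet101 entry changes)
model_urls = {
    # replace the download URL with the local path to the checkpoint
    "resnet101": "/path/to/resnet101.pth",
}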

Train a Fully-Supervised Model

For instance, we can train a model on PASCAL VOC 2012 with only 1,464 labeled images for supervision by:

cd experiments/pascal/1464/suponly
# use torch.distributed.launch
sh train.sh <num_gpu> <port>

# or use slurm
# sh slurm_train.sh <num_gpu> <port> <partition>

Or for Cityscapes, a model supervised by only 744 labeled images can be trained by:

cd experiments/cityscapes/744/suponly
# use torch.distributed.launch
sh train.sh <num_gpu> <port>

# or use slurm
# sh slurm_train.sh <num_gpu> <port> <partition>

After training, the model should be evaluated by

sh eval.sh

Train a Semi-Supervised Model

We can train a model on PASCAL VOC 2012 with 1,464 labeled images and 9,118 unlabeled images by:

cd experiments/pascal/1464/ours
# use torch.distributed.launch
sh train.sh <num_gpu> <port>

# or use slurm
# sh slurm_train.sh <num_gpu> <port> <partition>

Or for Cityscapes, a model supervised by 744 labeled images and 2,231 unlabeled images can be trained by:

cd experiments/cityscapes/744/ours
# use torch.distributed.launch
sh train.sh <num_gpu> <port>

# or use slurm
# sh slurm_train.sh <num_gpu> <port> <partition>

After training, the model should be evaluated by

sh eval.sh

Train a Semi-Supervised Model on Cityscapes with AEL

First, you should switch the branch:

git checkout with_AEL

Then, we can train a model supervised by 744 labeled images and 2,231 unlabeled images by:

cd experiments/city_744
# use torch.distributed.launch
sh train.sh <num_gpu> <port>

# or use slurm
# sh slurm_train.sh <num_gpu> <port> <partition>

After training, the model should be evaluated by

sh eval.sh

Note

<num_gpu> means the number of GPUs for training.

To reproduce our results, we recommend the following numbers of GPUs:

  • Cityscapes: 4 for SupOnly and 8 for semi-supervised training
  • PASCAL VOC 2012: 2 for SupOnly and 4 for semi-supervised training

Otherwise, scale the lr in config.yaml linearly with the number of GPUs (e.g., to train a SupOnly model on Cityscapes with 8 GPUs instead of the recommended 4, change the lr to 0.02).
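
In other words, lr_new = lr_base × num_gpus / base_gpus. A tiny sketch, assuming the Cityscapes SupOnly default of 4 GPUs with lr = 0.01 (inferred from the 8-GPU example above):

base_gpus, base_lr = 4, 0.01         # assumed Cityscapes SupOnly defaults
num_gpus = 8
lr = base_lr * num_gpus / base_gpus  # -> 0.02, matching the example above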

If you want to train a model on another split, modify data_list and n_sup in config.yaml accordingly.

Due to the non-deterministic GPU implementation of torch.nn.functional.interpolate with mode="bilinear", segmentation results will not be identical across runs, even with a fixed random seed.

Therefore, we recommend running each experiment 3 times and reporting the average performance.

License

This project is released under the Apache 2.0 license.

Acknowledgement

The contrastive learning loss and the strong data augmentations (CutMix, CutOut, and ClassMix) are borrowed from ReCo. Our U2PL combined with AEL is reproduced on the with_AEL branch.

Thanks a lot for their great work!

Citation

@inproceedings{wang2022semi,
    title={Semi-Supervised Semantic Segmentation Using Unreliable Pseudo Labels},
    author={Wang, Yuchao and Wang, Haochen and Shen, Yujun and Fei, Jingjing and Li, Wei and Jin, Guoqiang and Wu, Liwei and Zhao, Rui and Le, Xinyi},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2022}
}
