[CVPR'22] Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast

Related tags

Deep Learningwseg
Overview

wseg

Overview

The Pytorch implementation of Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast.

[arXiv]

Though image-level weakly supervised semantic segmentation (WSSS) has achieved great progress with Class Activation Maps (CAMs) as the cornerstone, the large supervision gap between classification and segmentation still hampers the model to generate more complete and precise pseudo masks for segmentation. In this study, we propose weakly-supervised pixel-to-prototype contrast that can provide pixel-level supervisory signals to narrow the gap. Guided by two intuitive priors, our method is executed across different views and within per single view of an image, aiming to impose cross-view feature semantic consistency regularization and facilitate intra(inter)-class compactness(dispersion) of the feature space. Our method can be seamlessly incorporated into existing WSSS models without any changes to the base networks and does not incur any extra inference burden. Extensive experiments manifest that our method consistently improves two strong baselines by large margins, demonstrating the effectiveness.

图片

Prerequisites

Preparation

  1. Clone this repository.
  2. Data preparation. Download PASCAL VOC 2012 devkit following instructions in http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#devkit. It is suggested to make a soft link toward downloaded dataset. Then download the annotation of VOC 2012 trainaug set (containing 10582 images) from https://www.dropbox.com/s/oeu149j8qtbs1x0/SegmentationClassAug.zip?dl=0 and place them all as VOC2012/SegmentationClassAug/xxxxxx.png. Download the image-level labels cls_label.npy from https://github.com/YudeWang/SEAM/tree/master/voc12/cls_label.npy and place it into voc12/, or you can generate it by yourself.
  3. Download ImageNet pretrained backbones. We use ResNet-38 for initial seeds generation and ResNet-101 for segmentation training. Download pretrained ResNet-38 from https://drive.google.com/file/d/15F13LEL5aO45JU-j45PYjzv5KW5bn_Pn/view. The ResNet-101 can be downloaded from https://download.pytorch.org/models/resnet101-5d3b4d8f.pth.

Model Zoo

Download the trained models and category performance below.

baseline model train(mIoU) val(mIoU) test (mIoU) checkpoint category performance
SEAM contrast 61.5 58.4 - [download]
affinitynet 69.2 - [download]
deeplabv1 - 67.7* 67.4* [download] [link]
EPS contrast 70.5 - - [download]
deeplabv1 - 72.3* 73.5* [download] [link]
deeplabv2 - 72.6* 73.6* [download] [link]

* indicates using densecrf.

The training results including initial seeds, intermediate products and pseudo masks can be found here.

Usage

Step1: Initial Seed Generation with Contrastive Learning.

  1. Contrast train.

    python contrast_train.py  \
      --weights $pretrained_model \
      --voc12_root VOC2012 \
      --session_name $your_session_name \
      --batch_size $bs
    
  2. Contrast inference.

    Download the pretrained model from https://1drv.ms/u/s!AgGL9MGcRHv0mQSKoJ6CDU0cMjd2?e=dFlHgN or train from scratch, set --weights and then run:

    python contrast_infer.py \
      --weights $contrast_weight \ 
      --infer_list $[voc12/val.txt | voc12/train.txt | voc12/train_aug.txt] \
      --out_cam $your_cam_npy_dir \
      --out_cam_pred $your_cam_png_dir \
      --out_crf $your_crf_png_dir
    
  3. Evaluation.

    Following SEAM, we recommend you to use --curve to select an optimial background threshold.

    python eval.py \
      --list VOC2012/ImageSets/Segmentation/$[val.txt | train.txt] \
      --predict_dir $your_result_dir \
      --gt_dir VOC2012/SegmentationClass \
      --comment $your_comments \
      --type $[npy | png] \
      --curve True
    

Step2: Refine with AffinityNet.

  1. Preparation.

    Prepare the files (la_crf_dir and ha_crf_dir) needed for training AffinityNet. You can also use our processed crf outputs with alpha=4/8 from here.

    python aff_prepare.py \
      --voc12_root VOC2012 \
      --cam_dir $your_cam_npy_dir \
      --out_crf $your_crf_alpha_dir 
    
  2. AffinityNet train.

    python aff_train.py \
      --weights $pretrained_model \
      --voc12_root VOC2012 \
      --la_crf_dir $your_crf_dir_4.0 \
      --ha_crf_dir $your_crf_dir_8.0 \
      --session_name $your_session_name
    
  3. Random walk propagation & Evaluation.

    Use the trained AffinityNet to conduct RandomWalk for refining the CAMs from Step1. Trained model can be found in Model Zoo (https://1drv.ms/u/s!AgGL9MGcRHv0mQXi0SSkbUc2sl8o?e=AY7AzX).

    python aff_infer.py \
      --weights $aff_weights \
      --voc12_root VOC2012 \
      --infer_list $[voc12/val.txt | voc12/train.txt] \
      --cam_dir $your_cam_dir \
      --out_rw $your_rw_dir
    
  4. Pseudo mask generation. Generate the pseudo masks for training the DeepLab Model. Dense CRF is used in this step.

    python aff_infer.py \
      --weights $aff_weights \
      --infer_list voc12/trainaug.txt \
      --cam_dir $your_cam_dir \
      --voc12_root VOC2012 \
      --out_rw $your_rw_dir
    

Step3: Segmentation training with DeepLab

  1. Training.

    we use the segmentation repo from https://github.com/YudeWang/semantic-segmentation-codebase. Training and inference codes are available in segmentation/experiment/. Set DATA_PSEUDO_GT: $your_pseudo_label_path in config.py. Then run:

    python train.py
    
  2. Inference.

    Check test configration in config.py (ckpt path, trained model: https://1drv.ms/u/s!AgGL9MGcRHv0mQgpb3QawPCsKPe9?e=4vly0H) and val/test set selection in test.py. Then run:

    python test.py
    

    For test set evaluation, you need to download test set images and submit the segmentation results to the official voc server.

For integrating our approach into the EPS model, you can change branch to EPS via:

git checkout eps

Then conduct train or inference following instructions above. Segmentation training follows the same repo in segmentation. Trained models & processed files can be download in Model Zoo.

Acknowledgements

We sincerely thank Yude Wang for his great work SEAM in CVPR'20. We borrow codes heavly from his repositories SEAM and Segmentation. We also thank Seungho Lee for his EPS and jiwoon-ahn for his AffinityNet and IRN. Without them, we could not finish this work.

Citation

@inproceedings{du2021weakly,
  title={Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast},
  author={Du, Ye and Fu, Zehua and Liu, Qingjie and Wang, Yunhong},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2022}
}
Owner
Ye Du
Ye Du
R-Drop: Regularized Dropout for Neural Networks

R-Drop: Regularized Dropout for Neural Networks R-drop is a simple yet very effective regularization method built upon dropout, by minimizing the bidi

756 Dec 27, 2022
Code accompanying the paper Shared Independent Component Analysis for Multi-subject Neuroimaging

ShICA Code accompanying the paper Shared Independent Component Analysis for Multi-subject Neuroimaging Install Move into the ShICA directory cd ShICA

8 Nov 07, 2022
PyTorch version repo for CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

Study-CSRNet-pytorch This is the PyTorch version repo for CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

0 Mar 01, 2022
Specificity-preserving RGB-D Saliency Detection

Specificity-preserving RGB-D Saliency Detection Authors: Tao Zhou, Huazhu Fu, Geng Chen, Yi Zhou, Deng-Ping Fan, and Ling Shao. 1. Preface This reposi

Tao Zhou 35 Jan 08, 2023
Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization

Hybrid solving process for combinatorial optimization problems Combinatorial optimization has found applications in numerous fields, from aerospace to

117 Dec 13, 2022
Implementation of: "Exploring Randomly Wired Neural Networks for Image Recognition"

RandWireNN Unofficial PyTorch Implementation of: Exploring Randomly Wired Neural Networks for Image Recognition. Results Validation result on Imagenet

Seung-won Park 684 Nov 02, 2022
Flexible-CLmser: Regularized Feedback Connections for Biomedical Image Segmentation

Flexible-CLmser: Regularized Feedback Connections for Biomedical Image Segmentation The skip connections in U-Net pass features from the levels of enc

Boheng Cao 1 Dec 29, 2021
Hamiltonian Dynamics with Non-Newtonian Momentum for Rapid Sampling

Hamiltonian Dynamics with Non-Newtonian Momentum for Rapid Sampling Code for the paper: Greg Ver Steeg and Aram Galstyan. "Hamiltonian Dynamics with N

Greg Ver Steeg 25 Mar 14, 2022
Hierarchical Memory Matching Network for Video Object Segmentation (ICCV 2021)

Hierarchical Memory Matching Network for Video Object Segmentation Hongje Seong, Seoung Wug Oh, Joon-Young Lee, Seongwon Lee, Suhyeon Lee, Euntai Kim

Hongje Seong 72 Dec 14, 2022
A Simple Example for Imitation Learning with Dataset Aggregation (DAGGER) on Torcs Env

Imitation Learning with Dataset Aggregation (DAGGER) on Torcs Env This repository implements a simple algorithm for imitation learning: DAGGER. In thi

Hao 66 Nov 23, 2022
BBScan py3 - BBScan py3 With Python

BBScan_py3 This repository is forked from lijiejie/BBScan 1.5. I migrated the fo

baiyunfei 12 Dec 30, 2022
Collection of Docker images for ML/DL and video processing projects

Collection of Docker images for ML/DL and video processing projects. Overview of images Three types of images differ by tag postfix: base: Python with

OSAI 87 Nov 22, 2022
A PyTorch implementation of "CoAtNet: Marrying Convolution and Attention for All Data Sizes".

CoAtNet Overview This is a PyTorch implementation of CoAtNet specified in "CoAtNet: Marrying Convolution and Attention for All Data Sizes", arXiv 2021

Justin Wu 268 Jan 07, 2023
The PyTorch implementation of Directed Graph Contrastive Learning (DiGCL), NeurIPS-2021

Directed Graph Contrastive Learning Paper | Poster | Supplementary The PyTorch implementation of Directed Graph Contrastive Learning (DiGCL). In this

Tong Zekun 28 Jan 08, 2023
MLPs for Vision and Langauge Modeling (Coming Soon)

MLP Architectures for Vision-and-Language Modeling: An Empirical Study MLP Architectures for Vision-and-Language Modeling: An Empirical Study (Code wi

Yixin Nie 27 May 09, 2022
ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation.

ENet This work has been published in arXiv: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation. Packages: train contains too

e-Lab 344 Nov 21, 2022
Based on the given clinical dataset, Predict whether the patient having Heart Disease or Not having Heart Disease

Heart_Disease_Classification Based on the given clinical dataset, Predict whether the patient having Heart Disease or Not having Heart Disease Dataset

Ashish 1 Jan 30, 2022
PyTorch implementation of "PatchGame: Learning to Signal Mid-level Patches in Referential Games" to appear in NeurIPS 2021

PatchGame: Learning to Signal Mid-level Patches in Referential Games This repository is the official implementation of the paper - "PatchGame: Learnin

Kamal Gupta 22 Mar 16, 2022
StableSims is an open-source project aimed at simulating MakerDAO's Dai stablecoin system

StableSims is an open-source project aimed at simulating MakerDAO's Dai stablecoin system, initially used for researching optimal incentive parameters for Liquidations 2.0.

Blockchain at Berkeley 52 Nov 21, 2022
Residual Dense Net De-Interlace Filter (RDNDIF)

Residual Dense Net De-Interlace Filter (RDNDIF) Work in progress deep de-interlacer filter. It is based on the architecture proposed by Bernasconi et

Louis 7 Feb 15, 2022