[CVPR'22] Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast

Overview

The PyTorch implementation of Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast.

[arXiv]

Though image-level weakly supervised semantic segmentation (WSSS) has achieved great progress with Class Activation Maps (CAMs) as the cornerstone, the large supervision gap between classification and segmentation still prevents the model from generating complete and precise pseudo masks for segmentation. In this study, we propose a weakly supervised pixel-to-prototype contrast that provides pixel-level supervisory signals to narrow this gap. Guided by two intuitive priors, our method is executed across different views and within each single view of an image, aiming to impose cross-view feature semantic consistency regularization and to facilitate intra-class compactness and inter-class dispersion in the feature space. Our method can be seamlessly incorporated into existing WSSS models without changing the base networks and incurs no extra inference burden. Extensive experiments show that our method consistently improves two strong baselines by large margins, demonstrating its effectiveness.
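
As a rough sketch of the core idea (not the authors' exact implementation), the snippet below builds class prototypes as CAM-weighted averages of pixel embeddings and pulls every pixel toward its top-scoring prototype with an InfoNCE-style loss. All tensor names and the temperature are illustrative, and the cross-view terms, background handling, and confidence filtering described in the paper are omitted.

    import torch
    import torch.nn.functional as F

    def pixel_to_prototype_contrast(feats, cams, temperature=0.1):
        """Toy sketch of pixel-to-prototype contrast (illustrative only).

        feats: (B, D, H, W) pixel embeddings from the backbone.
        cams:  (B, C, H, W) non-negative class activation maps.
        """
        f = F.normalize(feats, dim=1).flatten(2)             # (B, D, HW)
        w = cams.flatten(2)                                  # (B, C, HW)

        # Prototype per class: CAM-weighted average of pixel embeddings.
        protos = torch.einsum('bcn,bdn->cd', w, f)
        protos = protos / (w.sum(dim=(0, 2))[:, None] + 1e-5)
        protos = F.normalize(protos, dim=1)                  # (C, D)

        # Pixel-to-prototype similarities; pseudo labels from the CAM argmax.
        sim = torch.einsum('bdn,cd->bcn', f, protos) / temperature
        labels = w.argmax(dim=1)                             # (B, HW)
        return F.cross_entropy(sim, labels)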


Prerequisites

Preparation

  1. Clone this repository.
  2. Data preparation. Download the PASCAL VOC 2012 devkit following the instructions at http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#devkit. It is suggested to make a soft link pointing to the downloaded dataset. Then download the annotations of the VOC 2012 trainaug set (containing 10582 images) from https://www.dropbox.com/s/oeu149j8qtbs1x0/SegmentationClassAug.zip?dl=0 and place them as VOC2012/SegmentationClassAug/xxxxxx.png. Download the image-level labels cls_label.npy from https://github.com/YudeWang/SEAM/tree/master/voc12/cls_label.npy and place it into voc12/, or generate it yourself (see the sketch after this list).
  3. Download ImageNet-pretrained backbones. We use ResNet-38 for initial seed generation and ResNet-101 for segmentation training. Download the pretrained ResNet-38 from https://drive.google.com/file/d/15F13LEL5aO45JU-j45PYjzv5KW5bn_Pn/view. ResNet-101 can be downloaded from https://download.pytorch.org/models/resnet101-5d3b4d8f.pth.
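
If you choose to generate cls_label.npy yourself, the sketch below shows one plausible way. It assumes the expected format is the {image_id: 20-dim 0/1 vector} dict used by SEAM and that the list file holds bare image ids, one per line; verify both assumptions against the repo before relying on it.

    import os
    import numpy as np
    import xml.etree.ElementTree as ET

    # Standard VOC 2012 foreground class order (background excluded).
    CLASSES = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car',
               'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike',
               'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']

    def make_cls_labels(voc12_root, list_file='voc12/train_aug.txt'):
        """Build an {image_id: 20-dim 0/1 vector} dict from VOC XML annotations."""
        labels = {}
        with open(list_file) as f:
            ids = [line.strip() for line in f if line.strip()]
        for img_id in ids:
            xml_path = os.path.join(voc12_root, 'Annotations', img_id + '.xml')
            vec = np.zeros(len(CLASSES), dtype=np.float32)
            for obj in ET.parse(xml_path).getroot().findall('object'):
                vec[CLASSES.index(obj.find('name').text)] = 1.0
            labels[img_id] = vec
        np.save('voc12/cls_label.npy', labels)  # load later with allow_pickle=True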

Model Zoo

Download the trained models and category performance below.

baseline  model        train (mIoU)  val (mIoU)  test (mIoU)  checkpoint  category performance
SEAM      contrast     61.5          58.4        -            [download]
          affinitynet  69.2          -           -            [download]
          deeplabv1    -             67.7*       67.4*        [download]  [link]
EPS       contrast     70.5          -           -            [download]
          deeplabv1    -             72.3*       73.5*        [download]  [link]
          deeplabv2    -             72.6*       73.6*        [download]  [link]

* indicates results post-processed with dense CRF.

The training results, including initial seeds, intermediate outputs, and pseudo masks, can be found here.

Usage

Step 1: Initial Seed Generation with Contrastive Learning.

  1. Contrastive training.

    python contrast_train.py  \
      --weights $pretrained_model \
      --voc12_root VOC2012 \
      --session_name $your_session_name \
      --batch_size $bs
    
  2. Contrastive inference.

    Download the pretrained model from https://1drv.ms/u/s!AgGL9MGcRHv0mQSKoJ6CDU0cMjd2?e=dFlHgN or train it from scratch, set --weights accordingly, and then run:

    python contrast_infer.py \
      --weights $contrast_weight \
      --infer_list $[voc12/val.txt | voc12/train.txt | voc12/train_aug.txt] \
      --out_cam $your_cam_npy_dir \
      --out_cam_pred $your_cam_png_dir \
      --out_crf $your_crf_png_dir
    
  3. Evaluation.

    Following SEAM, we recommend using --curve to select an optimal background threshold.

    python eval.py \
      --list VOC2012/ImageSets/Segmentation/$[val.txt | train.txt] \
      --predict_dir $your_result_dir \
      --gt_dir VOC2012/SegmentationClass \
      --comment $your_comments \
      --type $[npy | png] \
      --curve True
    
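For intuition on what --curve does: it sweeps a constant background score over the stored CAMs and keeps the threshold with the best mIoU. Below is a self-contained sketch of that selection loop; the per-image dict format and the 21-class VOC layout are assumptions, and eval.py remains the authoritative version.

    import numpy as np

    def mean_iou(conf):
        """Mean IoU from a (K, K) confusion matrix (rows: gt, cols: pred)."""
        inter = np.diag(conf)
        union = conf.sum(0) + conf.sum(1) - inter
        return np.nanmean(inter / np.maximum(union, 1))

    def sweep_bg_threshold(cam_list, gt_list, n_classes=21):
        """cam_list: per-image dicts {fg_class_idx: (H, W) CAM in [0, 1]};
        gt_list: (H, W) ground-truth label maps with 255 as void."""
        best_t, best_miou = 0.0, -1.0
        for t in np.arange(0.05, 0.60, 0.05):
            conf = np.zeros((n_classes, n_classes), dtype=np.int64)
            for cams, gt in zip(cam_list, gt_list):
                score = np.zeros((n_classes,) + gt.shape, dtype=np.float32)
                score[0] = t                       # constant background score
                for c, m in cams.items():
                    score[c + 1] = m               # foreground classes are 1..20
                pred = score.argmax(0)
                valid = gt < n_classes             # drop the 255 void label
                conf += np.bincount(
                    n_classes * gt[valid].astype(np.int64) + pred[valid],
                    minlength=n_classes ** 2).reshape(n_classes, n_classes)
            m = mean_iou(conf)
            if m > best_miou:
                best_t, best_miou = t, m
        return best_t, best_miou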

Step 2: Refinement with AffinityNet.

  1. Preparation.

    Prepare the files (la_crf_dir and ha_crf_dir) needed for training AffinityNet. You can also use our processed CRF outputs with alpha=4/8 from here. (A sketch of how alpha shapes the background score appears after this step.)

    python aff_prepare.py \
      --voc12_root VOC2012 \
      --cam_dir $your_cam_npy_dir \
      --out_crf $your_crf_alpha_dir 
    
  2. AffinityNet training (see the second sketch after this step for how the two CRF maps combine into training labels).

    python aff_train.py \
      --weights $pretrained_model \
      --voc12_root VOC2012 \
      --la_crf_dir $your_crf_dir_4.0 \
      --ha_crf_dir $your_crf_dir_8.0 \
      --session_name $your_session_name
    
  3. Random walk propagation & Evaluation.

    Use the trained AffinityNet to perform a random walk that refines the CAMs from Step 1 (a minimal sketch of the propagation appears after this step). The trained model can be found in the Model Zoo (https://1drv.ms/u/s!AgGL9MGcRHv0mQXi0SSkbUc2sl8o?e=AY7AzX).

    python aff_infer.py \
      --weights $aff_weights \
      --voc12_root VOC2012 \
      --infer_list $[voc12/val.txt | voc12/train.txt] \
      --cam_dir $your_cam_dir \
      --out_rw $your_rw_dir
    
  4. Pseudo mask generation. Generate the pseudo masks for training the DeepLab model. Dense CRF is used in this step.

    python aff_infer.py \
      --weights $aff_weights \
      --infer_list voc12/train_aug.txt \
      --cam_dir $your_cam_dir \
      --voc12_root VOC2012 \
      --out_rw $your_rw_dir
    
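About the alpha in step 1: following the convention in PSA/SEAM, the background score is derived from the CAMs as bg = (1 - max_c CAM_c)^alpha before dense CRF, so a low alpha (4.0) keeps the background score high and yields conservative, trustworthy foreground, while a high alpha (8.0) shrinks the background score, expands the foreground, and makes the remaining background trustworthy. A hedged sketch with pydensecrf (the pairwise hyper-parameters are common defaults, not necessarily the repo's):

    import numpy as np
    import pydensecrf.densecrf as dcrf
    from pydensecrf.utils import unary_from_softmax

    def crf_with_alpha(img, cams, alpha, n_iters=10):
        """img: (H, W, 3) uint8 RGB; cams: (C, H, W) foreground CAMs in [0, 1]."""
        # Background channel from the CAM maximum; alpha controls its weight.
        bg = np.power(1.0 - np.max(cams, axis=0, keepdims=True), alpha)
        probs = np.concatenate([bg, cams], axis=0)
        probs = probs / np.maximum(probs.sum(axis=0, keepdims=True), 1e-5)

        h, w = img.shape[:2]
        d = dcrf.DenseCRF2D(w, h, probs.shape[0])
        d.setUnaryEnergy(unary_from_softmax(probs))
        d.addPairwiseGaussian(sxy=3, compat=3)
        d.addPairwiseBilateral(sxy=80, srgb=13,
                               rgbim=np.ascontiguousarray(img), compat=10)
        q = np.array(d.inference(n_iters)).reshape(probs.shape[0], h, w)
        return q.argmax(axis=0)        # 0 = background, 1..C = foreground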
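
About step 2: conceptually, the affinity training labels come from combining the two CRF maps; foreground is trusted where the low-alpha map predicts it, background where even the high-alpha map still predicts it, and everything else is ignored. A simplified sketch mirroring the PSA convention (the function name is ours):

    def merge_crf_labels(la_map, ha_map, ignore_index=255):
        """la_map: (H, W) integer CRF labels with low alpha (conservative fg);
        ha_map: (H, W) integer CRF labels with high alpha (conservative bg)."""
        label = la_map.copy()
        label[la_map == 0] = ignore_index   # la's background is not trusted
        label[ha_map == 0] = 0              # ha's background is trusted
        return label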
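
About steps 3 and 4: the random walk multiplies a row-normalized affinity matrix into the flattened CAMs several times, diffusing activations between semantically affine pixels. A minimal dense sketch (the real code operates on downsampled features with a radius-limited sparse affinity; beta and t_steps are illustrative):

    import torch

    def random_walk(cam, aff, beta=8, t_steps=3):
        """cam: (C, H, W) class scores; aff: (HW, HW) affinities in [0, 1]."""
        c, h, w = cam.shape
        trans = aff ** beta                        # suppress weak affinities
        trans = trans / trans.sum(dim=1, keepdim=True).clamp(min=1e-5)
        v = cam.reshape(c, h * w)
        for _ in range(t_steps):
            v = v @ trans.t()                      # one propagation step
        return v.reshape(c, h, w)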

Step 3: Segmentation Training with DeepLab.

  1. Training.

    We use the segmentation codebase from https://github.com/YudeWang/semantic-segmentation-codebase. Training and inference code is available in segmentation/experiment/. Set DATA_PSEUDO_GT: $your_pseudo_label_path in config.py, then run:

    python train.py
    
  2. Inference.

    Check the test configuration in config.py (checkpoint path; the trained model is available at https://1drv.ms/u/s!AgGL9MGcRHv0mQgpb3QawPCsKPe9?e=4vly0H) and the val/test set selection in test.py, then run:

    python test.py
    

    For test set evaluation, you need to download the test set images and submit the segmentation results to the official VOC evaluation server.

To integrate our approach into the EPS model, switch to the eps branch via:

git checkout eps

Then run training or inference following the instructions above. Segmentation training uses the same codebase as in segmentation. Trained models and processed files can be downloaded from the Model Zoo.

Acknowledgements

We sincerely thank Yude Wang for his great work SEAM in CVPR'20. We borrow code heavily from his repositories SEAM and Segmentation. We also thank Seungho Lee for his EPS and jiwoon-ahn for his AffinityNet and IRN. Without them, we could not have finished this work.

Citation

@inproceedings{du2021weakly,
  title={Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast},
  author={Du, Ye and Fu, Zehua and Liu, Qingjie and Wang, Yunhong},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2022}
}