Exploring Cross-Image Pixel Contrast for Semantic Segmentation

Last update: Jan 02, 2023

Overview

Exploring Cross-Image Pixel Contrast for Semantic Segmentation

Exploring Cross-Image Pixel Contrast for Semantic Segmentation,
Wenguan Wang, Tianfei Zhou, Fisher Yu, Jifeng Dai, Ender Konukoglu and Luc Van Gool
arXiv technical report (arXiv 2101.11939)

Abstract

Current semantic segmentation methods focus only on mining “local” context, i.e., dependencies between pixels within individual images, by context-aggregation modules (e.g., dilated convolution, neural attention) or structureaware optimization criteria (e.g., IoU-like loss). However, they ignore “global” context of the training data, i.e., rich semantic relations between pixels across different images. Inspired by the recent advance in unsupervised contrastive representation learning, we propose a pixel-wise contrastive framework for semantic segmentation in the fully supervised setting. The core idea is to enforce pixel embeddings belonging to a same semantic class to be more similar than embeddings from different classes. It raises a pixel-wise metric learning paradigm for semantic segmentation, by explicitly exploring the structures of labeled pixels, which are long ignored in the field. Our method can be effortlessly incorporated into existing segmentation frameworks without extra overhead during testing.

We experimentally show that, with famous segmentation models (i.e., DeepLabV3, HRNet, OCR) and backbones (i.e., ResNet, HRNet), our method brings consistent performance improvements across diverse datasets (i.e., Cityscapes, PASCALContext, COCO-Stuff).

Installation

This implementation is built on openseg.pytorch. Many thanks to the authors for the efforts.

Please follow the Getting Started for installation and dataset preparation.

Running

Cityscapes

Train DeepLabV3

bash scripts/cityscapes/deeplab/run_r_101_d_8_deeplabv3_train_contrast.sh train 'resnet101-deeplabv3-contrast'

Features (in progress)

t-SNE Visualization

Pixel-wise Cross-Entropy Loss

Pixel-wise Contrastive Learning Objective

Citation

@article{wang2021exploring,
  title   = {Exploring Cross-Image Pixel Contrast for Semantic Segmentation},
  author  = {Wang, Wenguan and Zhou, Tianfei and Yu, Fisher and Dai, Jifeng and Konukoglu, Ender and Van Gool, Luc},
  journal = {arXiv preprint arXiv:2101.11939},
  year    = {2021}
}

Exploring Cross-Image Pixel Contrast for Semantic Segmentation

Related tags

Overview

Exploring Cross-Image Pixel Contrast for Semantic Segmentation

Abstract

Installation

Running

Cityscapes

Features (in progress)

t-SNE Visualization

Citation

Owner

Tianfei Zhou

High performance distributed framework for training deep learning recommendation models based on PyTorch.

Real Time Object Detection and Classification using Yolo Algorithm.

The implementation of "Bootstrapping Semantic Segmentation with Regional Contrast".

The official start-up code for paper "FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark."

SCALoss: Side and Corner Aligned Loss for Bounding Box Regression (AAAI2022).

H&M Fashion Image similarity search with Weaviate and DocArray

[ICCV2021] Official code for "Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition"

This repository builds a basic vision transformer from scratch so that one beginner can understand the theory of vision transformer.

Implementation for paper: Self-Regulation for Semantic Segmentation

CNN Based Meta-Learning for Noisy Image Classification and Template Matching

Learned image compression

PowerGridworld: A Framework for Multi-Agent Reinforcement Learning in Power Systems

Fully Convlutional Neural Networks for state-of-the-art time series classification

PySLM Python Library for Selective Laser Melting and Additive Manufacturing

Powerful and efficient Computer Vision Annotation Tool (CVAT)

Code for Graph-to-Tree Learning for Solving Math Word Problems (ACL 2020)

Code to use Augmented Shapiro Wilks Stopping, as well as code for the paper "Statistically Signifigant Stopping of Neural Network Training"

Repo for the paper "DiLBERT: Cheap Embeddings for Disease Related Medical NLP"

Disentangled Lifespan Face Synthesis

The source code of the ICCV2021 paper "PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering"