ICCV2021 - Mining Contextual Information Beyond Image for Semantic Segmentation

Introduction

The official repository for "Mining Contextual Information Beyond Image for Semantic Segmentation". Our full code has been merged into sssegmentation.

Abstract

This paper studies the context aggregation problem in semantic image segmentation. Existing research focuses on improving pixel representations by aggregating contextual information within individual images. Though impressive, these methods neglect the significance of the representations of pixels of the corresponding class beyond the input image. To address this, this paper proposes mining contextual information beyond individual images to further augment the pixel representations. We first set up a feature memory module, which is updated dynamically during training, to store the dataset-level representations of the various categories. Then, we learn the class probability distribution of each pixel representation under the supervision of the ground-truth segmentation. Finally, the representation of each pixel is augmented by aggregating the dataset-level representations according to the corresponding class probability distribution. Furthermore, by utilizing the stored dataset-level representations, we also propose a representation consistent learning strategy that helps the classification head better address intra-class compactness and inter-class dispersion. The proposed method can be effortlessly incorporated into existing segmentation frameworks (e.g., FCN, PSPNet, OCRNet and DeepLabV3) and brings consistent performance improvements. Mining contextual information beyond the image allows us to report state-of-the-art performance on various benchmarks: ADE20K, LIP, Cityscapes and COCO-Stuff.
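
The three steps in the abstract (a per-class feature memory, a per-pixel class probability distribution, and probability-weighted aggregation) can be illustrated with a small pure-Python sketch. This is not the repository's implementation: the moving-average update rule, the `momentum` value, the concatenation used for augmentation, and all function names are assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def update_memory(memory, features, labels, momentum=0.1):
    """Move each class slot toward the mean feature of that class's pixels.

    Assumption: a plain moving average; the paper's exact update may differ.
    """
    dim = len(memory[0])
    for cls in set(labels):
        members = [f for f, y in zip(features, labels) if y == cls]
        mean = [sum(f[d] for f in members) / len(members) for d in range(dim)]
        memory[cls] = [(1 - momentum) * old + momentum * new
                       for old, new in zip(memory[cls], mean)]


def aggregate(memory, probs):
    """Probability-weighted sum of the dataset-level class representations."""
    dim = len(memory[0])
    return [sum(p * memory[c][d] for c, p in enumerate(probs))
            for d in range(dim)]


def augment(feature, memory, logits):
    """Concatenate a pixel feature with its aggregated context (assumed fusion)."""
    return feature + aggregate(memory, softmax(logits))


# Toy data: 2 classes, 2-dimensional features, 3 labelled pixels.
memory = [[0.0, 0.0], [0.0, 0.0]]
features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
labels = [0, 0, 1]

update_memory(memory, features, labels, momentum=0.5)
augmented = augment(features[0], memory, logits=[0.0, 0.0])
```

In a real segmentation network the features would be tensors produced by the backbone and the logits would come from a classification head trained against the ground-truth segmentation; the list-based version above only shows the data flow.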

Framework

[Figure: framework overview]

Performance

COCOStuff-10k

| Model | Backbone | Crop Size | Schedule | Train/Eval Set | mIoU/mIoU (ms+flip) | Download |
|---|---|---|---|---|---|---|
| DeepLabV3 | R-50-D8 | 512x512 | LR/POLICY/BS/EPOCH: 0.001/poly/16/110 | train/test | 38.84%/39.68% | model, log |
| DeepLabV3 | R-101-D8 | 512x512 | LR/POLICY/BS/EPOCH: 0.001/poly/16/110 | train/test | 39.84%/41.49% | model, log |
| DeepLabV3 | S-101-D8 | 512x512 | LR/POLICY/BS/EPOCH: 0.001/poly/32/150 | train/test | 41.18%/42.15% | model, log |
| DeepLabV3 | HRNetV2p-W48 | 512x512 | LR/POLICY/BS/EPOCH: 0.001/poly/16/110 | train/test | 39.77%/41.35% | model, log |
| DeepLabV3 | ViT-Large | 512x512 | LR/POLICY/BS/EPOCH: 0.001/poly/16/110 | train/test | 44.01%/45.23% | model, log |
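
The Schedule column lists a base learning rate, a decay policy, a batch size and an epoch count. The `poly` policy usually refers to polynomial decay of the learning rate; a minimal sketch follows. The exponent `power=0.9`, the `min_lr` floor, the per-step granularity and the function name are assumptions, since the tables do not state them.

```python
def poly_lr(base_lr, step, total_steps, power=0.9, min_lr=0.0):
    """Polynomial decay from base_lr toward min_lr over total_steps.

    Assumption: power=0.9 is a common default, not a value taken from the tables.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that out-of-range steps do not produce negative bases.
    progress = min(max(step, 0), total_steps) / total_steps
    return (base_lr - min_lr) * (1.0 - progress) ** power + min_lr


# Example with the base rate 0.001 from the first table row and a
# hypothetical run length of 1000 steps.
start = poly_lr(0.001, 0, 1000)
middle = poly_lr(0.001, 500, 1000)
end = poly_lr(0.001, 1000, 1000)
```

The rate starts at the base value, decreases monotonically, and reaches `min_lr` at the final step.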

ADE20K

| Model | Backbone | Crop Size | Schedule | Train/Eval Set | mIoU/mIoU (ms+flip) | Download |
|---|---|---|---|---|---|---|
| DeepLabV3 | R-50-D8 | 512x512 | LR/POLICY/BS/EPOCH: 0.01/poly/16/130 | train/val | 44.39%/45.95% | model, log |
| DeepLabV3 | R-101-D8 | 512x512 | LR/POLICY/BS/EPOCH: 0.01/poly/16/130 | train/val | 45.66%/47.22% | model, log |
| DeepLabV3 | S-101-D8 | 512x512 | LR/POLICY/BS/EPOCH: 0.004/poly/16/180 | train/val | 46.63%/47.36% | model, log |
| DeepLabV3 | HRNetV2p-W48 | 512x512 | LR/POLICY/BS/EPOCH: 0.004/poly/16/180 | train/val | 45.79%/47.34% | model, log |
| DeepLabV3 | ViT-Large | 512x512 | LR/POLICY/BS/EPOCH: 0.01/poly/16/130 | train/val | 49.73%/50.99% | model, log |

Cityscapes

| Model | Backbone | Crop Size | Schedule | Train/Eval Set | mIoU (ms+flip) | Download |
|---|---|---|---|---|---|---|
| DeepLabV3 | R-50-D8 | 512x1024 | LR/POLICY/BS/EPOCH: 0.01/poly/16/440 | trainval/test | 79.90% | model, log |
| DeepLabV3 | R-101-D8 | 512x1024 | LR/POLICY/BS/EPOCH: 0.01/poly/16/440 | trainval/test | 82.03% | model, log |
| DeepLabV3 | S-101-D8 | 512x1024 | LR/POLICY/BS/EPOCH: 0.01/poly/16/500 | trainval/test | 81.59% | model, log |
| DeepLabV3 | HRNetV2p-W48 | 512x1024 | LR/POLICY/BS/EPOCH: 0.01/poly/16/500 | trainval/test | 82.55% | model, log |

LIP

| Model | Backbone | Crop Size | Schedule | Train/Eval Set | mIoU/mIoU (flip) | Download |
|---|---|---|---|---|---|---|
| DeepLabV3 | R-50-D8 | 473x473 | LR/POLICY/BS/EPOCH: 0.01/poly/32/150 | train/val | 53.73%/54.08% | model, log |
| DeepLabV3 | R-101-D8 | 473x473 | LR/POLICY/BS/EPOCH: 0.01/poly/32/150 | train/val | 55.02%/55.42% | model, log |
| DeepLabV3 | S-101-D8 | 473x473 | LR/POLICY/BS/EPOCH: 0.007/poly/40/150 | train/val | 56.21%/56.34% | model, log |
| DeepLabV3 | HRNetV2p-W48 | 473x473 | LR/POLICY/BS/EPOCH: 0.007/poly/40/150 | train/val | 56.40%/56.99% | model, log |

Citation

If this code is useful for your research, please consider citing:

@article{jin2021mining,
  title={Mining Contextual Information Beyond Image for Semantic Segmentation},
  author={Jin, Zhenchao and Gong, Tao and Yu, Dongdong and Chu, Qi and Wang, Jian and Wang, Changhu and Shao, Jie},
  journal={arXiv preprint arXiv:2108.11819},
  year={2021}
}