Official implementation of "SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers"

Last update: Dec 31, 2022

Overview

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers

Figure 1: Performance of SegFormer-B0 to SegFormer-B5.

Project page | Paper | Demo (Youtube) | Demo (Bilibili)

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers.
Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, and Ping Luo.
Technical Report 2021.

This repository contains the PyTorch training/evaluation code and the pretrained models for SegFormer.

SegFormer is a simple, efficient and powerful semantic segmentation method, as shown in Figure 1.

We use MMSegmentation v0.13.0 as the codebase.

Installation

For installation and data preparation, please refer to the guidelines in MMSegmentation v0.13.0.

Other requirements: pip install timm==0.3.2
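Because the timm requirement is pinned to an exact version, it can help to confirm what is installed before running anything. The following is a minimal sketch using only the Python standard library (Python 3.8+); the helper name `has_pinned_version` is hypothetical and not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version


def has_pinned_version(package: str, expected: str) -> bool:
    """Return True only if `package` is installed at exactly `expected`.

    Hypothetical helper for illustration; it is not part of SegFormer.
    """
    try:
        return version(package) == expected
    except PackageNotFoundError:
        # The package is not installed at all.
        return False


if __name__ == "__main__":
    if not has_pinned_version("timm", "0.3.2"):
        print("timm==0.3.2 not found; run: pip install timm==0.3.2")
```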

Evaluation

Download trained weights.

Example: evaluate SegFormer-B1 on ADE20K:

# Single-gpu testing
python tools/test.py local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py /path/to/checkpoint_file

# Multi-gpu testing
./tools/dist_test.sh local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py /path/to/checkpoint_file <GPU_NUM>

# Multi-gpu, multi-scale testing
./tools/dist_test.sh local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py /path/to/checkpoint_file <GPU_NUM> --aug-test
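The three evaluation commands above differ only in the launcher script, the GPU count, and the --aug-test flag. As a rough sketch of how they relate, the hypothetical helper below assembles the argument list for each case. The function name and its parameters are assumptions made for illustration, and passing --aug-test to the single-GPU script is also an assumption, since the examples above show that flag only with the distributed launcher.

```python
from typing import List


def build_test_command(config: str, checkpoint: str,
                       gpus: int = 1, multi_scale: bool = False) -> List[str]:
    """Assemble an evaluation command as an argument list.

    Hypothetical helper for illustration; it only mirrors the commands
    shown in this README and does not run anything.
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    if gpus == 1:
        # Single-GPU testing goes through tools/test.py directly.
        cmd = ["python", "tools/test.py", config, checkpoint]
    else:
        # Multi-GPU testing goes through the distributed launcher,
        # which takes the GPU count as its third argument.
        cmd = ["./tools/dist_test.sh", config, checkpoint, str(gpus)]
    if multi_scale:
        # Assumption: the flag is accepted in the single-GPU case as well.
        cmd.append("--aug-test")
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the repository root.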

Training

Download weights pretrained on ImageNet-1K, and put them in a folder pretrained/.
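Before launching training, a quick check that the pretrained/ folder exists and contains the downloaded weights can save a failed run. This is a minimal sketch under stated assumptions: the helper name `list_checkpoints` is hypothetical, and the `.pth` suffix is an assumed file extension, not something this README specifies.

```python
from pathlib import Path
from typing import List


def list_checkpoints(folder: str = "pretrained", suffix: str = ".pth") -> List[str]:
    """Return the sorted names of checkpoint files found in `folder`.

    Hypothetical helper for illustration. The default suffix ".pth" is an
    assumption about how the downloaded weights are named.
    """
    root = Path(folder)
    if not root.is_dir():
        raise FileNotFoundError(f"expected a directory at {root}/")
    return sorted(p.name for p in root.iterdir()
                  if p.is_file() and p.suffix == suffix)
```

An empty result means the folder exists but no file with the assumed suffix was found in it.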

Example: train SegFormer-B1 on ADE20K:

# Single-gpu training
python tools/train.py local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py

# Multi-gpu training
./tools/dist_train.sh local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py <GPU_NUM>

License

Please check the LICENSE file. SegFormer may be used non-commercially, meaning for research or evaluation purposes only. For business inquiries, please contact [email protected].

Citation

@article{xie2021segformer,
  title={SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers},
  author={Xie, Enze and Wang, Wenhai and Yu, Zhiding and Anandkumar, Anima and Alvarez, Jose M and Luo, Ping},
  journal={arXiv preprint arXiv:2105.15203},
  year={2021}
}


Owner

NVIDIA Research Projects
