Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation

Overview

Auto-Seg-Loss

By Hao Li, Chenxin Tao, Xizhou Zhu, Xiaogang Wang, Gao Huang, Jifeng Dai

This is the official implementation of the ICLR 2021 paper Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation.

Introduction

TL; DR.

Auto Seg-Loss is the first general framework for searching surrogate losses for mainstream semantic segmentation metrics.

Abstract.

Designing proper loss functions is essential in training deep networks. Especially in the field of semantic segmentation, various evaluation metrics have been proposed for diverse scenarios. Despite the success of the widely adopted cross-entropy loss and its variants, the mis-alignment between the loss functions and evaluation metrics degrades the network performance. Meanwhile, manually designing loss functions for each specific metric requires expertise and significant manpower. In this paper, we propose to automate the design of metric-specific loss functions by searching differentiable surrogate losses for each metric. We substitute the non-differentiable operations in the metrics with parameterized functions, and conduct parameter search to optimize the shape of loss surfaces. Two constraints are introduced to regularize the search space and make the search efficient. Extensive experiments on PASCAL VOC and Cityscapes demonstrate that the searched surrogate losses outperform the manually designed loss functions consistently. The searched losses can generalize well to other datasets and networks.

ASL-overview ASL-results

License

This project is released under the Apache 2.0 license.

Citing Auto Seg-Loss

If you find Auto Seg-Loss useful in your research, please consider citing:

@inproceedings{li2020auto,
  title={Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation},
  author={Li, Hao and Tao, Chenxin and Zhu, Xizhou and Wang, Xiaogang and Huang, Gao and Dai, Jifeng},
  booktitle={ICLR},
  year={2021}
}

Configs

PASCAL VOC Search experiments

Target Metric mIoU FWIoU mAcc gAcc BIoU BF1
Parameterization bezier bezier bezier bezier bezier bezier
URL config config config config config config

PASCAL VOC Re-training experiments

Target Metric mIoU FWIoU mAcc gAcc BIoU BF1
Cross Entropy 78.69 91.31 87.31 95.17 70.61 65.30
ASL 80.97 91.93 92.95 95.22 79.27 74.83
URL config
log
config
log
config
log
config
log
config
log
config
log

Note:

1. The search experiments are conducted with R50-DeepLabV3+.

2. The re-training experiments are conducted with R101-DeeplabV3+.

Installation

Our implementation is based on MMSegmentation.

Prerequisites

  • Python>=3.7

    We recommend you to use Anaconda to create a conda environment:

    conda create -n auto_segloss python=3.8 -y

    Then, activate the environment:

    conda activate auto_segloss
  • PyTorch>=1.7.0, torchvision>=0.8.0 (following official instructions).

    For example, if your CUDA version is 10.1, you could install pytorch and torchvision as follows:

    conda install pytorch=1.8.0 torchvision=0.9.0 cudatoolkit=10.1 -c pytorch
  • MMCV>=1.3.0 (following official instruction).

    We recommend installing the pre-built mmcv-full. For example, if your CUDA version is 10.1 and pytorch version is 1.8.0, you could run:

    pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.8.0/index.html

Installing the modified mmsegmentation

git clone https://github.com/fundamentalvision/Auto-Seg-Loss.git
cd Auto-Seg-Loss
pip install -e .

Usage

Dataset preparation

Please follow the official guide of MMSegmentation to organize the datasets. It's highly recommended to symlink the dataset root to Auto-Seg-Loss/data. The recommended data structure is as follows:

Auto-Seg-Loss
├── mmseg
├── ASL_search
└── data
    └── VOCdevkit
        ├── VOC2012
        └── VOCaug

Training models with the provided parameters

The re-training command format is

./ASL_retrain.sh {CONFIG_NAME} [{NUM_GPUS}] [{SEED}]

For example, the command for training a ResNet-101 DeepLabV3+ with 4 GPUs for mIoU is as follows:

./ASL_retrain.sh miou_bezier_10k.py 4

You can also follow the provided configs to modify the mmsegmentation configs, and use Auto Seg-Loss for training other models on other datasets.

Searching for semantic segmentation metrics

The search command format is

./ASL_search.sh {CONFIG_NAME} [{NUM_GPUS}] [{SEED}]

For example, the command for searching for surrogate loss functions for mIoU with 8 GPUs is as follows:

./ASL_search.sh miou_bezier_lr=0.2_eps=0.2.py 8
Geometry-Free View Synthesis: Transformers and no 3D Priors

Geometry-Free View Synthesis: Transformers and no 3D Priors Geometry-Free View Synthesis: Transformers and no 3D Priors Robin Rombach*, Patrick Esser*

CompVis Heidelberg 293 Dec 22, 2022
Fine-grained Post-training for Improving Retrieval-based Dialogue Systems - NAACL 2021

Fine-grained Post-training for Multi-turn Response Selection Implements the model described in the following paper Fine-grained Post-training for Impr

Janghoon Han 83 Dec 20, 2022
OCR-D wrapper for detectron2 based segmentation models

ocrd_detectron2 OCR-D wrapper for detectron2 based segmentation models Introduction Installation Usage OCR-D processor interface ocrd-detectron2-segm

Robert Sachunsky 13 Dec 06, 2022
CoReD: Generalizing Fake Media Detection with Continual Representation using Distillation (ACMMM'21 Oral Paper)

CoReD: Generalizing Fake Media Detection with Continual Representation using Distillation (ACMMM'21 Oral Paper) (Accepted for oral presentation at ACM

Minha Kim 1 Nov 12, 2021
robomimic: A Modular Framework for Robot Learning from Demonstration

robomimic [Homepage]   [Documentation]   [Study Paper]   [Study Website]   [ARISE Initiative] Latest Updates [08/09/2021] v0.1.0: Initial code and pap

ARISE Initiative 178 Jan 05, 2023
FairMOT - A simple baseline for one-shot multi-object tracking

FairMOT - A simple baseline for one-shot multi-object tracking

Yifu Zhang 3.6k Jan 08, 2023
A pyparsing-based library for parsing SOQL statements

CONTRIBUTORS WANTED!! Installation pip install python-soql-parser or, with poetry poetry add python-soql-parser Usage from python_soql_parser import p

Kicksaw 0 Jun 07, 2022
Face Identity Disentanglement via Latent Space Mapping [SIGGRAPH ASIA 2020]

Face Identity Disentanglement via Latent Space Mapping Description Official Implementation of the paper Face Identity Disentanglement via Latent Space

150 Dec 07, 2022
SubOmiEmbed: Self-supervised Representation Learning of Multi-omics Data for Cancer Type Classification

SubOmiEmbed: Self-supervised Representation Learning of Multi-omics Data for Cancer Type Classification

Sayed Hashim 3 Nov 15, 2022
A project to make Amazon Echo respond to sign language using your webcam

Making Alexa respond to Sign Language using Tensorflow.js Try the live demo Read the Blog Post on Tensorflow's Blog Coming Soon Watch the video This p

Abhishek Singh 444 Jan 03, 2023
Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity

Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity Indic TTS Samples can be found at https://peter-yh-wu.github.io/cross-

Peter Wu 1 Nov 12, 2022
Understanding the Generalization Benefit of Model Invariance from a Data Perspective

Understanding the Generalization Benefit of Model Invariance from a Data Perspective This is the code for our NeurIPS2021 paper "Understanding the Gen

1 Jan 15, 2022
Edison AT is software Depression Assistant personal.

Edison AT Edison AT is software / program Depression Assistant personal. Feature: Analyze emotional real-time from face. Audio Edison(Comingsoon relea

Ananda Rauf 2 Apr 24, 2022
The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"

Hierarchical Token Semantic Audio Transformer Introduction The Code Repository for "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound

Knut(Ke) Chen 134 Jan 01, 2023
The official MegEngine implementation of the ICCV 2021 paper: GyroFlow: Gyroscope-Guided Unsupervised Optical Flow Learning

[ICCV 2021] GyroFlow: Gyroscope-Guided Unsupervised Optical Flow Learning This is the official implementation of our ICCV2021 paper GyroFlow. Our pres

MEGVII Research 36 Sep 07, 2022
PyTorch code for the ICCV'21 paper: "Always Be Dreaming: A New Approach for Class-Incremental Learning"

Always Be Dreaming: A New Approach for Data-Free Class-Incremental Learning PyTorch code for the ICCV 2021 paper: Always Be Dreaming: A New Approach f

49 Dec 21, 2022
Latent Network Models to Account for Noisy, Multiply-Reported Social Network Data

VIMuRe Latent Network Models to Account for Noisy, Multiply-Reported Social Network Data. If you use this code please cite this article (preprint). De

6 Dec 15, 2022
The NEOSSat is a dual-mission microsatellite designed to detect potentially hazardous Earth-orbit-crossing asteroids and track objects that reside in deep space

The NEOSSat is a dual-mission microsatellite designed to detect potentially hazardous Earth-orbit-crossing asteroids and track objects that reside in deep space

John Salib 2 Jan 30, 2022
SingleVC performs any-to-one VC, which is an important component of MediumVC project.

SingleVC performs any-to-one VC, which is an important component of MediumVC project. Here is the official implementation of the paper, MediumVC.

谷下雨 26 Dec 28, 2022