Improving Calibration for Long-Tailed Recognition (CVPR2021)

Last update: Dec 20, 2022

Overview

MiSLAS

Improving Calibration for Long-Tailed Recognition

Authors: Zhisheng Zhong, Jiequan Cui, Shu Liu, Jiaya Jia

[arXiv] [slide] [BibTeX]

Introduction: This repository provides an implementation for the CVPR 2021 paper: "Improving Calibration for Long-Tailed Recognition" based on LDAM-DRW and Decoupling models. Our study shows, because of the extreme imbalanced composition ratio of each class, networks trained on long-tailed datasets are more miscalibrated and over-confident. MiSLAS is a simple, and efficient two-stage framework for long-tailed recognition, which greatly improves recognition accuracy and markedly relieves over-confidence simultaneously.

Installation

Requirements

Python 3.7
torchvision 0.4.0
Pytorch 1.2.0
yacs 0.1.8

Virtual Environment

conda create -n MiSLAS python==3.7
source activate MiSLAS

Install MiSLAS

git clone https://github.com/Jia-Research-Lab/MiSLAS.git
cd MiSLAS
pip install -r requirements.txt

Dataset Preparation

Change the data_path in config/*/*.yaml accordingly.

Training

Stage-1:

To train a model for Stage-1 with mixup, run:

(one GPU for CIFAR-10-LT & CIFAR-100-LT, four GPUs for ImageNet-LT, iNaturalist 2018, and Places-LT)

python train_stage1.py --cfg ./config/DATASETNAME/DATASETNAME_ARCH_stage1_mixup.yaml

DATASETNAME can be selected from cifar10, cifar100, imagenet, ina2018, and places.

ARCH can be resnet32 for cifar10/100, resnet50/101/152 for imagenet, resnet50 for ina2018, and resnet152 for places, respectively.

Stage-2:

To train a model for Stage-2 with one GPU (all the above datasets), run:

python train_stage2.py --cfg ./config/DATASETNAME/DATASETNAME_ARCH_stage2_mislas.yaml resume /path/to/checkpoint/stage1

The saved folder (including logs and checkpoints) is organized as follows.

MiSLAS
├── saved
│   ├── modelname_date
│   │   ├── ckps
│   │   │   ├── current.pth.tar
│   │   │   └── model_best.pth.tar
│   │   └── logs
│   │       └── modelname.txt
│   ...

Evaluation

To evaluate a trained model, run:

python eval.py --cfg ./config/DATASETNAME/DATASETNAME_ARCH_stage1_mixup.yaml  resume /path/to/checkpoint/stage1
python eval.py --cfg ./config/DATASETNAME/DATASETNAME_ARCH_stage2_mislas.yaml resume /path/to/checkpoint/stage2

Results and Models

1) CIFAR-10-LT and CIFAR-100-LT

Stage-1 (mixup):

Dataset	Top-1 Accuracy	ECE (15 bins)	Model
CIFAR-10-LT IF=10	87.6%	11.9%	link
CIFAR-10-LT IF=50	78.1%	2.49%	link
CIFAR-10-LT IF=100	72.8%	2.14%	link
CIFAR-100-LT IF=10	59.1%	5.24%	link
CIFAR-100-LT IF=50	45.4%	4.33%	link
CIFAR-100-LT IF=100	39.5%	8.82%	link

Stage-2 (MiSLAS):

Dataset	Top-1 Accuracy	ECE (15 bins)	Model
CIFAR-10-LT IF=10	90.0%	1.20%	link
CIFAR-10-LT IF=50	85.7%	2.01%	link
CIFAR-10-LT IF=100	82.5%	3.66%	link
CIFAR-100-LT IF=10	63.2%	1.73%	link
CIFAR-100-LT IF=50	52.3%	2.47%	link
CIFAR-100-LT IF=100	47.0%	4.83%	link

Note: To obtain better performance, we highly recommend changing the weight decay 2e-4 to 5e-4 on CIFAR-LT.

2) Large-scale Datasets

Stage-1 (mixup):

Dataset	Arch	Top-1 Accuracy	ECE (15 bins)	Model
ImageNet-LT	ResNet-50	45.5%	7.98%	link
iNa'2018	ResNet-50	66.9%	5.37%	link
Places-LT	ResNet-152	29.4%	16.7%	link

Stage-2 (MiSLAS):

Dataset	Arch	Top-1 Accuracy	ECE (15 bins)	Model
ImageNet-LT	ResNet-50	52.7%	1.78%	link
iNa'2018	ResNet-50	71.6%	7.67%	link
Places-LT	ResNet-152	40.4%	3.41%	link

Citation

Please consider citing MiSLAS in your publications if it helps your research. :)

@inproceedings{zhong2021mislas,
    title={Improving Calibration for Long-Tailed Recognition},
    author={Zhisheng Zhong, Jiequan Cui, Shu Liu, and Jiaya Jia},
    booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2021},
}

Contact

If you have any questions about our work, feel free to contact us through email (Zhisheng Zhong: [email protected]) or Github issues.

Improving Calibration for Long-Tailed Recognition (CVPR2021)

Related tags

Overview

MiSLAS

Installation

Training

Evaluation

Results and Models

Citation

Contact

Owner

DV Lab

Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"

Dynamic Token Normalization Improves Vision Transformers

This is the repository for our paper SimpleTrack: Understanding and Rethinking 3D Multi-object Tracking

People log into different sites every day to get information and browse through these sites one by one

NeuralForecast is a Python library for time series forecasting with deep learning models

Simple tutorials on Pytorch DDP training

YOLOv5 detection interface - PyQt5 implementation

A pytorch implementation of Paper "Improved Training of Wasserstein GANs"

JDet is Object Detection Framework based on Jittor.

2021搜狐校园文本匹配算法大赛分比我们低的都是帅哥队

Neural Factorization of Shape and Reflectance Under An Unknown Illumination

AI Summer's complete catalog of articles

This repo contains the implementation of YOLOv2 in Keras with Tensorflow backend.

Deep and online learning with spiking neural networks in Python

Steerable discovery of neural audio effects

Official code for "End-to-End Optimization of Scene Layout" -- including VAE, Diff Render, SPADE for colorization (CVPR 2020 Oral)

Text completion with Hugging Face and TensorFlow.js running on Node.js

Continual Learning of Electronic Health Records (EHR).

Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research

A Real-ESRGAN equipped Colab notebook for CLIP Guided Diffusion

Improving Calibration for Long-Tailed Recognition (CVPR2021)

Related tags

Overview

MiSLAS

Installation

Training

Evaluation

Results and Models

Citation

Contact

Owner

DV Lab

Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"

Dynamic Token Normalization Improves Vision Transformers

This is the repository for our paper SimpleTrack: Understanding and Rethinking 3D Multi-object Tracking

People log into different sites every day to get information and browse through these sites one by one

NeuralForecast is a Python library for time series forecasting with deep learning models

Simple tutorials on Pytorch DDP training

YOLOv5 detection interface - PyQt5 implementation

A pytorch implementation of Paper "Improved Training of Wasserstein GANs"

JDet is Object Detection Framework based on Jittor.

2021搜狐校园文本匹配算法大赛 分比我们低的都是帅哥队

Neural Factorization of Shape and Reflectance Under An Unknown Illumination

AI Summer's complete catalog of articles

This repo contains the implementation of YOLOv2 in Keras with Tensorflow backend.

Deep and online learning with spiking neural networks in Python

Steerable discovery of neural audio effects

Official code for "End-to-End Optimization of Scene Layout" -- including VAE, Diff Render, SPADE for colorization (CVPR 2020 Oral)

Text completion with Hugging Face and TensorFlow.js running on Node.js

Continual Learning of Electronic Health Records (EHR).

Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research

A Real-ESRGAN equipped Colab notebook for CLIP Guided Diffusion

2021搜狐校园文本匹配算法大赛分比我们低的都是帅哥队