[NeurIPS 2020] This project provides a strong single-stage baseline for Long-Tailed Classification, Detection, and Instance Segmentation (LVIS).

Overview

A Strong Single-Stage Baseline for Long-Tailed Problems

Python PyTorch

This project provides a strong single-stage baseline for Long-Tailed Classification (under ImageNet-LT, Long-Tailed CIFAR-10/-100 datasets), Detection, and Instance Segmentation (under LVIS dataset). It is also a PyTorch implementation of the NeurIPS 2020 paper Long-Tailed Classification by Keeping the Good and Removing the Bad Momentum Causal Effect, which proposes a general solution to remove the bad momentum causal effect for a variety of Long-Tailed Recognition tasks. The codes are organized into three folders:

  1. The classification folder supports long-tailed classification on ImageNet-LT, Long-Tailed CIFAR-10/CIFAR-100 datasets.
  2. The lvis_old folder (deprecated) supports long-tailed object detection and instance segmentation on LVIS V0.5 dataset, which is built on top of mmdet V1.1.
  3. The latest version of long-tailed detection and instance segmentation is under lvis1.0 folder. Since both LVIS V0.5 and mmdet V1.1 are no longer available on their homepages, we have to re-implement our method on mmdet V2.4 using LVIS V1.0 annotations.

Slides

If you want to present our work in your group meeting / introduce it to your friends / seek answers for some ambiguous parts in the paper, feel free to use our slides. It has two versions: one-hour full version and five-minute short version.

Installation

The classification part allows the lower version of the following requirements. However, in detection and instance segmentation (mmdet V2.4), I tested some lower versions of python and pytorch, which are all failed. If you want to try other environments, please check the updates of mmdetection.

Requirements:

  • PyTorch >= 1.6.0
  • Python >= 3.7.0
  • CUDA >= 10.1
  • torchvision >= 0.7.0
  • gcc version >= 5.4.0

Step-by-step installation

conda create -n longtail pip python=3.7 -y
source activate longtail
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
pip install pyyaml tqdm matplotlib sklearn h5py

# download the project
git clone https://github.com/KaihuaTang/Long-Tailed-Recognition.pytorch.git
cd Long-Tailed-Recognition.pytorch

# the following part is only used to build mmdetection 
cd lvis1.0
pip install mmcv-full
pip install mmlvis
pip install -r requirements/build.txt
pip install -v -e .  # or "python setup.py develop"

Additional Notes

When we wrote the paper, we are using lvis V0.5 and mmdet V1.1 for our long-tailed instance segmentation experiments, but they've been deprecated by now. If you want to reproduce our results on lvis V0.5, you have to find a way to build mmdet V1.1 environments and use the code in lvis_old folder.

Datasets

ImageNet-LT

ImageNet-LT is a long-tailed subset of original ImageNet, you can download the dataset from its homepage. After you download the dataset, you need to change the data_root of 'ImageNet' in ./classification/main.py file.

CIFAR-10/-100

When you run the code for the first time, our dataloader will automatically download the CIFAR-10/-100. You need to set the data_root in ./classification/main.py to the path where you want to put all CIFAR data.

LVIS

Large Vocabulary Instance Segmentation (LVIS) dataset uses the COCO 2017 train, validation, and test image sets. If you have already downloaded the COCO images, you only need to download the LVIS annotations. LVIS val set contains images from COCO 2017 train in addition to the COCO 2017 val split.

You need to put all the annotations and images under ./data/LVIS like this:

data
  |-- LVIS
    |--lvis_v1_train.json
    |--lvis_v1_val.json
      |--images
        |--train2017
          |--.... (images)
        |--test2017
          |--.... (images)
        |--val2017
          |--.... (images)

Getting Started

For long-tailed classification, please go to [link]

For long-tailed object detection and instance segmentation, please go to [link]

Advantages of the Proposed Method

  • Compared with previous state-of-the-art Decoupling, our method only requires one-stage training.
  • Most of the existing methods for long-tailed problems are using data distribution to conduct re-sampling or re-weighting during training, which is based on an inappropriate assumption that you can know the future distribution before you start to learn. Meanwhile, the proposed method doesn't need to know the data distribution during training, we only need to use an average feature for inference after we train the model.
  • Our method can be easily transferred to any tasks. We outperform the previous state-of-the-arts Decoupling, BBN, OLTR in image classification, and we achieve better results than 2019 Winner of LVIS challenge EQL in long-tailed object detection and instance segmentation (under the same settings with even fewer GPUs).

Citation

If you find our paper or this project helps your research, please kindly consider citing our paper in your publications.

@inproceedings{tang2020longtailed,
  title={Long-Tailed Classification by Keeping the Good and Removing the Bad Momentum Causal Effect},
  author={Tang, Kaihua and Huang, Jianqiang and Zhang, Hanwang},
  booktitle= {NeurIPS},
  year={2020}
}
Owner
Kaihua Tang
@kaihuatang.github.io/
Kaihua Tang
Official project repository for 'Normality-Calibrated Autoencoder for Unsupervised Anomaly Detection on Data Contamination'

NCAE_UAD Official project repository of 'Normality-Calibrated Autoencoder for Unsupervised Anomaly Detection on Data Contamination' Abstract In this p

Jongmin Andrew Yu 2 Feb 10, 2022
GEP (GDB Enhanced Prompt) - a GDB plug-in for GDB command prompt with fzf history search, fish-like autosuggestions, auto-completion with floating window, partial string matching in history, and more!

GEP (GDB Enhanced Prompt) GEP (GDB Enhanced Prompt) is a GDB plug-in which make your GDB command prompt more convenient and flexibility. Why I need th

Alan Li 23 Dec 21, 2022
Sum-Product Probabilistic Language

Sum-Product Probabilistic Language SPPL is a probabilistic programming language that delivers exact solutions to a broad range of probabilistic infere

MIT Probabilistic Computing Project 57 Nov 17, 2022
CNN designed for pansharpening

PROGRESSIVE BAND-SEPARATED CONVOLUTIONAL NEURAL NETWORK FOR MULTISPECTRAL PANSHARPENING This repository contains main code for the paper PROGRESSIVE B

SerendipitysX 3 Dec 29, 2021
Official implement of Paper:A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sening images

A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images 深度监督影像融合网络DSIFN用于高分辨率双时相遥感影像变化检测 Of

Chenxiao Zhang 135 Dec 19, 2022
A customisable game where you have to quickly click on black tiles in order of appearance while avoiding clicking on white squares.

W.I.P-Aim-Memory-Game A customisable game where you have to quickly click on black tiles in order of appearance while avoiding clicking on white squar

dE_soot 1 Dec 08, 2021
Study of human inductive biases in CNNs and Transformers.

Are Convolutional Neural Networks or Transformers more like human vision? This repository contains the code and fine-tuned models of popular Convoluti

Shikhar Tuli 39 Dec 08, 2022
AI-based, context-driven network device ranking

Batea A batea is a large shallow pan of wood or iron traditionally used by gold prospectors for washing sand and gravel to recover gold nuggets. Batea

Secureworks Taegis VDR 269 Nov 26, 2022
Controlling the MicriSpotAI robot from scratch

Project-MicroSpot-AI Controlling the MicriSpotAI robot from scratch Colaborators Alexander Dennis Components from MicroSpot The MicriSpotAI has the fo

Dennis Núñez-Fernández 5 Oct 20, 2022
This is a GUI interface which can process forest fire detection, smoke detection and fire segmentation

This is a GUI interface which can process forest fire detection, smoke detection and fire segmentation. Yolov5 is used to detect fire and smoke and unet is used to segment fire.

7 Jan 08, 2023
Jittor implementation of Recursive-NeRF: An Efficient and Dynamically Growing NeRF

Recursive-NeRF: An Efficient and Dynamically Growing NeRF This is a Jittor implementation of Recursive-NeRF: An Efficient and Dynamically Growing NeRF

33 Nov 30, 2022
Generic template to bootstrap your PyTorch project with PyTorch Lightning, Hydra, W&B, and DVC.

NN Template Generic template to bootstrap your PyTorch project. Click on Use this Template and avoid writing boilerplate code for: PyTorch Lightning,

Luca Moschella 520 Dec 30, 2022
A non-linear, non-parametric Machine Learning method capable of modeling complex datasets

Fast Symbolic Regression Symbolic Regression is a non-linear, non-parametric Machine Learning method capable of modeling complex data sets. fastsr aim

VAMSHI CHOWDARY 3 Jun 22, 2022
Speedy Implementation of Instance-based Learning (IBL) agents in Python

A Python library to create single or multi Instance-based Learning (IBL) agents that are built based on Instance Based Learning Theory (IBLT) 1 Instal

0 Nov 18, 2021
My take on a practical implementation of Linformer for Pytorch.

Linformer Pytorch Implementation A practical implementation of the Linformer paper. This is attention with only linear complexity in n, allowing for v

Peter 349 Dec 25, 2022
Automatically align face images 🙃→🙂. Can also do windowing and warping.

Automatic Face Alignment (AFA) Carl M. Gaspar & Oliver G.B. Garrod You have lots of photos of faces like this: But you want to line up all of the face

Carl Michael Gaspar 15 Dec 12, 2022
This is the official Pytorch implementation of the paper "Diverse Motion Stylization for Multiple Style Domains via Spatial-Temporal Graph-Based Generative Model"

Diverse Motion Stylization (Official) This is the official Pytorch implementation of this paper. Diverse Motion Stylization for Multiple Style Domains

Soomin Park 28 Dec 16, 2022
FAMIE is a comprehensive and efficient active learning (AL) toolkit for multilingual information extraction (IE)

FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction

18 Sep 01, 2022
Official PyTorch implementation of "Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics".

Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics This repository is the official PyTorch implementation of "Physics-aware Differ

USC-Melady 46 Nov 20, 2022
The official implementation of Variable-Length Piano Infilling (VLI).

Variable-Length-Piano-Infilling The official implementation of Variable-Length Piano Infilling (VLI). (paper: Variable-Length Music Score Infilling vi

29 Sep 01, 2022