The official implementation of Equalization Loss v1 & v2 (CVPR 2020, 2021) based on MMDetection.

The Equalization Losses for Long-tailed Object Detection and Instance Segmentation

This repo is the official implementation of the CVPR 2021 paper "Equalization Loss v2: A New Gradient Balance Approach for Long-tailed Object Detection" and the CVPR 2020 paper "Equalization loss for long-tailed object recognition".

Besides the equalization losses, this repo also includes some other algorithms:

  • BAGS (Balanced Group Softmax)
  • cRT (classifier re-training)
  • LWS (Learnable Weight Scaling)

Requirements

We tested our code on MMDetection v2.3; other versions should also work.

Prepare LVIS Dataset

for images

LVIS uses the same images as COCO, so you need to download the COCO dataset to a folder ($COCO) and link its train, val, and test image folders under the lvis folder ($LVIS).

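mkdir -p data/lvis
# $LVIS is assumed to point at data/lvis; folder names assume the standard COCO 2017 layout
ln -s $COCO/train2017 $LVIS
ln -s $COCO/val2017 $LVIS
ln -s $COCO/test2017 $LVIS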

for annotations

Download the annotations from the LVIS website.

cd $LVIS
mkdir annotations

Then place the annotations in the folder ($LVIS/annotations).

Finally, the file structure will look like this:

data
  ├── lvis
  │   ├── annotations
  │   │   ├── lvis_v1_val.json
  │   │   ├── lvis_v1_train.json
  │   ├── train2017
  │   │   ├── 000000004134.png
  │   │   ├── 000000031817.png
  │   │   ├── ......
  │   ├── val2017
  │   ├── test2017
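
Before training, the layout above can be checked with a small script. This is a minimal sketch and not part of the repo: the function name missing_lvis_entries and the default root data/lvis are assumptions for illustration, while the expected entries are copied from the tree above.

```python
from pathlib import Path

# Entries taken from the tree above, relative to the LVIS root
# (assumed here to be data/lvis).
EXPECTED = [
    "annotations/lvis_v1_train.json",
    "annotations/lvis_v1_val.json",
    "train2017",
    "val2017",
    "test2017",
]


def missing_lvis_entries(lvis_root):
    """Return the expected files/folders that do not exist under lvis_root."""
    root = Path(lvis_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_lvis_entries("data/lvis")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("LVIS layout looks complete.")
```

Symlinked folders count as present as long as the link target exists, so a broken link to $COCO shows up as a missing entry.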

for API

The official lvis-api and mmlvis can cause some multiprocessing bugs. See issue.

Instead, you can install the LVIS API from my modified repo.

pip install git+https://github.com/tztztztztz/lvis-api.git

Testing with pretrained models

# ./tools/dist_test.sh ${CONFIG} ${CHECKPOINT} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
./tools/dist_test.sh configs/eqlv2/eql_r50_8x2_1x.py data/pretrain_models/eql_r50_8x2_1x.pth 8 --out results.pkl --eval bbox segm

Training

# ./tools/dist_train.sh ${CONFIG} ${GPU_NUM}
./tools/dist_train.sh ./configs/end2end/eql_r50_8x2_1x.py 8 

Once training has finished, you will get evaluation metrics like these:

bbox AP

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=all] = 0.242
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=300 catIds=all] = 0.401
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=300 catIds=all] = 0.254
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=     s | maxDets=300 catIds=all] = 0.181
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=     m | maxDets=300 catIds=all] = 0.317
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=     l | maxDets=300 catIds=all] = 0.367
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  r] = 0.135
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  c] = 0.225
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  f] = 0.308
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=all] = 0.331
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=     s | maxDets=300 catIds=all] = 0.223
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=     m | maxDets=300 catIds=all] = 0.417
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=     l | maxDets=300 catIds=all] = 0.497
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  r] = 0.197
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  c] = 0.308
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  f] = 0.415

mask AP

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=all] = 0.237
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=300 catIds=all] = 0.372
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=300 catIds=all] = 0.251
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=     s | maxDets=300 catIds=all] = 0.169
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=     m | maxDets=300 catIds=all] = 0.316
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=     l | maxDets=300 catIds=all] = 0.370
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  r] = 0.149
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  c] = 0.228
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  f] = 0.286
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=all] = 0.326
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=     s | maxDets=300 catIds=all] = 0.210
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=     m | maxDets=300 catIds=all] = 0.415
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=     l | maxDets=300 catIds=all] = 0.495
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  r] = 0.213
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  c] = 0.313
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  f] = 0.389
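
To compare runs, it can help to turn this summary into structured records. The following is a minimal sketch that assumes the log lines have exactly the format printed above; the function name parse_lvis_summary and the dictionary keys are illustrative and not part of the repo or the LVIS API.

```python
import re

# Matches summary lines such as:
#  Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=300 catIds=  r] = 0.135
LINE_RE = re.compile(
    r"\((AP|AR)\)\s*@\[\s*IoU=([\d.:]+)\s*\|\s*area=\s*(\w+)\s*\|"
    r"\s*maxDets=(\d+)\s+catIds=\s*(\w+)\]\s*=\s*(-?[\d.]+)"
)


def parse_lvis_summary(text):
    """Return one dict per metric line found in text; other lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue
        metric, iou, area, max_dets, cat_ids, value = match.groups()
        rows.append({
            "metric": metric,        # "AP" or "AR"
            "iou": iou,              # e.g. "0.50:0.95" or "0.50"
            "area": area,            # "all", "s", "m" or "l"
            "max_dets": int(max_dets),
            "cat_ids": cat_ids,      # "all", "r", "c" or "f"
            "value": float(value),
        })
    return rows
```

For example, filtering the records for metric "AP", area "all", and cat_ids "r" gives the rare-category AP for the bbox or mask block that was parsed.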

We place our config files in ./configs/:

  • ./configs/end2end: eqlv2 and other end2end methods
  • ./configs/decouple: decoupled training methods

How to train decoupled training methods

  1. Train the baseline model (or EQL v2).
  2. Prepare the pretrained checkpoint:
  # suppose you've trained the baseline model
  cd r50_1x
  python ../tools/ckpt_surgery.py --ckpt-path epoch_12.pth --method remove
  # if you want to train LWS, choose the method 'reset' instead
  3. Start training with one of the configs:
  # ./tools/dist_train.sh ./configs/decouple/bags_r50_8x2_1x.py 8
  # ./tools/dist_train.sh ./configs/decouple/lws_r50_8x2_1x.py 8
  ./tools/dist_train.sh ./configs/decouple/crt_r50_8x2_1x.py 8

Pretrained Models on LVIS

Methods    end2end   AP     APr    APc    APf    pretrained_model
Baseline   ✓         16.1   0.0    12.0   27.4   model
EQL        ✓         18.6   2.1    17.4   27.2   model
RFS        ✓         22.2   11.5   21.2   28.0   model
LWS        ×         17.0   2.0    13.5   27.4   model
cRT        ×         22.1   11.9   20.2   29.0   model
BAGS       ×         23.1   13.1   22.5   28.2   model
EQLv2      ✓         23.7   14.9   22.8   28.6   model

How to train EQLv2 on OpenImages

1. Download the data

Download the OpenImages v5 images from link. The folder will look like this:

openimages
    ├── train
    ├── validation
    ├── test

Download the annotations for Challenge 2019 from link. The folder will look like this:

annotations
    ├── challenge-2019-classes-description-500.csv
    ├── challenge-2019-train-detection-human-imagelabels.csv
    ├── challenge-2019-train-detection-bbox.csv
    ├── challenge-2019-validation-detection-bbox.csv
    ├── challenge-2019-validation-detection-human-imagelabels.csv
    ├── ...

2. Convert the .csv annotations to a COCO-like .json file

cd tools/openimages2coco/
python convert_annotations.py -p PATH_TO_OPENIMAGES --version challenge_2019 --task bbox 

You may need to download the data directory from https://github.com/bethgelab/openimages2coco/tree/master/data and place it at $project_dir/tools/openimages2coco/
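
To illustrate what this conversion involves, here is a minimal sketch of turning OpenImages bbox rows into COCO-style annotations. It is not the code of convert_annotations.py. It assumes the CSV columns ImageID, LabelName, XMin, XMax, YMin, YMax, and IsGroupOf with box coordinates normalized to [0, 1], and it assumes that image sizes and a label-to-id mapping are supplied by the caller. Mapping IsGroupOf to iscrowd is also an assumption made for this example.

```python
import csv
import io


def row_to_coco_bbox(row, width, height):
    """Convert normalized XMin/XMax/YMin/YMax to a COCO [x, y, w, h] box in pixels."""
    x_min = float(row["XMin"]) * width
    x_max = float(row["XMax"]) * width
    y_min = float(row["YMin"]) * height
    y_max = float(row["YMax"]) * height
    return [x_min, y_min, x_max - x_min, y_max - y_min]


def csv_to_coco_annotations(csv_text, image_sizes, label_to_id):
    """Build COCO-style annotation dicts from OpenImages bbox CSV text.

    image_sizes: {image_id: (width, height)}  (hypothetical input)
    label_to_id: {label_name: category_id}    (hypothetical input)
    """
    annotations = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for ann_id, row in enumerate(reader, start=1):
        width, height = image_sizes[row["ImageID"]]
        bbox = row_to_coco_bbox(row, width, height)
        annotations.append({
            "id": ann_id,
            "image_id": row["ImageID"],
            "category_id": label_to_id[row["LabelName"]],
            "bbox": bbox,
            "area": bbox[2] * bbox[3],
            # Assumption: group boxes are treated like COCO crowd regions.
            "iscrowd": int(row.get("IsGroupOf") or 0),
        })
    return annotations
```

The main point of the sketch is the coordinate change: OpenImages stores normalized corner coordinates, while COCO-style files store pixel boxes as [x, y, width, height].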

3. Train models

  ./tools/dist_train.sh ./configs/openimages/eqlv2_r50_fpn_8x2_2x.py 8

Other configs can be found at ./configs/openimages/

4. Run inference and output the JSON results file

./tools/dist_test.sh ./configs/openimages/eqlv2_r50_fpn_8x2_2x.py openimage_eqlv2_2x/epoch_1.pth 8 --format-only --options "jsonfile_prefix=openimage_eqlv2_2x/results"

Then you will get results.bbox.json under the folder openimage_eqlv2_2x.

5. Convert the COCO-like JSON results file to an OpenImages-like CSV results file

cd $project_dir/tools/openimages2coco/
python convert_predictions.py -p ../../openimage_eqlv2_2x/results.bbox.json --subset validation

Then you will get results.bbox.csv under the folder openimage_eqlv2_2x.

6. Evaluate the results file using the official API

Please refer to this link.

After this, you will see output like this:

OpenImagesDetectionChallenge_Precision/mAP@0.5IOU,0.5263230244227198
OpenImagesDetectionChallenge_PerformanceByCategory/AP@0.5IOU/b'/m/061hd_',0.4198356678732905
OpenImagesDetectionChallenge_PerformanceByCategory/AP@0.5IOU/b'/m/06m11',0.40262261023434986
OpenImagesDetectionChallenge_PerformanceByCategory/AP@0.5IOU/b'/m/03120',0.5694096972722996
OpenImagesDetectionChallenge_PerformanceByCategory/AP@0.5IOU/b'/m/01kb5b',0.20532245532245533
OpenImagesDetectionChallenge_PerformanceByCategory/AP@0.5IOU/b'/m/0120dh',0.7934685035604202
OpenImagesDetectionChallenge_PerformanceByCategory/AP@0.5IOU/b'/m/0dv5r',0.7029194449221794
OpenImagesDetectionChallenge_PerformanceByCategory/AP@0.5IOU/b'/m/0jbk',0.5959245714028935

7. Parse the AP file and output the grouped AP

cd $project_dir

PYTHONPATH=./:$PYTHONPATH python tools/parse_openimage_metric.py --file openimage_eqlv2_2x/metric

And you will get:

mAP 0.5263230244227198
mAP0: 0.4857693606436219
mAP1: 0.52047262478471
mAP2: 0.5304580597832517
mAP3: 0.5348747991854581
mAP4: 0.5588236678031849
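
The grouping step can be sketched as follows. This is not the code of tools/parse_openimage_metric.py. It assumes the metric file holds comma-separated name,value lines as in step 6, and that groups are formed by sorting categories by their number of training images and splitting them into equal-sized bins, with group 0 holding the rarest categories. The image_counts argument and both function names are hypothetical.

```python
def parse_metric_lines(text):
    """Split 'name,value' lines into the overall mAP and a per-category AP dict."""
    overall = None
    per_category = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, value = line.rsplit(",", 1)
        if "PerformanceByCategory" in name:
            # The category id is printed as a bytes literal, e.g. b'/m/061hd_'
            category = name.split("/b'", 1)[1].rstrip("'")
            per_category[category] = float(value)
        elif "Precision" in name:
            overall = float(value)
    return overall, per_category


def grouped_ap(per_category, image_counts, num_groups=5):
    """Average AP over groups of categories sorted by ascending image count.

    image_counts: {category: number of training images}  (hypothetical input)
    Returns one mean AP per group, or None for an empty group.
    """
    cats = sorted(per_category, key=lambda c: image_counts[c])
    n = len(cats)
    result = []
    for i in range(num_groups):
        group = cats[i * n // num_groups:(i + 1) * n // num_groups]
        if group:
            result.append(sum(per_category[c] for c in group) / len(group))
        else:
            result.append(None)
    return result
```

Under these assumptions, the first value returned by grouped_ap would correspond to mAP0 and the last to mAP4.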

Main Results on OpenImages

Methods       AP     AP1    AP2    AP3    AP4    AP5
Faster-R50    43.1   26.3   42.5   45.2   48.2   52.6
EQL           45.3   32.7   44.6   47.3   48.3   53.1
EQLv2         52.6   48.6   52.0   53.0   53.4   55.8
Faster-R101   46.0   29.2   45.5   49.3   50.9   54.7
EQL           48.0   36.1   47.2   50.5   51.0   55.0
EQLv2         55.1   51.0   55.2   56.6   55.6   57.5

Citation

If you use the equalization losses, please cite our papers.

@article{tan2020eqlv2,
  title={Equalization Loss v2: A New Gradient Balance Approach for Long-tailed Object Detection},
  author={Tan, Jingru and Lu, Xin and Zhang, Gang and Yin, Changqing and Li, Quanquan},
  journal={arXiv preprint arXiv:2012.08548},
  year={2020}
}
@inproceedings{tan2020equalization,
  title={Equalization loss for long-tailed object recognition},
  author={Tan, Jingru and Wang, Changbao and Li, Buyu and Li, Quanquan and Ouyang, Wanli and Yin, Changqing and Yan, Junjie},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={11662--11671},
  year={2020}
}

Credits

The code for converting openimage to LVIS is from this repo.

Owner
Jingru Tan