Official implementation of our CVPR2021 paper "OTA: Optimal Transport Assignment for Object Detection" in Pytorch.

Related tags

Deep LearningOTA
Overview

OTA: Optimal Transport Assignment for Object Detection

GitHub

This project provides an implementation for our CVPR2021 paper "OTA: Optimal Transport Assignment for Object Detection" on PyTorch.

Requirements

Get Started

  • install cvpods locally (requires cuda to compile)
python3 -m pip install 'git+https://github.com/Megvii-BaseDetection/cvpods.git'
# (add --user if you don't have permission)

# Or, to install it from a local clone:
git clone https://github.com/Megvii-BaseDetection/cvpods.git
python3 -m pip install -e cvpods

# Or,
pip install -r requirements.txt
python3 setup.py build develop
  • prepare datasets
cd /path/to/cvpods/datasets
ln -s /path/to/your/coco/dataset coco
  • Train & Test
git clone https://github.com/Megvii-BaseDetection/OTA.git
cd playground/detection/coco/ota.res50.fpn.coco.800size.1x  # for example

# Train
pods_train --num-gpus 8

# Test
pods_test --num-gpus 8 \
    MODEL.WEIGHTS /path/to/your/save_dir/ckpt.pth # optional
    OUTPUT_DIR /path/to/your/save_dir # optional

# Multi node training
## sudo apt install net-tools ifconfig
pods_train --num-gpus 8 --num-machines N --machine-rank 0/1/.../N-1 --dist-url "tcp://MASTER_IP:port"

Results on COCO val set

Model Backbone LR Sched. mAP Recall AP50/AP75/APs/APm/APl Download
RetinaNet R50 1x 36.5 53.4 56.2/39.3/21.9/40.5/47.7 -
Faster R-CNN R50 1x 38.1 52.2 58.9/41.0/22.5/41.5/48.9 -
FCOS R50 1x 38.7 57.0 57.5/41.7/22.6/42.7/49.9 -
FreeAnchor R50 1x 38.4 55.4 57.0/41.1/21.9/41.7/51.8 -
ATSS R50 1x 39.4 57.7 57.5/42.7/22.9/42.9/51.2 -
PAA(w/. Voting) R50 1x 40.4 - - -
OTA R50 1x 40.7 59.0 58.4/44.3/23.2/45.0/53.6 weights

Results on COCO test-dev

Model Backbone LR Sched. Training Scale (ShortSide) mAP AP50/AP75/APs/APm/APl Download
OTA R101 2x 640~800 45.3 63.5/49.3/26.9/48.8/56.1 weights
OTA X101 2x 640~800 47.0 65.8/51.1/29.2/50.4/57.9 weights
OTA X101-DCN 2x 640~800 49.2 67.6/53.5/30.0/52.5/62.3 weights
OTA* X101-DCN 2x 640~800 51.5 68.6/57.1/34.1/53.7/64.1 weights

* stands for ATSS-style testing time augmentation. To enable testing time augmentation, add/modify the following code frac in the corresponding config.py

TEST=dict(
    DETECTIONS_PER_IMAGE=300,
    AUG=dict(
        ENABLED=True,
        MAX_SIZE=3000,
        MIN_SIZES=(400, 500, 600, 640, 700, 900, 1000, 1100, 1200, 1300, 1400, 1800),
        EXTRA_SIZES=((800, 1333),),
        SCALE_FILTER=True,
        SCALE_RANGES=(
        [96, 10000], [96, 10000], [64, 10000], [64, 10000], [64, 10000], [0, 10000], [0, 10000], [0, 256], [0, 256], [0, 192], [0, 192], [0, 96], [0, 10000])
    )
),

Acknowledgement

This repo is developed based on cvpods. Please check cvpods for more details and features.

License

This repo is released under the Apache 2.0 license. Please see the LICENSE file for more information.

Owner
BaseDetection Team of Megvii
Implementation of Geometric Vector Perceptron, a simple circuit for 3d rotation equivariance for learning over large biomolecules, in Pytorch. Idea proposed and accepted at ICLR 2021

Geometric Vector Perceptron Implementation of Geometric Vector Perceptron, a simple circuit with 3d rotation equivariance for learning over large biom

Phil Wang 59 Nov 24, 2022
Repo for CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning

CReST in Tensorflow 2 Code for the paper: "CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning" by Chen Wei, Ki

Google Research 75 Nov 01, 2022
Deep-learning-roadmap - All You Need to Know About Deep Learning - A kick-starter

Deep Learning - All You Need to Know Sponsorship To support maintaining and upgrading this project, please kindly consider Sponsoring the project deve

Instill AI 4.4k Dec 26, 2022
Learning nonlinear operators via DeepONet

DeepONet: Learning nonlinear operators The source code for the paper Learning nonlinear operators via DeepONet based on the universal approximation th

Lu Lu 239 Jan 02, 2023
基于tensorflow 2.x的图片识别工具集

Classification.tf2 基于tensorflow 2.x的图片识别工具集 功能 粗粒度场景图片分类 细粒度场景图片分类 其他场景图片分类 模型部署 tensorflow serving本地推理和docker部署 tensorRT onnx ... 数据集 https://hyper.a

Wei Qi 1 Nov 03, 2021
Official source code of Fast Point Transformer, CVPR 2022

Fast Point Transformer Project Page | Paper This repository contains the official source code and data for our paper: Fast Point Transformer Chunghyun

182 Dec 23, 2022
A Home Assistant custom component for Lobe. Lobe is an AI tool that can classify images.

Lobe This is a Home Assistant custom component for Lobe. Lobe is an AI tool that can classify images. This component lets you easily use an exported m

Kendell R 4 Feb 28, 2022
use machine learning to recognize gesture on raspberrypi

Raspberrypi_Gesture-Recognition use machine learning to recognize gesture on raspberrypi 說明 利用 tensorflow lite 訓練手部辨識模型 分辨 "剪刀"、"石頭"、"布" 之手勢 再將訓練模型匯入

1 Dec 10, 2021
Naszilla is a Python library for neural architecture search (NAS)

A repository to compare many popular NAS algorithms seamlessly across three popular benchmarks (NASBench 101, 201, and 301). You can implement your ow

270 Jan 03, 2023
Code for our ACL 2021 paper "One2Set: Generating Diverse Keyphrases as a Set"

One2Set This repository contains the code for our ACL 2021 paper “One2Set: Generating Diverse Keyphrases as a Set”. Our implementation is built on the

Jiacheng Ye 63 Jan 05, 2023
CMSC320 - Introduction to Data Science - Fall 2021

CMSC320 - Introduction to Data Science - Fall 2021 Instructors: Elias Jonatan Gonzalez and José Manuel Calderón Trilla Lectures: MW 3:30-4:45 & 5:00-6

Introduction to Data Science 6 Sep 12, 2022
Python library to receive live stream events like comments and gifts in realtime from TikTok LIVE.

TikTokLive A python library to connect to and read events from TikTok's LIVE service A python library to receive and decode livestream events such as

Isaac Kogan 277 Dec 23, 2022
Official implementation of Sparse Transformer-based Action Recognition

STAR Official implementation of S parse T ransformer-based A ction R ecognition Dataset download NTU RGB+D 60 action recognition of 2D/3D skeleton fro

Chonghan_Lee 15 Nov 02, 2022
Minimal PyTorch implementation of Generative Latent Optimization from the paper "Optimizing the Latent Space of Generative Networks"

Minimal PyTorch implementation of Generative Latent Optimization This is a reimplementation of the paper Piotr Bojanowski, Armand Joulin, David Lopez-

Thomas Neumann 117 Nov 27, 2022
PyTorch implementation of PSPNet

PSPNet with PyTorch Unofficial implementation of "Pyramid Scene Parsing Network" (https://arxiv.org/abs/1612.01105). This repository is just for caffe

Kazuto Nakashima 52 Nov 16, 2022
Code for the paper "SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness" (NeurIPS 2021)

SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness (NeurIPS2021) This repository contains code for the paper "Smo

Jongheon Jeong 17 Dec 27, 2022
AAAI 2022 paper - Unifying Model Explainability and Robustness for Joint Text Classification and Rationale Extraction

AT-BMC Unifying Model Explainability and Robustness for Joint Text Classification and Rationale Extraction (AAAI 2022) Paper Prerequisites Install pac

16 Nov 26, 2022
Official codebase for ICLR oral paper Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling

CLIORA This is the official codebase for ICLR oral paper: Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling. We introduce

Bo Wan 32 Dec 23, 2022
Kaggle | 9th place single model solution for TGS Salt Identification Challenge

UNet for segmenting salt deposits from seismic images with PyTorch. General We, tugstugi and xuyuan, have participated in the Kaggle competition TGS S

Erdene-Ochir Tuguldur 276 Dec 20, 2022
Riemannian Convex Potential Maps

Modeling distributions on Riemannian manifolds is a crucial component in understanding non-Euclidean data that arises, e.g., in physics and geology. The budding approaches in this space are limited b

Facebook Research 61 Nov 28, 2022