[CVPR 2022] Official Pytorch code for OW-DETR: Open-world Detection Transformer

Overview

OW-DETR: Open-world Detection Transformer (CVPR 2022)

[Paper]

Akshita Gupta*, Sanath Narayan*, K J Joseph, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah

( 🌟 denotes equal contribution)

Introduction

Open-world object detection (OWOD) is a challenging computer vision problem, where the task is to detect a known set of object categories while simultaneously identifying unknown objects. Additionally, the model must incrementally learn new classes that become known in the next training episodes. Distinct from standard object detection, the OWOD setting poses significant challenges for generating quality candidate proposals on potentially unknown objects, separating the unknown objects from the background and detecting diverse unknown objects. Here, we introduce a novel end-to-end transformer-based framework, OW-DETR, for open-world object detection. The proposed OW-DETR comprises three dedicated components namely, attention-driven pseudo-labeling, novelty classification and objectness scoring to explicitly address the aforementioned OWOD challenges. Our OW-DETR explicitly encodes multi-scale contextual information, possesses less inductive bias, enables knowledge transfer from known classes to the unknown class and can better discriminate between unknown objects and background. Comprehensive experiments are performed on two benchmarks: MS-COCO and PASCAL VOC. The extensive ablations reveal the merits of our proposed contributions. Further, our model outperforms the recently introduced OWOD approach, ORE, with absolute gains ranging from $1.8%$ to $3.3%$ in terms of unknown recall on MS-COCO. In the case of incremental object detection, OW-DETR outperforms the state-of-the-art for all settings on PASCAL VOC.


Installation

Requirements

We have trained and tested our models on Ubuntu 16.0, CUDA 10.2, GCC 5.4, Python 3.7

conda create -n owdetr python=3.7 pip
conda activate owdetr
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt

Compiling CUDA operators

cd ./models/ops
sh ./make.sh
# unit test (should see all checking is True)
python test.py

Dataset & Results

OWOD proposed splits



The splits are present inside data/VOC2007/OWOD/ImageSets/ folder. The remaining dataset can be downloaded using this link

The files should be organized in the following structure:

OW-DETR/
└── data/
    └── VOC2007/
        └── OWOD/
        	├── JPEGImages
        	├── ImageSets
        	└── Annotations

Results

Task1 Task2 Task3 Task4
Method U-Recall mAP U-Recall mAP U-Recall mAP mAP
ORE-EBUI 4.9 56.0 2.9 39.4 3.9 29.7 25.3
OW-DETR 7.5 59.2 6.2 42.9 5.7 30.8 27.8

Our proposed splits



The splits are present inside data/VOC2007/OWDETR/ImageSets/ folder. The remaining dataset can be downloaded using this link

The files should be organized in the following structure:

OW-DETR/
└── data/
    └── VOC2007/
        └── OWDETR/
        	├── JPEGImages
        	├── ImageSets
        	└── Annotations

Currently, Dataloader and Evaluator followed for OW-DETR is in VOC format.

Results

Task1 Task2 Task3 Task4
Method U-Recall mAP U-Recall mAP U-Recall mAP mAP
ORE-EBUI 1.5 61.4 3.9 40.6 3.6 33.7 31.8
OW-DETR 5.7 71.5 6.2 43.8 6.9 38.5 33.1

Training

Training on single node

To train OW-DETR on a single node with 8 GPUS, run

./run.sh

Training on slurm cluster

To train OW-DETR on a slurm cluster having 2 nodes with 8 GPUS each, run

sbatch run_slurm.sh

Evaluation

For reproducing any of the above mentioned results please run the run_eval.sh file and add pretrained weights accordingly.

Note: For more training and evaluation details please check the Deformable DETR reposistory.

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

Citation

If you use OW-DETR, please consider citing:

@inproceedings{gupta2021ow,
    title={OW-DETR: Open-world Detection Transformer}, 
    author={Gupta, Akshita and Narayan, Sanath and Joseph, KJ and 
    Khan, Salman and Khan, Fahad Shahbaz and Shah, Mubarak},
    booktitle={CVPR},
    year={2022}
}

Contact

Should you have any question, please contact 📧 [email protected]

Acknowledgments:

OW-DETR builds on previous works code base such as Deformable DETR, Detreg, and OWOD. If you found OW-DETR useful please consider citing these works as well.

Owner
Akshita Gupta
Sem @IITR | Outreachy @mozilla | Research Engineer @IIAI
Akshita Gupta
GUI for TOAD-GAN, a PCG-ML algorithm for Token-based Super Mario Bros. Levels.

If you are using this code in your own project, please cite our paper: @inproceedings{awiszus2020toadgan, title={TOAD-GAN: Coherent Style Level Gene

Maren A. 13 Dec 14, 2022
Code for "FGR: Frustum-Aware Geometric Reasoning for Weakly Supervised 3D Vehicle Detection", ICRA 2021

FGR This repository contains the python implementation for paper "FGR: Frustum-Aware Geometric Reasoning for Weakly Supervised 3D Vehicle Detection"(I

Yi Wei 31 Dec 08, 2022
Implementation of the famous Image Manipulation\Forgery Detector "ManTraNet" in Pytorch

Who has never met a forged picture on the web ? No one ! Everyday we are constantly facing fake pictures touched up in Photoshop but it is not always

Rony Abecidan 77 Dec 16, 2022
This is an official implementation for "Self-Supervised Learning with Swin Transformers".

Self-Supervised Learning with Vision Transformers By Zhenda Xie*, Yutong Lin*, Zhuliang Yao, Zheng Zhang, Qi Dai, Yue Cao and Han Hu This repo is the

Swin Transformer 529 Jan 02, 2023
Fibonacci Method Gradient Descent

An implementation of the Fibonacci method for gradient descent, featuring a TKinter GUI for inputting the function / parameters to be examined and a matplotlib plot of the function and results.

Emma 1 Jan 28, 2022
Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks

Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks (SDPoint) This repository contains the cod

Jason Kuen 17 Jul 04, 2022
codes for IKM (arXiv2021, Submitted to IEEE Trans)

Image-specific Convolutional Kernel Modulation for Single Image Super-resolution This repository is for IKM introduced in the following paper Yuanfei

Yuanfei Huang 9 Dec 29, 2022
Hierarchical Memory Matching Network for Video Object Segmentation (ICCV 2021)

Hierarchical Memory Matching Network for Video Object Segmentation Hongje Seong, Seoung Wug Oh, Joon-Young Lee, Seongwon Lee, Suhyeon Lee, Euntai Kim

Hongje Seong 72 Dec 14, 2022
Object Detection with YOLOv3

Object Detection with YOLOv3 Bu projede YOLOv3-608 modeli kullanılmıştır. Requirements Python 3.8 OpenCV Numpy Documentation Yolo ile ilgili detaylı b

Ayşe Konuş 0 Mar 27, 2022
Code release of paper Improving neural implicit surfaces geometry with patch warping

NeuralWarp: Improving neural implicit surfaces geometry with patch warping Project page | Paper Code release of paper Improving neural implicit surfac

François Darmon 167 Dec 30, 2022
This package contains deep learning models and related scripts for RoseTTAFold

RoseTTAFold This package contains deep learning models and related scripts to run RoseTTAFold This repository is the official implementation of RoseTT

1.6k Jan 03, 2023
ERISHA is a mulitilingual multispeaker expressive speech synthesis framework. It can transfer the expressivity to the speaker's voice for which no expressive speech corpus is available.

ERISHA: Multilingual Multispeaker Expressive Text-to-Speech Library ERISHA is a multilingual multispeaker expressive speech synthesis framework. It ca

Ajinkya Kulkarni 43 Nov 27, 2022
Official Implementation of "LUNAR: Unifying Local Outlier Detection Methods via Graph Neural Networks"

LUNAR Official Implementation of "LUNAR: Unifying Local Outlier Detection Methods via Graph Neural Networks" Adam Goodge, Bryan Hooi, Ng See Kiong and

Adam Goodge 25 Dec 28, 2022
Code for ACL2021 paper Consistency Regularization for Cross-Lingual Fine-Tuning.

xTune Code for ACL2021 paper Consistency Regularization for Cross-Lingual Fine-Tuning. Environment DockerFile: dancingsoul/pytorch:xTune Install the f

Bo Zheng 42 Dec 09, 2022
Chinese Mandarin tts text-to-speech 中文 (普通话) 语音 合成 , by fastspeech 2 , implemented in pytorch, using waveglow as vocoder,

Chinese mandarin text to speech based on Fastspeech2 and Unet This is a modification and adpation of fastspeech2 to mandrin(普通话). Many modifications t

291 Jan 02, 2023
Python scripts using the Mediapipe models for Halloween.

Mediapipe-Halloween-Examples Python scripts using the Mediapipe models for Halloween. WHY Mainly for fun. But this repository also includes useful exa

Ibai Gorordo 23 Jan 06, 2023
Analysis of Antarctica sequencing samples contaminated with SARS-CoV-2

Analysis of SARS-CoV-2 reads in sequencing of 2018-2019 Antarctica samples in PRJNA692319 The samples analyzed here are described in this preprint, wh

Jesse Bloom 4 Feb 09, 2022
Deep Learning Slide Captcha

滑动验证码深度学习识别 本项目使用深度学习 YOLOV3 模型来识别滑动验证码缺口,基于 https://github.com/eriklindernoren/PyTorch-YOLOv3 修改。 只需要几百张缺口标注图片即可训练出精度高的识别模型,识别效果样例: 克隆项目 运行命令: git cl

Python3WebSpider 55 Jan 02, 2023
SOTA model in CIFAR10

A PyTorch Implementation of CIFAR Tricks 调研了CIFAR10数据集上各种trick,数据增强,正则化方法,并进行了实现。目前项目告一段落,如果有更好的想法,或者希望一起维护这个项目可以提issue或者在我的主页找到我的联系方式。 0. Requirement

PJDong 58 Dec 21, 2022
Random Walk Graph Neural Networks

Random Walk Graph Neural Networks This repository is the official implementation of Random Walk Graph Neural Networks. Requirements Code is written in

Giannis Nikolentzos 38 Jan 02, 2023