PyTorch implementation of our ICCV paper DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection.

Related tags

Deep LearningDeFRCN
Overview

Introduction

This repo contains the official PyTorch implementation of our ICCV paper DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection.

Updates!!

  • 【2021/10/10】 We release the official PyTorch implementation of DeFRCN.
  • 【2021/08/20】 We have uploaded our paper (long version with supplementary material) on arxiv, review it for more details.

Quick Start

1. Check Requirements

  • Linux with Python >= 3.6
  • PyTorch >= 1.6 & torchvision that matches the PyTorch version.
  • CUDA 10.1, 10.2
  • GCC >= 4.9

2. Build DeFRCN

  • Clone Code
    git clone https://github.com/er-muyue/DeFRCN.git
    cd DeFRCN
    
  • Create a virtual environment (optional)
    virtualenv defrcn
    cd /path/to/venv/defrcn
    source ./bin/activate
    
  • Install PyTorch 1.6.0 with CUDA 10.1
    pip3 install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
  • Install Detectron2
    python3 -m pip install detectron2==0.3 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.6/index.html
    
    • If you use other version of PyTorch/CUDA, check the latest version of Detectron2 in this page: Detectron2.
    • Sorry for that I don’t have enough time to test on more versions, if you run into problems with other versions, please let me know.
  • Install other requirements.
    python3 -m pip install -r requirements.txt
    

3. Prepare Data and Weights

  • Data Preparation
    • We evaluate our models on two datasets for both FSOD and G-FSOD settings:

      Dataset Size GoogleDrive BaiduYun Note
      VOC2007 0.8G download download -
      VOC2012 3.5G download download -
      vocsplit <1M download download refer from TFA
      COCO ~19G - - download from offical
      cocosplit 174M download download refer from TFA
    • Unzip the downloaded data-source to datasets and put it into your project directory:

        ...
        datasets
          | -- coco (trainval2014/*.jpg, val2014/*.jpg, annotations/*.json)
          | -- cocosplit
          | -- VOC2007
          | -- VOC2012
          | -- vocsplit
        defrcn
        tools
        ...
      
  • Weights Preparation
    • We use the imagenet pretrain weights to initialize our model. Download the same models from here: GoogleDrive BaiduYun
    • The extract code for all BaiduYun link is 0000

4. Training and Evaluation

For ease of training and evaluation over multiple runs, we integrate the whole pipeline of few-shot object detection into one script run_*.sh, including base pre-training and novel-finetuning (both FSOD and G-FSOD).

  • To reproduce the results on VOC, EXP_NAME can be any string (e.g defrcn, or something) and SPLIT_ID must be 1 or 2 or 3 (we consider 3 random splits like other papers).
    bash run_voc.sh EXP_NAME SPLIT_ID (1, 2 or 3)
    
  • To reproduce the results on COCO, EXP_NAME can be any string (e.g defrcn, or something)
    bash run_coco.sh EXP_NAME
    
  • Please read the details of few-shot object detection pipeline in run_*.sh, you need change IMAGENET_PRETRAIN* to your path.

Results on COCO Benchmark

  • Few-shot Object Detection

    Method mAPnovel
    Shot 1 2 3 5 10 30
    FRCN-ft 1.0* 1.8* 2.8* 4.0* 6.5 11.1
    FSRW - - - - 5.6 9.1
    MetaDet - - - - 7.1 11.3
    MetaR-CNN - - - - 8.7 12.4
    TFA 4.4* 5.4* 6.0* 7.7* 10.0 13.7
    MPSR 5.1* 6.7* 7.4* 8.7* 9.8 14.1
    FSDetView 4.5 6.6 7.2 10.7 12.5 14.7
    DeFRCN (Our Paper) 9.3 12.9 14.8 16.1 18.5 22.6
    DeFRCN (This Repo) 9.7 13.1 14.5 15.6 18.4 22.6
  • Generalized Few-shot Object Detection

    Method mAPnovel
    Shot 1 2 3 5 10 30
    FRCN-ft 1.7 3.1 3.7 4.6 5.5 7.4
    TFA 1.9 3.9 5.1 7 9.1 12.1
    FSDetView 3.2 4.9 6.7 8.1 10.7 15.9
    DeFRCN (Our Paper) 4.8 8.5 10.7 13.6 16.8 21.2
    DeFRCN (This Repo) 4.8 8.5 10.7 13.5 16.7 21.0
  • * indicates that the results are reproduced by us with their source code.
  • It's normal to observe -0.3~+0.3AP noise between your results and this repo.
  • The results of mAPbase and mAPall for G-FSOD are list here GoogleDrive, BaiduYun.
  • If you have any problem of above results in this repo, you can download configs and train logs from GoogleDrive, BaiduYun.

Results on VOC Benchmark

  • Few-shot Object Detection

    Method Split-1 Split-2 Split-3
    Shot 1 2 3 5 10 1 2 3 5 10 1 2 3 5 10
    YOLO-ft 6.6 10.7 12.5 24.8 38.6 12.5 4.2 11.6 16.1 33.9 13.0 15.9 15.0 32.2 38.4
    FRCN-ft 13.8 19.6 32.8 41.5 45.6 7.9 15.3 26.2 31.6 39.1 9.8 11.3 19.1 35.0 45.1
    FSRW 14.8 15.5 26.7 33.9 47.2 15.7 15.2 22.7 30.1 40.5 21.3 25.6 28.4 42.8 45.9
    MetaDet 18.9 20.6 30.2 36.8 49.6 21.8 23.1 27.8 31.7 43.0 20.6 23.9 29.4 43.9 44.1
    MetaR-CNN 19.9 25.5 35.0 45.7 51.5 10.4 19.4 29.6 34.8 45.4 14.3 18.2 27.5 41.2 48.1
    TFA 39.8 36.1 44.7 55.7 56.0 23.5 26.9 34.1 35.1 39.1 30.8 34.8 42.8 49.5 49.8
    MPSR 41.7 - 51.4 55.2 61.8 24.4 - 39.2 39.9 47.8 35.6 - 42.3 48.0 49.7
    DeFRCN (Our Paper) 53.6 57.5 61.5 64.1 60.8 30.1 38.1 47.0 53.3 47.9 48.4 50.9 52.3 54.9 57.4
    DeFRCN (This Repo) 55.1 57.4 61.1 64.6 61.5 32.1 40.5 47.9 52.9 47.5 48.9 51.9 52.3 55.7 59.0
  • Generalized Few-shot Object Detection

    Method Split-1 Split-2 Split-3
    Shot 1 2 3 5 10 1 2 3 5 10 1 2 3 5 10
    FRCN-ft 9.9 15.6 21.6 28.0 52.0 9.4 13.8 17.4 21.9 39.7 8.1 13.9 19 23.9 44.6
    FSRW 14.2 23.6 29.8 36.5 35.6 12.3 19.6 25.1 31.4 29.8 12.5 21.3 26.8 33.8 31.0
    TFA 25.3 36.4 42.1 47.9 52.8 18.3 27.5 30.9 34.1 39.5 17.9 27.2 34.3 40.8 45.6
    FSDetView 24.2 35.3 42.2 49.1 57.4 21.6 24.6 31.9 37.0 45.7 21.2 30.0 37.2 43.8 49.6
    DeFRCN (Our Paper) 40.2 53.6 58.2 63.6 66.5 29.5 39.7 43.4 48.1 52.8 35.0 38.3 52.9 57.7 60.8
    DeFRCN (This Repo) 43.8 57.5 61.4 65.3 67.0 31.5 40.9 45.6 50.1 52.9 38.2 50.9 54.1 59.2 61.9
  • Note that we change the λGDL-RCNN for VOC to 0.001 (0.01 in paper) and get better performance, check the configs for more details.

  • The results of mAPbase and mAPall for G-FSOD are list here GoogleDrive, BaiduYun.

  • If you have any problem of above results in this repo, you can download configs and logs from GoogleDrive, BaiduYun.

Acknowledgement

This repo is developed based on TFA and Detectron2. Please check them for more details and features.

Citing

If you use this work in your research or wish to refer to the baseline results published here, please use the following BibTeX entries:

@inproceedings{qiao2021defrcn,
  title={DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection},
  author={Qiao, Limeng and Zhao, Yuxuan and Li, Zhiyuan and Qiu, Xi and Wu, Jianan and Zhang, Chi},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={8681--8690},
  year={2021}
}
Hand Gesture Volume Control | Open CV | Computer Vision

Gesture Volume Control Hand Gesture Volume Control | Open CV | Computer Vision Use gesture control to change the volume of a computer. First we look i

Jhenil Parihar 3 Jun 15, 2022
T2F: text to face generation using Deep Learning

⭐ [NEW] ⭐ T2F - 2.0 Teaser (coming soon ...) Please note that all the faces in the above samples are generated ones. The T2F 2.0 will be using MSG-GAN

Animesh Karnewar 533 Dec 22, 2022
A foreign language learning aid using a neural network to predict probability of translating foreign words

Langy Langy is a reading-focused foreign language learning aid orientated towards young children. Reading is an activity that every child knows. It is

Shona Lowden 6 Nov 17, 2021
Traductor de lengua de señas al español basado en Python con Opencv y MedaiPipe

Traductor de señas Traductor de lengua de señas al español basado en Python con Opencv y MedaiPipe Requerimientos 🔧 Python 3.8 o inferior para evitar

Jahaziel Hernandez Hoyos 3 Nov 12, 2022
🧑‍🔬 verify your TEAL program by experiment and observation

Graviton - Testing TEAL with Dry Runs Tutorial Local Installation The following instructions assume that you have make available in your local environ

Algorand 18 Jan 03, 2023
Position detection system of mobile robot in the warehouse enviroment

Autonomous-Forklift-System About | GUI | Tests | Starting | License | Author | 🎯 About An application that run the autonomous forklift paletization a

Kamil Goś 1 Nov 24, 2021
Genetic Algorithm, Particle Swarm Optimization, Simulated Annealing, Ant Colony Optimization Algorithm,Immune Algorithm, Artificial Fish Swarm Algorithm, Differential Evolution and TSP(Traveling salesman)

scikit-opt Swarm Intelligence in Python (Genetic Algorithm, Particle Swarm Optimization, Simulated Annealing, Ant Colony Algorithm, Immune Algorithm,A

郭飞 3.7k Jan 03, 2023
Unofficial PyTorch Implementation of "DOLG: Single-Stage Image Retrieval with Deep Orthogonal Fusion of Local and Global Features"

Pytorch Implementation of Deep Orthogonal Fusion of Local and Global Features (DOLG) This is the unofficial PyTorch Implementation of "DOLG: Single-St

DK 96 Jan 06, 2023
🎁 3,000,000+ Unsplash images made available for research and machine learning

The Unsplash Dataset The Unsplash Dataset is made up of over 250,000+ contributing global photographers and data sourced from hundreds of millions of

Unsplash 2k Jan 03, 2023
Spatially-Adaptive Pixelwise Networks for Fast Image Translation, CVPR 2021

Image Translation with ASAPNets Spatially-Adaptive Pixelwise Networks for Fast Image Translation, CVPR 2021 Webpage | Paper | Video Installation insta

Tamar Rott Shaham 100 Dec 28, 2022
The world's largest toxicity dataset.

The Toxicity Dataset by Surge AI Saving the internet is fun. Combing through thousands of online comments to build a toxicity dataset isn't. That's wh

Surge AI 134 Dec 19, 2022
Answering Open-Domain Questions of Varying Reasoning Steps from Text

This repository contains the authors' implementation of the Iterative Retriever, Reader, and Reranker (IRRR) model in the EMNLP 2021 paper "Answering Open-Domain Questions of Varying Reasoning Steps

26 Dec 22, 2022
i-RevNet Pytorch Code

i-RevNet: Deep Invertible Networks Pytorch implementation of i-RevNets. i-RevNets define a family of fully invertible deep networks, built from a succ

Jörn Jacobsen 378 Dec 06, 2022
The Submission for SIMMC 2.0 Challenge 2021

The Submission for SIMMC 2.0 Challenge 2021 challenge website Requirements python 3.8.8 pytorch 1.8.1 transformers 4.8.2 apex for multi-gpu nltk Prepr

5 Jul 26, 2022
基于深度强化学习的原神自动钓鱼AI

原神自动钓鱼AI由YOLOX, DQN两部分模型组成。使用迁移学习,半监督学习进行训练。 模型也包含一些使用opencv等传统数字图像处理方法实现的不可学习部分。

4.2k Jan 01, 2023
Training Very Deep Neural Networks Without Skip-Connections

DiracNets v2 update (January 2018): The code was updated for DiracNets-v2 in which we removed NCReLU by adding per-channel a and b multipliers without

Sergey Zagoruyko 585 Oct 12, 2022
Text Generation by Learning from Demonstrations

Text Generation by Learning from Demonstrations The README was last updated on March 7, 2021. The repo is based on fairseq (v0.9.?). Paper arXiv Prere

38 Oct 21, 2022
The Official Repository for "Generalized OOD Detection: A Survey"

Generalized Out-of-Distribution Detection: A Survey 1. Overview This repository is with our survey paper: Title: Generalized Out-of-Distribution Detec

Jingkang Yang 338 Jan 03, 2023
[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers

TubeDETR: Spatio-Temporal Video Grounding with Transformers Website • STVG Demo • Paper This repository provides the code for our paper. This includes

Antoine Yang 108 Dec 27, 2022
Self-labelling via simultaneous clustering and representation learning. (ICLR 2020)

Self-labelling via simultaneous clustering and representation learning 🆗 🆗 🎉 NEW models (20th August 2020): Added standard SeLa pretrained torchvis

Yuki M. Asano 469 Jan 02, 2023