Augmenting Anchors by the Detector Itself

Related tags

Computer Visionaadi
Overview

Augmenting Anchors by the Detector Itself

Introduction

It is difficult to determine the scale and aspect ratio of anchors for anchor-based object detection methods. Current state-of-the-art object detectors either determine anchor parameters according to objects' shape and scale in a dataset, or avoid this problem by utilizing anchor-free method. In this paper, we propose a gradient-free anchor augmentation method named AADI, which means Augmenting Anchors by the Detector Itself. AADI is not an anchor-free method, but it converts the scale and aspect ratio of anchors from a continuous space to a discrete space, which greatly alleviates the problem of anchors' designation. Furthermore, AADI does not add any parameters or hyper-parameters, which is beneficial for future research and downstream tasks. Extensive experiments on COCO dataset show that AADI has obvious advantages for both two-stage and single-stage methods, specifically, AADI achieves at least 2.1 AP improvements on Faster R-CNN and 1.6 AP improvements on RetinaNet, using ResNet-50 model. We hope that this simple and cost-efficient method can be widely used in object detection.

  • For RPN

    • Baseline

      Num anchors AR100 AR1000 ARs ARm ARl
      1 45.5 55.6 31.4 52.8 60.0
      3 45.7 58.0 31.4 52.7 61.1
    • Ablation Study

      dilation Anchor Guided AR100 AR1000 ARs ARm ARl
      1 52.8 60.6 40.2 60.8 63.6
      2 54.8 64.7 39.0 63.1 70.6
      2 56.3 66.7 39.5 64.9 73.4
      3 53.7 64.0 35.4 62.1 73.9
      3 55.6 67.6 36.1 64.3 77.6
      4 52.2 60.5 30.9 61.3 76.6
      4 54.4 65.5 33.0 63.7 78.9
  • For RetinaNet

    • Ablation Study

      AADI dilation AP AP50 AP75 APs APm APl
      1 38.2 58.4 41.1 24.3 42.2 48.5
      1 37.3 56.4 40.2 22.0 39.9 46.8
      2 39.8 57.5 43.5 22.1 43.5 50.6
      3 38.3 54.6 41.7 20.0 43.1 51.1
    • With IoU

      AP AP50 AP75 APs APm APl
      40.2 57.7 43.8 24.1 43.1 52.2
    • With 3x schedule (RetinaNet with giou, AADI with smooth l1)

      Model AP AP50 AP75 APs APm APl
      RetinaNet 39.6 59.3 42.2 24.9 43.3 50.7
      AADI-RetinaNet 41.4 59.3 45.2 24.8 44.9 54.0
  • For Faster R-CNN

    • Ablation Study

      AADI dilation AP AP50 AP75 APs APm APl FPS
      1(3 anchors) 37.9 58.8 41.1 22.4 41.1 49.1 26.3
      2 40.3 59.3 44.3 24.2 43.3 52.2 22.4
      3 40.8 59.5 45.0 24.0 44.6 53.1 22.4
      4 40.5 58.7 44.6 23.2 44.8 52.7 22.3
    • 3x schedule

      Backbone AP AP50 AP75 APs APm APl FPS
      R-50 FPN 42.5 61.2 46.5 25.3 46.2 55.5 22.6
      DCN-50 FPN 44.1 63.1 48.2 28.3 46.9 58.4 20.1
      R-101 FPN 44.5 63.2 48.7 26.9 48.3 57.4 17.4
  • Detectron2

Detectron2 is Facebook AI Research's next generation library that provides state-of-the-art detection and segmentation algorithms. It is the successor of Detectron and maskrcnn-benchmark. It supports a number of computer vision research projects and production applications in Facebook.

Installation

See installation instructions.

Getting Started

See Getting Started with Detectron2, and the Colab Notebook to learn about basic usage.

Learn more at our documentation.

Citing Detectron2

@misc{wu2019detectron2,
  author =       {Yuxin Wu and Alexander Kirillov and Francisco Massa and
                  Wan-Yen Lo and Ross Girshick},
  title =        {Detectron2},
  howpublished = {\url{https://github.com/facebookresearch/detectron2}},
  year =         {2019}
}

@misc{wan2021augmenting,
      title={Augmenting Anchors by the Detector Itself}, 
      author={Xiaopei Wan and Shengjie Chen and Yujiu Yang and Zhenhua Guo and Fangbo Tao},
      year={2021},
      eprint={2105.14086},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
document image degradation

ocrodeg The ocrodeg package is a small Python library implementing document image degradation for data augmentation for handwriting recognition and OC

NVIDIA Research Projects 134 Nov 18, 2022
Morphological edge detection or object's boundary detection using erosion and dialation in OpenCV python

Morphologycal-edge-detection-using-erosion-and-dialation the task is to detect object boundary using erosion or dialation . Here, use the kernel or st

Tamzid hasan 3 Nov 25, 2022
LEARN OPENCV IN 3 HOURS USING PYTHON - INCLUDING EXAMPLE PROJECTS

LEARN OPENCV IN 3 HOURS USING PYTHON - INCLUDING EXAMPLE PROJECTS

Murtaza Hassan 815 Dec 29, 2022
Make OpenCV camera loops less of a chore by skipping the boilerplate and getting right to the interesting stuff

camloop Forget the boilerplate from OpenCV camera loops and get to coding the interesting stuff Table of Contents Usage Install Quickstart More advanc

Gabriel Lefundes 9 Nov 12, 2021
Pixie - A full-featured 2D graphics library for Python

Pixie - A full-featured 2D graphics library for Python Pixie is a 2D graphics library similar to Cairo and Skia. pip install pixie-python Features: Ty

treeform 65 Dec 30, 2022
This repository summarized computer vision theories.

This repository summarized computer vision theories.

3 Feb 04, 2022
The project is an official implementation of our paper "3D Human Pose Estimation with Spatial and Temporal Transformers".

3D Human Pose Estimation with Spatial and Temporal Transformers This repo is the official implementation for 3D Human Pose Estimation with Spatial and

Ce Zheng 363 Dec 28, 2022
Steve Tu 71 Dec 30, 2022
A bot that plays TFT using OCR. Keeps track of bench, board, items, and plays the user defined team comp.

NOTES: To ensure best results, make sure you are running this on a computer that has decent specs. 1920x1080 fullscreen is required in League, game mu

francis 125 Dec 30, 2022
Code release for our paper, "SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo"

SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo Thomas Kollar, Michael Laskey, Kevin Stone, Brijen Thananjeyan

68 Dec 14, 2022
SceneCollisionNet This repo contains the code for "Object Rearrangement Using Learned Implicit Collision Functions", an ICRA 2021 paper. For more info

SceneCollisionNet This repo contains the code for "Object Rearrangement Using Learned Implicit Collision Functions", an ICRA 2021 paper. For more info

NVIDIA Research Projects 31 Nov 22, 2022
Amazing 3D explosion animation using Pygame module.

3D Explosion Animation 💣 💥 🔥 Amazing explosion animation with Pygame. 💣 Explosion physics An Explosion instance is made of a set of Particle objec

Dylan Tintenfich 12 Mar 11, 2022
轻量级公式 OCR 小工具:一键识别各类公式图片,并转换为 LaTeX 格式

QC-Formula | 青尘公式 OCR 介绍 轻量级开源公式 OCR 小工具:一键识别公式图片,并转换为 LaTeX 格式。 支持从 电脑本地 导入公式图片;(后续版本将支持直接从网页导入图片) 公式图片支持 .png / .jpg / .bmp,大小为 4M 以内均可; 支持印刷体及手写体,前

青尘工作室 26 Jan 07, 2023
Camera Intrinsic Calibration and Hand-Eye Calibration in Pybullet

This repository is mainly for camera intrinsic calibration and hand-eye calibration. Synthetic experiments are conducted in PyBullet simulator. 1. Tes

CAI Junhao 7 Oct 03, 2022
Tool which allow you to detect and translate text.

Text detection and recognition This repository contains tool which allow to detect region with text and translate it one by one. Description Two pretr

Damian Panek 176 Nov 28, 2022
Introduction to image processing, most used and popular functions of OpenCV

👀 OpenCV 101 Introduction to image processing, most used and popular functions of OpenCV go here.

Vusal Ismayilov 3 Jul 02, 2022
An easy to use an (hopefully useful) captcha solution for pyTelegramBotAPI

pyTelegramBotCAPTCHA An easy to use and (hopefully useful) image CAPTCHA soltion for pyTelegramBotAPI. Installation: pip install pyTelegramBotCAPTCHA

29 Dec 26, 2022
Handwritten_Text_Recognition

Deep Learning framework for Line-level Handwritten Text Recognition Short presentation of our project Introduction Installation 2.a Install conda envi

24 Jul 15, 2022
Code for the AAAI 2018 publication "SEE: Towards Semi-Supervised End-to-End Scene Text Recognition"

SEE: Towards Semi-Supervised End-to-End Scene Text Recognition Code for the AAAI 2018 publication "SEE: Towards Semi-Supervised End-to-End Scene Text

Christian Bartz 572 Jan 05, 2023
TextBoxes++: A Single-Shot Oriented Scene Text Detector

TextBoxes++: A Single-Shot Oriented Scene Text Detector Introduction This is an application for scene text detection (TextBoxes++) and recognition (CR

Minghui Liao 930 Jan 04, 2023