LQM - Improving Object Detection by Estimating Bounding Box Quality Accurately

Related tags

Deep LearningLQM
Overview

Improving Object Detection by Estimating Bounding Box Quality Accurately

Abstract

Object detection aims to locate and classify object instances in images. Therefore, the object detection model is generally implemented with two parallel branches to optimize localization and classification. After training the detection model, we should select the best bounding box of each class among a number of estimations for reliable inference. Generally, NMS (Non Maximum Suppression) is operated to suppress low-quality bounding boxes by referring to classification scores or center-ness scores. However, since the quality of bounding boxes is not considered, the low-quality bounding boxes can be accidentally selected as a positive bounding box for the corresponding class. We believe that this misalignment between two parallel tasks causes degrading of the object detection performance. In this paper, we propose a method to estimate bounding boxes' quality using four-directional Gaussian quality modeling, which leads the consistent results between two parallel branches. Extensive experiments on the MS COCO benchmark show that the proposed method consistently outperforms the baseline (FCOS). Eventually, our best model offers the state-of-the-art performance by achieving 48.9% in AP. We also confirm the efficiency of the method by comparing the number of parameters and computational overhead.

Overall Architecture

Implementation Details

We implement our detection model on top of MMDetection (v2.6), an open source object detection toolbox. If not specified separately, the default settings of FCOS implementation are not changed. We train and validate our network on four RTX TITAN GPUs in the environment of Pytorch v1.6 and CUDA v10.2.

Please see GETTING_STARTED.md for the basic usage of MMDetection.

Installation


  1. Clone the this repository.

    git clone https://github.com/POSTECH-IMLAB/LQM.git
    cd LQM
  2. Create a conda virtural environment and install dependencies.

    conda env create -f environment.yml
  3. Activate conda environment

    conda activate lqm
  4. Install build requirements and then install MMDetection.

    pip install --force-reinstall mmcv-full==1.1.5 -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.6.0/index.html
    pip install -v -e .

Preparing MS COCO dataset


bash download_coco.sh

Preparing Pre-trained model weights


bash download_weights.sh

Train


# assume that you are under the root directory of this project,
# and you have activated your virtual environment if needed.
# and with COCO dataset in 'data/coco/'

./tools/dist_train.sh configs/uncertainty_guide/uncertainty_guide_r50_fpn_1x.py 4 --validate

Inference


./tools/dist_test.sh configs/uncertainty_guide/uncertainty_guide_r50_fpn_1x.py work_dirs/uncertainty_guide_r50_fpn_1x/epoch_12.pth 4 --eval bbox

Image demo using pretrained model weight


# Result will be saved under the demo directory of this project (detection_result.jpg)
# config, checkpoint, source image path are needed (If you need pre-trained weights, you can download them from provided google drive link)
# score threshold is optional

python demo/LQM_image_demo.py --config configs/uncertainty_guide/uncertainty_guide_r50_fpn_1x.py --checkpoint work_dirs/pretrained/LQM_r50_fpn_1x.pth --img data/coco/test2017/000000011245.jpg --score-thr 0.3

Webcam demo using pretrained model weight


# config, checkpoint path are needed (If you need pre-trained weights, you can download them from provided google drive link)
# score threshold is optional

python demo/webcam_demo.py configs/uncertainty_guide/uncertainty_guide_r50_fpn_1x.py work_dirs/pretrained/LQM_r50_fpn_1x.pth

Models


For your convenience, we provide the following trained models. All models are trained with 16 images in a mini-batch with 4 GPUs.

Model Multi-scale training AP (minival) Link
LQM_R50_FPN_1x No 40.0 Google
LQM_R101_FPN_2x Yes 44.8 Google
LQM_R101_dcnv2_FPN_2x Yes 47.4 Google
LQM_X101_FPN_2x Yes 47.2 Google
LQM_X101_dcnv2_FPN_2x Yes 48.9 Google
Owner
IM Lab., POSTECH
IM Lab., POSTECH
Instance-conditional Knowledge Distillation for Object Detection

Instance-conditional Knowledge Distillation for Object Detection This is a MegEngine implementation of the paper "Instance-conditional Knowledge Disti

MEGVII Research 47 Nov 17, 2022
Project of 'TBEFN: A Two-branch Exposure-fusion Network for Low-light Image Enhancement '

TBEFN: A Two-branch Exposure-fusion Network for Low-light Image Enhancement Codes for TMM20 paper "TBEFN: A Two-branch Exposure-fusion Network for Low

KUN LU 31 Nov 06, 2022
Adversarial Reweighting for Partial Domain Adaptation

Adversarial Reweighting for Partial Domain Adaptation Code for paper "Xiang Gu, Xi Yu, Yan Yang, Jian Sun, Zongben Xu, Adversarial Reweighting for Par

12 Dec 01, 2022
An implementation of chunked, compressed, N-dimensional arrays for Python.

Zarr Latest Release Package Status License Build Status Coverage Downloads Gitter Citation What is it? Zarr is a Python package providing an implement

Zarr Developers 1.1k Dec 30, 2022
Code for "ShineOn: Illuminating Design Choices for Practical Video-based Virtual Clothing Try-on", accepted at WACV 2021 Generation of Human Behavior Workshop.

ShineOn: Illuminating Design Choices for Practical Video-based Virtual Clothing Try-on [ Paper ] [ Project Page ] This repository contains the code fo

Andrew Jong 97 Dec 13, 2022
Geneva is an artificial intelligence tool that defeats censorship by exploiting bugs in censors

Geneva is an artificial intelligence tool that defeats censorship by exploiting bugs in censors

Kevin Bock 1.5k Jan 06, 2023
A Lightweight Experiment & Resource Monitoring Tool πŸ“Ί

Lightweight Experiment & Resource Monitoring πŸ“Ί "Did I already run this experiment before? How many resources are currently available on my cluster?"

170 Dec 28, 2022
Official repository for Hierarchical Opacity Propagation for Image Matting

HOP-Matting Official repository for Hierarchical Opacity Propagation for Image Matting 🚧 🚧 🚧 Under Construction 🚧 🚧 🚧 🚧 🚧 🚧   Coming Soon   

Li Yaoyi 54 Dec 30, 2021
Code for our paper at ECCV 2020: Post-Training Piecewise Linear Quantization for Deep Neural Networks

PWLQ Updates 2020/07/16 - We are working on getting permission from our institution to release our source code. We will release it once we are granted

54 Dec 15, 2022
Code for our paper "Sematic Representation for Dialogue Modeling" in ACL2021

AMR-Dialogue An implementation for paper "Semantic Representation for Dialogue Modeling". You may find our paper here. Requirements python 3.6 pytorch

xfbai 45 Dec 26, 2022
Patch2Pix: Epipolar-Guided Pixel-Level Correspondences [CVPR2021]

Patch2Pix for Accurate Image Correspondence Estimation This repository contains the Pytorch implementation of our paper accepted at CVPR2021: Patch2Pi

Qunjie Zhou 199 Nov 29, 2022
This project contains an implemented version of Face Detection using OpenCV and Mediapipe. This is a code snippet and can be used in projects.

Live-Face-Detection Project Description: In this project, we will be using the live video feed from the camera to detect Faces. It will also detect so

Hassan Shahzad 3 Oct 02, 2021
Translation-equivariant Image Quantizer for Bi-directional Image-Text Generation

Translation-equivariant Image Quantizer for Bi-directional Image-Text Generation Woncheol Shin1, Gyubok Lee1, Jiyoung Lee1, Joonseok Lee2,3, Edward Ch

Woncheol Shin 7 Sep 26, 2022
Simulating an AI playing 2048 using the Expectimax algorithm

2048-expectimax Simulating an AI playing 2048 using the Expectimax algorithm The base game engine uses code from here. The AI player is modeled as a m

Subha Ramesh 2 Jan 31, 2022
PED: DETR for Crowd Pedestrian Detection

PED: DETR for Crowd Pedestrian Detection Code for PED: DETR For (Crowd) Pedestrian Detection Paper PED: DETR for Crowd Pedestrian Detection Installati

36 Sep 13, 2022
Elegy is a framework-agnostic Trainer interface for the Jax ecosystem.

Elegy Elegy is a framework-agnostic Trainer interface for the Jax ecosystem. Main Features Easy-to-use: Elegy provides a Keras-like high-level API tha

435 Dec 30, 2022
Face Mask Detector by live camera using tensorflow-keras, openCV and Python

Face Mask Detector 😷 by Live Camera Detecting masked or unmasked faces by live camera with percentange of mask occupation About Project: This an Arti

Karan Shingde 2 Apr 04, 2022
Open Source Light Field Toolbox for Super-Resolution

BasicLFSR BasicLFSR is an open-source and easy-to-use Light Field (LF) image Super-Ressolution (SR) toolbox based on PyTorch, including a collection o

Squidward 50 Nov 18, 2022
Official implementation for CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection

Adaptive Class Suppression Loss for Long-Tail Object Detection This repo is the official implementation for CVPR 2021 paper: Adaptive Class Suppressio

CASIA-IVA-Lab 67 Dec 04, 2022
Pytorch implementation of the paper Improving Text-to-Image Synthesis Using Contrastive Learning

T2I_CL This is the official Pytorch implementation of the paper Improving Text-to-Image Synthesis Using Contrastive Learning Requirements Linux Python

42 Dec 31, 2022