[NeurIPS 2021 Spotlight] Aligning Pretraining for Detection via Object-Level Contrastive Learning

Overview

SoCo

[NeurIPS 2021 Spotlight] Aligning Pretraining for Detection via Object-Level Contrastive Learning

By Fangyun Wei*, Yue Gao*, Zhirong Wu, Han Hu, Stephen Lin.

* Equal contribution.

Introduction

Image-level contrastive representation learning has proven to be highly effective as a generic model for transfer learning. Such generality for transfer learning, however, sacrifices specificity if we are interested in a certain downstream task. We argue that this could be sub-optimal and thus advocate a design principle which encourages alignment between the self-supervised pretext task and the downstream task. In this paper, we follow this principle with a pretraining method specifically designed for the task of object detection. We attain alignment in the following three aspects:

  1. object-level representations are introduced via selective search bounding boxes as object proposals;
  2. the pretraining network architecture incorporates the same dedicated modules used in the detection pipeline (e.g. FPN);
  3. the pretraining is equipped with object detection properties such as object-level translation invariance and scale invariance. Our method, called Selective Object COntrastive learning (SoCo), achieves state-of-the-art results for transfer performance on COCO detection using a Mask R-CNN framework.

Architecture

Main results

The pretrained models will be available soon.

SoCo pre-trained models

Model Arch Epochs Scripts Download
SoCo ResNet50-C4 100 SoCo_C4_100ep
SoCo ResNet50-C4 400 SoCo_C4_400ep
SoCo ResNet50-FPN 100 SoCo_FPN_100ep
SoCo ResNet50-FPN 400 SoCo_FPN_400ep
SoCo* ResNet50-FPN 400 SoCo_FPN_Star_400ep

Results on COCO with MaskRCNN R50-FPN

Methods Epoch APbb APbb50 APbb75 APmk APmk50 APmk75 Detectron2 trained
Scratch - 31.0 49.5 33.2 28.5 46.8 30.4 --
Supervised 90 38.9 59.6 42.7 35.4 56.5 38.1 --
SoCo 100 42.3 62.5 46.5 37.6 59.1 40.5
SoCo 400 43.0 63.3 47.1 38.2 60.2 41.0
SoCo* 400 43.2 63.5 47.4 38.4 60.2 41.4

Results on COCO with MaskRCNN R50-C4

Methods Epoch APbb APbb50 APbb75 APmk APmk50 APmk75 Detectron2 trained
Scratch - 26.4 44.0 27.8 29.3 46.9 30.8 --
Supervised 90 38.2 58.2 41.2 33.3 54.7 35.2 --
SoCo 100 40.4 60.4 43.7 34.9 56.8 37.0
SoCo 400 40.9 60.9 44.3 35.3 57.5 37.3

Get started

Requirements

The Dockerfile is included, please refer to it.

Prepare data with Selective Search

  1. Generate Selective Search proposals
    python selective_search/generate_imagenet_ss_proposals.py
  2. Filter out not valid proposals with filter strategy
    python selective_search/filter_ss_proposals_json.py
  3. Post preprocessing for no proposals images
    python selective_search/filter_ss_proposals_json_post_no_prop.py

Pretrain with SoCo

Use SoCo FPN 100 epoch as example.

bash ./tools/SoCo_FPN_100ep.sh

Finetune detector

  1. Copy the folder detectron2_configs to the root folder of Detectron2
  2. Train the detectors with Detectron2

Citation

@article{wei2021aligning,
  title={Aligning Pretraining for Detection via Object-Level Contrastive Learning},
  author={Wei, Fangyun and Gao, Yue and Wu, Zhirong and Hu, Han and Lin, Stephen},
  journal={arXiv preprint arXiv:2106.02637},
  year={2021}
}
Owner
Yue Gao
Researcher at Microsoft Research Asia
Yue Gao
Open-Set Recognition: A Good Closed-Set Classifier is All You Need

Open-Set Recognition: A Good Closed-Set Classifier is All You Need Code for our paper: "Open-Set Recognition: A Good Closed-Set Classifier is All You

194 Jan 03, 2023
You Only 👀 One Sequence

You Only 👀 One Sequence TL;DR: We study the transferability of the vanilla ViT pre-trained on mid-sized ImageNet-1k to the more challenging COCO obje

Hust Visual Learning Team 666 Jan 03, 2023
Python code for the paper How to scale hyperparameters for quickshift image segmentation

How to scale hyperparameters for quickshift image segmentation Python code for the paper How to scale hyperparameters for quickshift image segmentatio

0 Jan 25, 2022
Understanding Hyperdimensional Computing for Parallel Single-Pass Learning

Understanding Hyperdimensional Computing for Parallel Single-Pass Learning Authors: Tao Yu* Yichi Zhang* Zhiru Zhang Christopher De Sa *: Equal Contri

Cornell RelaxML 4 Sep 08, 2022
PyTorch implementation of paper: AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer, ICCV 2021.

AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer [Paper] [PyTorch Implementation] [Paddle Implementation] Overview This reposit

148 Dec 30, 2022
This repo contains the code required to train the multivariate time-series Transformer.

Multi-Variate Time-Series Transformer This repo contains the code required to train the multivariate time-series Transformer. Download the data The No

Gregory Duthé 4 Nov 24, 2022
[NeurIPS 2021] ORL: Unsupervised Object-Level Representation Learning from Scene Images

Unsupervised Object-Level Representation Learning from Scene Images This repository contains the official PyTorch implementation of the ORL algorithm

Jiahao Xie 55 Dec 03, 2022
SparseInst: Sparse Instance Activation for Real-Time Instance Segmentation, CVPR 2022

SparseInst 🚀 A simple framework for real-time instance segmentation, CVPR 2022 by Tianheng Cheng, Xinggang Wang†, Shaoyu Chen, Wenqiang Zhang, Qian Z

Hust Visual Learning Team 458 Jan 05, 2023
GPU-accelerated Image Processing library using OpenCL

pyclesperanto pyclesperanto is a python package for clEsperanto - a multi-language framework for GPU-accelerated image processing. clEsperanto uses Op

17 Dec 25, 2022
Unofficial Implementation of Oboe (SIGCOMM'18').

Oboe-Reproduce This is the unofficial implementation of the paper "Oboe: Auto-tuning video ABR algorithms to network conditions, Zahaib Akhtar, Yun Se

Tianchi Huang 13 Nov 04, 2022
Code for the paper: Audio-Visual Scene Analysis with Self-Supervised Multisensory Features

[Paper] [Project page] This repository contains code for the paper: Andrew Owens, Alexei A. Efros. Audio-Visual Scene Analysis with Self-Supervised Mu

Andrew Owens 202 Dec 13, 2022
Fully Adaptive Bayesian Algorithm for Data Analysis (FABADA) is a new approach of noise reduction methods. In this repository is shown the package developed for this new method based on \citepaper.

Fully Adaptive Bayesian Algorithm for Data Analysis FABADA FABADA is a novel non-parametric noise reduction technique which arise from the point of vi

18 Oct 20, 2022
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation

This is the official PyTorch implementation of the ALBEF paper [Blog]. This repository supports pre-training on custom datasets, as well as finetuning on VQA, SNLI-VE, NLVR2, Image-Text Retrieval on

Salesforce 805 Jan 09, 2023
this is a lite easy to use virtual keyboard project for anyone to use

virtual_Keyboard this is a lite easy to use virtual keyboard project for anyone to use motivation I made this for this year's recruitment for RobEn AA

Mohamed Emad 3 Oct 23, 2021
Simple Linear 2nd ODE Solver GUI - A 2nd constant coefficient linear ODE solver with simple GUI using euler's method

Simple_Linear_2nd_ODE_Solver_GUI Description It is a 2nd constant coefficient li

:) 4 Feb 05, 2022
Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data

Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data This is the official PyTorch implementation of the SeCo paper: @articl

ElementAI 101 Dec 12, 2022
A Python library for generating new text from existing samples.

ReMarkov is a Python library for generating text from existing samples using Markov chains. You can use it to customize all sorts of writing from birt

8 May 17, 2022
A cross-document event and entity coreference resolution system, trained and evaluated on the ECB+ corpus.

A Comprehensive Comparison of Word Embeddings in Event & Entity Coreference Resolution. Introduction This repo contains experimental code derived from

2 May 09, 2022
Implementation of Change-Based Exploration Transfer (C-BET)

Implementation of Change-Based Exploration Transfer (C-BET), as presented in Interesting Object, Curious Agent: Learning Task-Agnostic Exploration.

Simone Parisi 29 Dec 04, 2022
MiraiML: asynchronous, autonomous and continuous Machine Learning in Python

MiraiML Mirai: future in japanese. MiraiML is an asynchronous engine for continuous & autonomous machine learning, built for real-time usage. Usage In

Arthur Paulino 25 Jul 27, 2022