Official implementation of NeurIPS 2021 paper "Contextual Similarity Aggregation with Self-attention for Visual Re-ranking"

Related tags

Deep LearningCSA
Overview

CSA: Contextual Similarity Aggregation with Self-attention for Visual Re-ranking

PyTorch training code for CSA (Contextual Similarity Aggregation). We propose a visual re-ranking method by contextual similarity aggregation with transformer, obtaining 80.3 mAP on ROxf with Medium evaluation protocols. Inference in 50 lines of PyTorch.

CSA

What it is. Unlike traditional visual reranking techniques, CSA uses the similarity between the image and the anchor image as a representation of the image, which is defined as affinity feature. It consists of a contrastive loss that forces the relevant images to have larger cosine similarity and vice versa, an MSE loss that preserves the information of the original affinity features, and a Transformer encoder architecture. Given ranking list returned by the first-round retrieval, CSA first choose the top-L images in ranking list as the anchor images and calculates the affinity features of the top-K candidates,then dynamically refine the affinity features of different candiates in parallel. Due to this parallel nature, CSA is very fast and efficient.

About the code. CSA is very simple to implement and experiment with, and we provide a Notebook showing how to do inference with CSA in only a few lines of PyTorch code. Training code follows this idea - it is not a library, but simply a train.py importing model and criterion definitions with standard training loops.

mAP performance of the proposed model

We provide results of baseline CSA and CSA trained with data augmentation. mAP is computed with Medium and Hard evaluation protocols. model will come soon. CSA

Requirements

  • Python 3
  • PyTorch tested on 1.7.1+, torchvision 0.8.2+
  • numpy
  • matplotlib

Usage - Visual Re-ranking

There are no extra compiled components in CSA and package dependencies are minimal, so the code is very simple to use. We provide instructions how to install dependencies via conda. Install PyTorch 1.7.1+ and torchvision 0.8.2+:

conda install -c pytorch pytorch torchvision

Data preparation

Before going further, please check out Filip Radenovic's great repository on image retrieval. We use his code and model to extract features for training images. If you use this code in your research, please also cite their work! link to license

Download and extract rSfm120k train and val images with annotations from http://cmp.felk.cvut.cz/cnnimageretrieval/.

Download ROxf and RPar datastes with annotations. Prepare features for testing and training images with Filip Radenovic's model and code. We expect the directory structure to be the following:

path/to/data/
  ├─ annotations # annotation pkl files
  │   ├─ retrieval-SfM-120k.pkl
  │   ├─ roxford5k
  |   |   ├─ gnd_roxford5k.mat
  |   |   └─ gnd_roxford5k.pkl
  |   └─ rparis6k
  |   |   ├─ gnd_rparis6k.mat
  |   |   └─ gnd_rparis6k.pkl
  ├─ test # test features		
  |   ├─ r1m
  |   |   ├─ gl18-tl-resnet101-gem-w.pkl
  |   |   └─ rSfM120k-tl-resnet101-gem-w.pkl
  │   ├─ roxford5k
  |   |   ├─ gl18-tl-resnet101-gem-w.pkl
  |   |   └─ rSfM120k-tl-resnet101-gem-w.pkl
  |   └─ rparis6k
  |   |   ├─ gl18-tl-resnet101-gem-w.pkl
  |   |   └─ rSfM120k-tl-resnet101-gem-w.pkl
  └─ train # train features
      ├─ gl18-tl-resnet50-gem-w.pkl
      ├─ gl18-tl-resnet101-gem-w.pkl
      └─ gl18-tl-resnet152-gem-w.pkl

Training

To train baseline CSA on a single node with 4 gpus for 100 epochs run:

sh experiment_rSfm120k.sh

A single epoch takes 10 minutes, so 100 epoch training takes around 17 hours on a single machine with 4 2080Ti cards. To ease reproduction of our results we provide results and training logs for 200 epoch schedule (34 hours on a single machine).

We train CSA with SGD setting learning rate in the transformer to 0.1. The transformer is trained with dropout of 0.1, and the whole model is trained with grad clip of 1.0. To train CSA with data augmentation a single node with 4 gpus for 100 epochs run:

sh experiment_augrSfm120k.sh

Evaluation

To evaluate CSA on Roxf and Rparis with a single GPU run:

sh test.sh

and get results as below

>> Test Dataset: roxford5k *** fist-stage >>
>> gl18-tl-resnet101-gem-w: mAP Medium: 67.3, Hard: 44.24
>> gl18-tl-resnet101-gem-w: [email protected][1, 5, 10] Medium: [95.71 90.29 84.57], Hard: [87.14 69.71 59.86]

>> Test Dataset: roxford5k *** rerank-topk1024 >>
>> gl18-tl-resnet101-gem-w: mAP Medium: 77.92, Hard: 58.43
>> gl18-tl-resnet101-gem-w: [email protected][1, 5, 10] Medium: [94.29 93.14 89.71], Hard: [87.14 83.43 73.14]

>> Test Dataset: rparis6k *** fist-stage >>
>> gl18-tl-resnet101-gem-w: mAP Medium: 80.57, Hard: 61.46
>> gl18-tl-resnet101-gem-w: [email protected][1, 5, 10] Medium: [100.    98.    96.86], Hard: [97.14 93.14 90.57]

>> Test Dataset: rparis6k *** query-rerank-1024 >>
>> gl18-tl-resnet101-gem-w: mAP Medium: 87.2, Hard: 74.41
>> gl18-tl-resnet101-gem-w: [email protected][1, 5, 10] Medium: [100.    97.14  96.57], Hard: [95.71 92.86 90.14]

Qualitative examples

Selected qualitative examples of our re-ranking method. Top-10 results are shown in the figure. The figure is divided into four groups which consist of a result of initial retrieval and a result of our re-ranking method. The first two groups are the successful cases and the other two groups arethe failed cases. The images on the left with orange bounding boxes are the queries. The image with green denotes the true positives and the red bounding boxes are false positives. CSA

License

CSA is released under the MIT license. Please see the LICENSE file for more information.

Owner
Hui Wu
Department of Electronic Engineering and Information Science University of Science and Technology of China
Hui Wu
PyTorch implementation of "Efficient Neural Architecture Search via Parameters Sharing"

Efficient Neural Architecture Search (ENAS) in PyTorch PyTorch implementation of Efficient Neural Architecture Search via Parameters Sharing. ENAS red

Taehoon Kim 2.6k Dec 31, 2022
Efficient Online Bayesian Inference for Neural Bandits

Efficient Online Bayesian Inference for Neural Bandits By Gerardo Durán-Martín, Aleyna Kara, and Kevin Murphy AISTATS 2022.

Probabilistic machine learning 49 Dec 27, 2022
Le dataset des images du projet d'IA de 2021

face-mask-dataset-ilc-2021 Le dataset des images du projet d'IA de 2021, Indiquez vos id git dans la issue pour les droits TL;DR: Choisir 200 images J

7 Nov 15, 2021
Implementation of Lie Transformer, Equivariant Self-Attention, in Pytorch

Lie Transformer - Pytorch (wip) Implementation of Lie Transformer, Equivariant Self-Attention, in Pytorch. Only the SE3 version will be present in thi

Phil Wang 78 Oct 26, 2022
Job Assignment System by Real-time Emotion Detection

Emotion-Detection Job Assignment System by Real-time Emotion Detection Emotion is the essential role of facial expression and it could provide a lot o

1 Feb 08, 2022
IOT: Instance-wise Layer Reordering for Transformer Structures

Introduction This repository contains the code for Instance-wise Ordered Transformer (IOT), which is introduced in the ICLR2021 paper IOT: Instance-wi

IOT 19 Nov 15, 2022
World Models with TensorFlow 2

World Models This repo reproduces the original implementation of World Models. This implementation uses TensorFlow 2.2. Docker The easiest way to hand

Zac Wellmer 234 Nov 30, 2022
The-Secret-Sharing-Schemes - This interactive script demonstrates the Secret Sharing Schemes algorithm

The-Secret-Sharing-Schemes This interactive script demonstrates the Secret Shari

Nishaant Goswamy 1 Jan 02, 2022
Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images"

GANInversion_with_ConsecutiveImgs Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images" https://a

QingyangXu 38 Dec 07, 2022
Easy-to-use,Modular and Extendible package of deep-learning based CTR models .

DeepCTR DeepCTR is a Easy-to-use,Modular and Extendible package of deep-learning based CTR models along with lots of core components layers which can

浅梦 6.6k Jan 08, 2023
Pytorch implementation of MalConv

MalConv-Pytorch A Pytorch implementation of MalConv Desciprtion This is the implementation of MalConv proposed in Malware Detection by Eating a Whole

Alexander H. Liu 58 Oct 26, 2022
Making a music video with Wav2CLIP and VQGAN-CLIP

music2video Overview A repo for making a music video with Wav2CLIP and VQGAN-CLIP. The base code was derived from VQGAN-CLIP The CLIP embedding for au

Joel Jang | 장요엘 163 Dec 26, 2022
ColBERT: Contextualized Late Interaction over BERT (SIGIR'20)

Update: if you're looking for ColBERTv2 code, you can find it alongside a new simpler API, in the branch new_api. ColBERT ColBERT is a fast and accura

Stanford Future Data Systems 637 Jan 08, 2023
Automatically measure the facial Width-To-Height ratio and get facial analysis results provided by Microsoft Azure

fwhr-calc-website This project is to automatically measure the facial Width-To-Height ratio and get facial analysis results provided by Microsoft Azur

SoohyunPark 1 Feb 07, 2022
Official repo for the work titled "SharinGAN: Combining Synthetic and Real Data for Unsupervised GeometryEstimation"

SharinGAN Official repo for the work titled "SharinGAN: Combining Synthetic and Real Data for Unsupervised GeometryEstimation" The official project we

Koutilya PNVR 23 Oct 19, 2022
Keras implementation of the GNM model in paper ’Graph-Based Semi-Supervised Learning with Nonignorable Nonresponses‘

Graph-based joint model with Nonignorable Missingness (GNM) This is a Keras implementation of the GNM model in paper ’Graph-Based Semi-Supervised Lear

Fan Zhou 2 Apr 17, 2022
Space Ship Simulator using python

FlyOver Basic space-ship simulator using python How to run? Just double click run.py What modules do i need? All modules that i currently using is bui

0 Oct 09, 2022
A booklet on machine learning systems design with exercises

Machine Learning Systems Design Read this booklet here. This booklet covers four main steps of designing a machine learning system: Project setup Data

Chip Huyen 7.6k Jan 08, 2023
mbrl-lib is a toolbox for facilitating development of Model-Based Reinforcement Learning algorithms.

mbrl-lib is a toolbox for facilitating development of Model-Based Reinforcement Learning algorithms. It provides easily interchangeable modeling and planning components, and a set of utility function

Facebook Research 724 Jan 04, 2023
Normalization Matters in Weakly Supervised Object Localization (ICCV 2021)

Normalization Matters in Weakly Supervised Object Localization (ICCV 2021) 99% of the code in this repository originates from this link. ICCV 2021 pap

Jeesoo Kim 10 Feb 01, 2022