PyTorch implementation of SIFT descriptor

Last update: Dec 24, 2022

Overview

This is a differentiable PyTorch implementation of the SIFT patch descriptor. It is very slow when describing a single patch, but quite fast when describing a batch. It can be used for descriptor-based learning of the shape of affine features.

UPD 08/2019: pytorch-sift has been added to kornia and is available as kornia.features.SIFTDescriptor

There are several implementations of SIFT on the web. I tried to match Michal Perdoch's implementation, which gives high-quality features for image retrieval (CVPR 2009). However, on planar datasets it is inferior to the vlfeat implementation. The main difference is in the parameters of the Gaussian weighting window, so I have made a vlfeat-like version too. The MP version weights the patch center much more (see image below, left) and additionally crops everything outside the circular region. The vlfeat version is shown on the right.

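# SIFTNet is assumed to live in pytorch_sift.py from this repository;
# adjust the import if your checkout is laid out differently.
from pytorch_sift import SIFTNet

# Michal Perdoch (MP) mode: Gaussian weighting cropped to a circular region
descriptor_mp_mode = SIFTNet(patch_size=65,
                             sigma_type='hesamp',
                             masktype='CircularGauss')

# vlfeat-like mode: plain Gaussian weighting window
descriptor_vlfeat_mode = SIFTNet(patch_size=65,
                                 sigma_type='vlfeat',
                                 masktype='Gauss')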
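
The difference between the two mask types can be sketched in a few lines of plain Python. This is only an illustration of the idea described above (a Gaussian window versus a Gaussian window that is also cropped to a circle), not the code used by SIFTNet. The function names and the sigma values are assumptions chosen for the example; they are not the values used by the MP or vlfeat implementations.

```python
import math


def gaussian_mask(patch_size, sigma):
    """Return a patch_size x patch_size Gaussian weighting window (list of rows)."""
    c = (patch_size - 1) / 2.0  # patch center in pixel coordinates
    return [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2.0 * sigma ** 2))
             for x in range(patch_size)]
            for y in range(patch_size)]


def circular_gaussian_mask(patch_size, sigma):
    """Gaussian window that is additionally zeroed outside the inscribed circle."""
    c = (patch_size - 1) / 2.0
    radius = patch_size / 2.0
    mask = gaussian_mask(patch_size, sigma)
    for y in range(patch_size):
        for x in range(patch_size):
            if (x - c) ** 2 + (y - c) ** 2 > radius ** 2:
                mask[y][x] = 0.0  # crop everything outside the circle
    return mask


# Illustrative sigmas only (assumptions, not taken from either implementation).
wide = gaussian_mask(65, sigma=65 / 2.0)              # flatter window
narrow = circular_gaussian_mask(65, sigma=65 / 4.0)   # center-heavy, cropped

# Corner pixel: kept by the plain Gaussian, removed by the circular crop.
print(wide[0][0] > 0.0, narrow[0][0] == 0.0)
# Mid-edge pixel: the narrower window gives it a smaller weight.
print(narrow[32][0] < wide[32][0])
```

With a smaller sigma the weight falls off faster away from the center, which is the sense in which one variant "weights the patch center much more" than the other.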

Results (mAP):

Method                  Easy      Hard      Tough      mean
----------------------  --------  --------  ---------  --------
OPENCV-SIFT             0.47788   0.20997   0.0967711  0.26154
VLFeat-SIFT             0.466584  0.203966  0.0935743  0.254708
PYTORCH-SIFT-VLFEAT-65  0.472563  0.202458  0.0910371  0.255353
NUMPY-SIFT-VLFEAT-65    0.449431  0.197918  0.0905395  0.245963
PYTORCH-SIFT-MP-65      0.430887  0.184834  0.0832707  0.232997
NUMPY-SIFT-MP-65        0.417296  0.18114   0.0820582  0.226832

Speed:

0.00246 s per 65x65 patch - numpy SIFT
0.00028 s per 65x65 patch - C++ SIFT
0.00074 s per 65x65 patch - CPU, 256 patches per batch
0.00038 s per 65x65 patch - GPU (GM940, mobile), 256 patches per batch
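
Per-patch timings like the ones above are obtained by timing a whole batch and dividing by the number of patches. The following is a minimal sketch of that measurement, not the benchmark script used for the numbers above. The helper `seconds_per_patch` and the stand-in `dummy_describe` are hypothetical names introduced for the example; in practice `describe_batch` would be a call into the descriptor.

```python
import time


def seconds_per_patch(describe_batch, batch, repeats=5):
    """Average wall-clock seconds per patch for a batched descriptor call."""
    describe_batch(batch)  # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(repeats):
        describe_batch(batch)
    elapsed = time.perf_counter() - start
    return elapsed / (repeats * len(batch))


def dummy_describe(batch):
    """Stand-in for a real descriptor: one number per flattened patch."""
    return [sum(patch) for patch in batch]


# 256 flattened 65x65 patches, matching the batch size quoted above.
batch = [[0.5] * (65 * 65) for _ in range(256)]
per_patch = seconds_per_patch(dummy_describe, batch)
print(f"{per_patch:.6f} s per 65x65 patch")
```

When timing a PyTorch model on a GPU, CUDA kernels run asynchronously, so one would normally call torch.cuda.synchronize() before reading the clock; otherwise the measured time can be too small.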

If you use this code for academic purposes, please cite the following paper:

@InProceedings{AffNet2018,
    title = {Repeatability Is Not Enough: Learning Affine Regions via Discriminability},
    author = {Dmytro Mishkin and Filip Radenovic and Jiri Matas},
    booktitle = {Proceedings of ECCV},
    year = 2018,
    month = sep
}

Owner

Dmytro Mishkin
