A basic duplicate image detection service using perceptual image hash functions and nearest neighbor search, implemented using faiss, fastapi, and imagehash

Overview

Duplicate Image Detection

Getting Started

  • Install dependencies pip install -r requirements.txt
  • Run service python main.py

Testing

  • Test with pytest

How it Works

alt text

This system uses a perceptual hashing function, similar to Apple's CSAM Detection. Instead of generating image hashes using NeuralHash, it uses a difference hash (dHash), which is simpler and less computationally intensive as it doesn't require neural networks. Since we don't have the same privacy constraints as Apple, we will be using nearest neighbor searches to identify duplicate images.

Difference Hash

dHash is a perceptual hashing function that produces hash values that are resilient to image scaling, as well as changes in color, brightness, and aspect ratio [1]. There are 4 main steps for creating a difference hash for an image:

alt text

  1. Convert to greyscale*
  2. Resize image to (hash_size+1, hash_size)
  3. Calculate horizontal gradient, reducing image size to (hash_size, hash_size)
  4. Assign bits based on horizontal gradient values

*We convert the image to greyscale before resizing for optimal performance

Nearest Neighbors

Image hashes that we want to check for duplicates against will be stored in a binary index for fast and efficient nearest neighbor searches. We will use Hamming distance as a metric to determine the similarity between image hashes, for dHash, distances less than 10 (96.09% similarity) likely indicate similar/duplicate images [1].

References

[1] https://www.hackerfactor.com/blog/?/archives/529-Kind-of-Like-That.html

A program that uses computer vision to detect hand gestures, used for controlling movie players.

HandGestureDetection This program uses a Haar Cascade algorithm to detect the presence of your hand, and then passes it on to a self-created and self-

2 Nov 22, 2022
A simple baseline for 3d human pose estimation in PyTorch.

3d_pose_baseline_pytorch A PyTorch implementation of a simple baseline for 3d human pose estimation. You can check the original Tensorflow implementat

weigq 312 Jan 06, 2023
MG-GCN: Scalable Multi-GPU GCN Training Framework

MG-GCN MG-GCN: multi-GPU GCN training framework. For more information, please read our paper. After cloning our repository, run git submodule update -

Translational Data Analytics (TDA) Lab @GaTech 6 Oct 24, 2022
Hyper-parameter optimization for sklearn

hyperopt-sklearn Hyperopt-sklearn is Hyperopt-based model selection among machine learning algorithms in scikit-learn. See how to use hyperopt-sklearn

1.4k Jan 01, 2023
DanceTrack: Multiple Object Tracking in Uniform Appearance and Diverse Motion

DanceTrack DanceTrack is a benchmark for tracking multiple objects in uniform appearance and diverse motion. DanceTrack provides box and identity anno

260 Dec 28, 2022
Official implementation of the paper "AAVAE: Augmentation-AugmentedVariational Autoencoders"

AAVAE Official implementation of the paper "AAVAE: Augmentation-AugmentedVariational Autoencoders" Abstract Recent methods for self-supervised learnin

Grid AI Labs 48 Dec 12, 2022
Bounding Wasserstein distance with couplings

BoundWasserstein These scripts reproduce the results of the article Bounding Wasserstein distance with couplings by Niloy Biswas and Lester Mackey. ar

Niloy Biswas 1 Jan 11, 2022
some academic posters as references. May we have in-person poster session soon!

some academic posters as references. May we have in-person poster session soon!

Bolei Zhou 472 Jan 06, 2023
Tutorial on active learning with the Nvidia Transfer Learning Toolkit (TLT).

Active Learning with the Nvidia TLT Tutorial on active learning with the Nvidia Transfer Learning Toolkit (TLT). In this tutorial, we will show you ho

Lightly 25 Dec 03, 2022
LightLog is an open source deep learning based lightweight log analysis tool for log anomaly detection.

LightLog Introduction LightLog is an open source deep learning based lightweight log analysis tool for log anomaly detection. Function description [BG

25 Dec 17, 2022
【steal piano】GitHub偷情分析工具!

【steal piano】GitHub偷情分析工具! 你是否有这样的困扰,有一天你的仓库被很多人加了star,但是你却不知道这些人都是从哪来的? 别担心,GitHub偷情分析工具帮你轻松解决问题! 原理 GitHub偷情分析工具透过分析star的时间以及他们之间的follow关系,可以推测出每个st

黄巍 442 Dec 21, 2022
Optimal space decomposition based-product quantization for approximate nearest neighbor search

Optimal space decomposition based-product quantization for approximate nearest neighbor search Abstract Product quantization(PQ) is an effective neare

Mylove 1 Nov 19, 2021
Age Progression/Regression by Conditional Adversarial Autoencoder

Age Progression/Regression by Conditional Adversarial Autoencoder (CAAE) TensorFlow implementation of the algorithm in the paper Age Progression/Regre

Zhifei Zhang 603 Dec 22, 2022
A PyTorch implementation of "Graph Wavelet Neural Network" (ICLR 2019)

Graph Wavelet Neural Network ⠀⠀ A PyTorch implementation of Graph Wavelet Neural Network (ICLR 2019). Abstract We present graph wavelet neural network

Benedek Rozemberczki 490 Dec 16, 2022
Repository for reproducing `Model-Based Robust Deep Learning`

Model-Based Robust Deep Learning (MBRDL) In this repository, we include the code necessary for reproducing the code used in Model-Based Robust Deep Le

Alex Robey 16 Sep 19, 2022
Stitch it in Time: GAN-Based Facial Editing of Real Videos

STIT - Stitch it in Time [Project Page] Stitch it in Time: GAN-Based Facial Edit

1.1k Jan 04, 2023
This repository contains the code for the paper Neural RGB-D Surface Reconstruction

Neural RGB-D Surface Reconstruction Paper | Project Page | Video Neural RGB-D Surface Reconstruction Dejan Azinović, Ricardo Martin-Brualla, Dan B Gol

Dejan 406 Jan 04, 2023
Fast, flexible and easy to use probabilistic modelling in Python.

Please consider citing the JMLR-MLOSS Manuscript if you've used pomegranate in your academic work! pomegranate is a package for building probabilistic

Jacob Schreiber 3k Dec 29, 2022
Pytorch implementation of Decoupled Spatial-Temporal Transformer for Video Inpainting

Decoupled Spatial-Temporal Transformer for Video Inpainting By Rui Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu Sun, Xiaogang Wang, J

51 Dec 13, 2022
kullanışlı ve işinizi kolaylaştıracak bir araç

Hey merhaba! işte çok sorulan sorularının cevabı ve sorunlarının çözümü; Soru= İçinde var denilen birçok şeyi göremiyorum bunun sebebi nedir? Cevap= B

Sexettin 16 Dec 17, 2022