Sort-By-Face

This application lets you either sort all the pictures in a corpus of photos by the faces in them, or retrieve all the photos of yourself from the corpus
by submitting a picture of yourself.

Setup:

Requirements:

  • python 3.8.5
  • Anaconda 4.9.2+

If Anaconda isn't installed, install it before continuing.

  • Clone the repository
  • Download the folder called Models/ from here into the same directory where you cloned the repository.
  • Run conda env create -f environment.yml to create the environment.
  • Run conda activate sorter.
  • Run pip install -r requirements.txt
  • If you want to run the notebook, make sure Jupyter Notebook is installed and accessible from all environments on your system.

Instructions:

  • Put the directory containing your photos into the project folder.
  • Run python embedder.py -src /path/to/images. Files without an image extension are safely ignored. By default this command uses all the cores in the system for parallel processing.
  • To reduce the number of parallel processes, run python embedder.py -src /path/to/images --processes number-of-processes.
  • Both absolute and relative paths work but relative paths are recommended.
  • The above command then calculates all the embeddings for the faces in the pictures. NOTE: It takes a significant amount of time for large directories.
  • The embeddings are saved in a pickle file called embeddings.pickle.
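
The "non-image files are ignored" behaviour can be pictured with a small sketch. This is not the code from embedder.py: the function name list_image_paths and the set of accepted extensions are assumptions made for illustration, and the real script may accept a different list.

```python
from pathlib import Path

# Assumed set of extensions; the real script may accept a different list.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_image_paths(src):
    """Return the sorted paths of files under src whose extension looks like an image.

    Hypothetical helper: it walks the directory recursively and silently
    skips every file whose extension is not in IMAGE_EXTENSIONS.
    """
    root = Path(src)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )
```

A list like this could then be split across worker processes, which is roughly what the --processes option controls.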

Sort an entire corpus of photos:

  • Run python sort_images.py. This runs the clustering algorithm with the default threshold and number of iterations.
  • If you want to tweak the parameters, run python sort_images.py -t threshold -itr num-iterations to alter the threshold and iterations respectively.
  • If you think pictures are missing, try reducing the threshold and increasing the iterations. Something like a threshold of 0.64 and 35 iterations should work.
  • Once the clustering is finished, all the images are stored in a folder called Sorted-pictures. Each subdirectory in it corresponds to one unique person identified.

Get pictures of a single person from the corpus:

  • To get pictures of a single person, you need to provide a picture of that person. For better results, the picture should meet the following requirements:
    • The image must have a width and height greater than 160px.
    • The image must contain only one face (the program exits when multiple faces are detected).
    • The image should preferably be well lit and recognizable by a human.
  • Run python get_individual.py -src /path/to/person's/image -dest /path/to/copy/images.
  • This script also lets you tweak the threshold and iterations with the same arguments as sort_images.py (-t and -itr).
  • Once clustering is done, all the matching pictures are copied into the destination directory.

Evaluation of clustering algorithm:

On testing with the Labeled Faces in the Wild (LFW) dataset, the following results were obtained (threshold = 0.67, iterations = 30):

  • Precision: 0.89
  • Recall: 0.99
  • F-measure: 0.95
  • Clusters formed: 6090 (5749 unique labels in the dataset)

The code for evaluation has been uploaded in this notebook
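
For readers unfamiliar with how precision, recall and F-measure apply to clustering, here is a minimal sketch of one common definition, the pairwise one. It is an assumption that this matches the notebook: the function name pairwise_scores is hypothetical, and the notebook may compute its metrics differently.

```python
from itertools import combinations


def pairwise_scores(true_labels, cluster_ids):
    """Pairwise precision, recall and F-measure for a clustering.

    A pair of faces counts as a true positive when both faces share a
    ground-truth label and were put in the same cluster.
    """
    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_label = true_labels[i] == true_labels[j]
        same_cluster = cluster_ids[i] == cluster_ids[j]
        if same_cluster and same_label:
            tp += 1
        elif same_cluster:
            fp += 1  # clustered together, but different people
        elif same_label:
            fn += 1  # same person, but split across clusters
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

For example, pairwise_scores(["a", "a", "b", "b"], [0, 0, 0, 1]) counts one correct pair, two wrongly merged pairs and one wrongly split pair.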

The LFW dataset has many images containing more than one face but only has a single label. This can have an effect on the evaluation metrics and the clusters formed. These factors have been discussed in detail in the notebook.
For example, running get_individual.py with a photo of George Bush also returns some images in which he is not the labeled subject.

In layman's terms, we have gathered all the 'photobombs' of George Bush in the dataset, but the labels for those 'photobombs' correspond to a different person.
NOTE: This does not affect the clustering for the original person, as the scripts treat each face separately but refer to the same image.

How it works:

  • Given a corpus of photos inside a directory this application first detects the faces in the photos.
  • Face alignment is then done using dlib, so that the eyes of every face are at the same coordinates.
  • Then the image is passed through a Convolutional Neural Network to generate 128-Dimensional embeddings.
  • These embeddings are then used in a graph based clustering algorithm called 'Chinese Whispers'.
  • The clustering algorithm assigns a cluster to each individual identified by it.
  • After the algorithm finishes, the images are copied into separate directories corresponding to their clusters.
  • For a person who wants to retrieve only their own images, only the images in the same cluster as the picture submitted by the user are copied.
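
The last step can be sketched as a lookup over cluster assignments. This is an illustrative sketch, not the code in get_individual.py: the function name images_for_person and both dictionaries are hypothetical. It assumes every detected face has an id, a cluster label, and the path of the image it came from, so one image can contribute several faces.

```python
def images_for_person(labels, face_to_image, query_face):
    """Return the images containing a face in the same cluster as query_face.

    labels:        hypothetical dict mapping face id -> cluster label
    face_to_image: hypothetical dict mapping face id -> image path
    query_face:    face id of the picture submitted by the user
    """
    target = labels[query_face]
    return sorted(
        {
            face_to_image[face]
            for face, cluster in labels.items()
            if cluster == target and face != query_face
        }
    )
```

Because the lookup works per face rather than per image, an image is returned whenever any one of its faces lands in the person's cluster.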

Model used for embedding extraction:

The project uses the FaceNet model, which was first introduced in [4]. It uses a Keras model converted from David Sandberg's implementation in this repository.
In particular, it uses the model named 20170512-110547, which was converted using this script.

All the FaceNet models are trained using a loss called triplet loss. This loss pushes the model to produce embeddings that are closer together for the same person and farther apart for different people.
The models are trained on a very large number of images, from which triplets are generated.
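
As a rough illustration of the idea, the triplet loss from [4] can be written in a few lines of NumPy. This is a sketch for intuition only, not the training code of the model used here; the function name triplet_loss is hypothetical and the default margin of 0.2 is the value reported in the FaceNet paper.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean triplet loss over a batch of embeddings.

    anchor and positive are embeddings of the same person, negative is an
    embedding of a different person. Each array has shape (batch, dim).
    """
    anchor = np.asarray(anchor, dtype=float)
    positive = np.asarray(positive, dtype=float)
    negative = np.asarray(negative, dtype=float)
    # Squared Euclidean distances anchor-positive and anchor-negative.
    pos_dist = np.sum((anchor - positive) ** 2, axis=-1)
    neg_dist = np.sum((anchor - negative) ** 2, axis=-1)
    # The loss is zero once the negative is farther than the positive by the margin.
    return float(np.maximum(pos_dist - neg_dist + margin, 0.0).mean())
```

The loss only becomes positive for triplets where the different person is not yet at least the margin farther away than the same person.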

The clustering algorithm:


This project uses a graph-based algorithm called Chinese Whispers to cluster the faces. It was first introduced for Natural Language Processing tasks by Chris Biemann in [3].
The authors of [1] and [2] used a threshold to assign edges in the graph, i.e. there is an edge between two nodes (faces) only if the (dis)similarity metric between their representations is above/below a certain threshold.
In this implementation I have used cosine similarity between face embeddings as the similarity metric.

By combining these ideas we draw the graph like this:

  1. Assign a node to every face detected in the dataset (not every image, because there can be multiple faces in a single image)
  2. Add an edge between two nodes only if the cosine similarity between their embeddings is greater than a threshold.
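
The two steps above can be sketched as follows. This is a minimal illustration rather than the project's code: the project lists networkx, while this sketch uses a plain dict of dicts to stay dependency-light. The function name build_similarity_graph is hypothetical, using the cosine similarity as the edge weight is an assumption, and the default threshold of 0.67 is simply the value quoted in the evaluation section.

```python
import numpy as np


def build_similarity_graph(embeddings, threshold=0.67):
    """Build an undirected graph with one node per face embedding.

    Returns a dict mapping node index -> {neighbour index: weight}, with an
    edge only where the cosine similarity exceeds the threshold.
    """
    emb = np.asarray(embeddings, dtype=float)
    # Normalise each embedding so that a dot product equals cosine similarity.
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)
    similarity = unit @ unit.T

    graph = {i: {} for i in range(len(emb))}
    for i in range(len(emb)):
        for j in range(i + 1, len(emb)):
            if similarity[i, j] > threshold:
                weight = float(similarity[i, j])  # assumed edge weight
                graph[i][j] = weight
                graph[j][i] = weight
    return graph
```

The pairwise loop is quadratic in the number of faces, which is fine for a sketch but would need care for very large collections.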

And the algorithm used for clustering is:

  1. Initially, every node is given a separate cluster.
  2. The algorithm does a specific number of iterations.
  3. For each iteration the nodes are traversed randomly.
  4. Each node is given the cluster which has the highest rank in its neighbourhood.
  5. The rank of a cluster here is the sum of weights between the current node and the neighbours belonging to that cluster.
  6. In case of a tie between clusters, any one of them is assigned randomly.
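
The six steps above can be sketched in plain Python. This is an illustration of the listed steps, not the project's implementation: the function name chinese_whispers, the seed parameter, and the graph representation (a dict mapping each node to a dict of neighbour -> edge weight) are assumptions, and the default of 30 iterations is the value quoted in the evaluation section.

```python
import random
from collections import defaultdict


def chinese_whispers(graph, iterations=30, seed=None):
    """Cluster the nodes of a weighted graph with Chinese Whispers.

    graph: dict mapping node -> {neighbour: weight} (assumed representation).
    Returns a dict mapping node -> cluster label.
    """
    rng = random.Random(seed)
    # Step 1: every node starts in its own cluster.
    labels = {node: node for node in graph}
    nodes = list(graph)

    # Step 2: run a fixed number of iterations.
    for _ in range(iterations):
        # Step 3: visit the nodes in random order.
        rng.shuffle(nodes)
        for node in nodes:
            if not graph[node]:
                continue  # isolated nodes keep their own cluster
            # Step 5: rank of a cluster = sum of edge weights to its members.
            rank = defaultdict(float)
            for neighbour, weight in graph[node].items():
                rank[labels[neighbour]] += weight
            # Steps 4 and 6: take the top-ranked cluster, breaking ties randomly.
            best = max(rank.values())
            candidates = [c for c, r in rank.items() if r == best]
            labels[node] = rng.choice(candidates)
    return labels
```

Passing a seed makes a run repeatable, but different seeds can still produce different clusterings on the same graph.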

The Chinese Whispers algorithm is not guaranteed to converge, nor is it deterministic, but it turns out to be a very efficient algorithm for some tasks.

References:

This project is inspired by the ideas presented in the following papers

[1] Roy Klip. Fuzzy Face Clustering For Forensic Investigations

[2] Chang L, Pérez-Suárez A, González-Mendoza M. Effective and Generalizable Graph-Based Clustering for Faces in the Wild.

[3] Biemann, Chris. (2006). Chinese whispers: An efficient graph clustering algorithm and its application to natural language processing problems.

[4] Florian Schroff, Dmitry Kalenichenko and James Philbin (2015). FaceNet: A Unified Embedding for Face Recognition and Clustering.

Libraries used:

  • NumPy
  • TensorFlow
  • Keras
  • dlib
  • OpenCV
  • networkx
  • imutils
  • tqdm

Future Scope:

  • A Graphical User Interface (GUI) to help users use the app with ease.
  • GPU optimization to calculate embeddings.
  • Implementation of other clustering methods.