Face detection using deep learning.

Overview

Face Detection Docker Solution Using Faster R-CNN



Dockerface is a deep learning face detector. It deploys a trained Faster R-CNN network on Caffe through an easy to use docker image. Bring your videos and images, run dockerface and obtain videos and images with bounding boxes of face detections and an easy to use face detection annotation text file.

The docker image is large for now because OpenCV has to be compiled and stored in the image to be able to use video and it takes up a lot of space.

Technical details and some experiments are described in the Arxiv Tech Report.

Citing Dockerface

If you find Dockerface useful in your research please consider citing:

@ARTICLE{2017arXiv170804370R,
   author = {{Ruiz}, N. and {Rehg}, J.~M.},
    title = "{Dockerface: an easy to install and use Faster R-CNN face detector in a Docker container}",
  journal = {ArXiv e-prints},
archivePrefix = "arXiv",
   eprint = {1708.04370},
 primaryClass = "cs.CV",
 keywords = {Computer Science - Computer Vision and Pattern Recognition},
     year = 2017,
    month = aug,
   adsurl = {http://adsabs.harvard.edu/abs/2017arXiv170804370R},
  adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}

Instructions

Install NVIDIA CUDA (8 - preferably) and cuDNN (v5 - preferably)

https://developer.nvidia.com/cuda-downloads
https://developer.nvidia.com/cudnn

Install docker

https://docs.docker.com/engine/installation/

Install nvidia-docker

wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker_1.0.1-1_amd64.deb
sudo dpkg -i /tmp/nvidia-docker*.deb && rm /tmp/nvidia-docker*.deb

Go to your working folder and create a directory called data, your videos and images should go here. Also create a folder called output.

cd $WORKING_DIR
mkdir data
mkdir output

Run the docker container

sudo nvidia-docker run -it -v $PWD/data:/opt/py-faster-rcnn/edata -v $PWD/output/video:/opt/py-faster-rcnn/output/video -v $PWD/output/images:/opt/py-faster-rcnn/output/images natanielruiz/dockerface:latest

Now we have to recompile Caffe for it to work on your own machine.

cd caffe-fast-rcnn
rm -rf build
mkdir build
cd build
cmake -DUSE_CUDNN=1 ..
make -j20 && make pycaffe
cd ../..

Finally use this command to process a video

python tools/run_face_detection_on_video.py --gpu 0 --video edata/YOUR_VIDEO_FILENAME --output_string STRING_TO_BE_APPENDED_TO_OUTPUTFILE_NAME --conf_thresh CONFIDENCE_THRESHOLD_FOR_DETECTIONS

Use this command to process an image

python tools/run_face_detection_on_image.py --gpu 0 --image edata/YOUR_IMAGE_FILENAME --output_string STRING_TO_BE_APPENDED_TO_OUTPUTFILE_NAME --conf_thresh CONFIDENCE_THRESHOLD_FOR_DETECTIONS

Also if you are looking to conveniently process all images in one folder use this command

python tools/facedetection_images.py --gpu 0 --image_folder edata/IMAGE_FOLDER_NAME --output_folder OUTPUT_FOLDER_PATH --conf_thresh CONFIDENCE_THRESHOLD_FOR_DETECTIONS

The default confidence threshold is 0.85 which works for high quality videos or images where the faces are clearly visible. You can play around with this value.

The columns contained in the output text files are:

For videos:

frame_number x_min y_min x_max y_max confidence_score

For images:

image_path x_min y_min x_max y_max confidence_score

Where (x_min,y_min) denote the coordinates of the upper-left corner of the bounding box in image intrinsic coordinates and (x_max, y_max) denote the coordinates of the lower-right corner of the bounding box in image intrinsic coordinates. (ref. https://www.mathworks.com/help/images/image-coordinate-systems.html) confidence_score denotes the probability output of the model that the detection is correct (it is a number included in [0,1])

Voila, that easy!

After you're done with the docker container you can exit.

exit

You want to restart and re-attach to this same docker container so as to avoid compiling Caffe again. To do this first get the id for that container.

sudo docker ps -a

It should be the last one that was launched. Take note of CONTAINER ID. Then start and attach to that container.

sudo docker start CONTAINER_ID
sudo docker attach CONTAINER_ID

You can now continue processing videos.

Nataniel Ruiz and James M. Rehg
Georgia Institute of Technology

Credits: Original dockerface logo made by Freepik from Flaticon is licensed by Creative Commons BY 3.0, modified by Nataniel Ruiz.

Owner
Nataniel Ruiz
PhD candidate at Boston University doing Computer Vision and ML. M.S. from Georgia Tech, BA/M.S. from Ecole Polytechnique
Nataniel Ruiz
[CVPR 2021] Monocular depth estimation using wavelets for efficiency

Single Image Depth Prediction with Wavelet Decomposition Michaël Ramamonjisoa, Michael Firman, Jamie Watson, Vincent Lepetit and Daniyar Turmukhambeto

Niantic Labs 205 Jan 02, 2023
2021 Artificial Intelligence Diabetes Datathon

A.I.D.D. 2021 2021 Artificial Intelligence Diabetes Datathon A.I.D.D. 2021은 ‘2021 인공지능 학습용 데이터 구축사업’을 통해 만들어진 학습용 데이터를 활용하여 당뇨병을 효과적으로 예측할 수 있는가에 대한 A

2 Dec 27, 2021
Estimation of human density in a closed space using deep learning.

Siemens HOLLZOF challenge - Human Density Estimation Add project description here. Installing Dependencies: Install Python3 either system-wide, user-w

3 Aug 08, 2021
Named Entity Recognition with Small Strongly Labeled and Large Weakly Labeled Data

Named Entity Recognition with Small Strongly Labeled and Large Weakly Labeled Data arXiv This is the code base for weakly supervised NER. We provide a

Amazon 92 Jan 04, 2023
🧑‍🔬 verify your TEAL program by experiment and observation

Graviton - Testing TEAL with Dry Runs Tutorial Local Installation The following instructions assume that you have make available in your local environ

Algorand 18 Jan 03, 2023
Second-Order Neural ODE Optimizer, NeurIPS 2021 spotlight

Second-order Neural ODE Optimizer (NeurIPS 2021 Spotlight) [arXiv] ✔️ faster convergence in wall-clock time | ✔️ O(1) memory cost | ✔️ better test-tim

Guan-Horng Liu 39 Oct 22, 2022
ArcaneGAN by Alex Spirin

ArcaneGAN by Alex Spirin

Alex 617 Dec 28, 2022
Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection

Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection Main requirements torch = 1.0 torchvision = 0.2.0 Python 3 Environm

15 Apr 04, 2022
Google-drive-to-sqlite - Create a SQLite database containing metadata from Google Drive

google-drive-to-sqlite Create a SQLite database containing metadata from Google

Simon Willison 140 Dec 04, 2022
[ECCV2020] Content-Consistent Matching for Domain Adaptive Semantic Segmentation

[ECCV20] Content-Consistent Matching for Domain Adaptive Semantic Segmentation This is a PyTorch implementation of CCM. News: GTA-4K list is available

Guangrui Li 88 Aug 25, 2022
LBK 20 Dec 02, 2022
Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper)

An Image is Worth 16x16 Words, What is a Video Worth? paper Official PyTorch Implementation Gilad Sharir, Asaf Noy, Lihi Zelnik-Manor DAMO Academy, Al

213 Nov 12, 2022
Code for Neurips2021 Paper "Topology-Imbalance Learning for Semi-Supervised Node Classification".

Topology-Imbalance Learning for Semi-Supervised Node Classification Introduction Code for NeurIPS 2021 paper "Topology-Imbalance Learning for Semi-Sup

Victor Chen 40 Nov 23, 2022
source code of “Visual Saliency Transformer” (ICCV2021)

Visual Saliency Transformer (VST) source code for our ICCV 2021 paper “Visual Saliency Transformer” by Nian Liu, Ni Zhang, Kaiyuan Wan, Junwei Han, an

89 Dec 21, 2022
PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)

PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)

Yonglong Tian 2.2k Jan 08, 2023
Code and data accompanying our SVRHM'21 paper.

Code and data accompanying our SVRHM'21 paper. Requires tensorflow 1.13, python 3.7, scikit-learn, and pytorch 1.6.0 to be installed. Python scripts i

5 Nov 17, 2021
A Home Assistant custom component for Lobe. Lobe is an AI tool that can classify images.

Lobe This is a Home Assistant custom component for Lobe. Lobe is an AI tool that can classify images. This component lets you easily use an exported m

Kendell R 4 Feb 28, 2022
Code for the paper "Zero-shot Natural Language Video Localization" (ICCV2021, Oral).

Zero-shot Natural Language Video Localization (ZSNLVL) by Pseudo-Supervised Video Localization (PSVL) This repository is for Zero-shot Natural Languag

Computer Vision Lab. @ GIST 37 Dec 27, 2022
The official implementation for ACL 2021 "Challenges in Information Seeking QA: Unanswerable Questions and Paragraph Retrieval".

Code for "Challenges in Information Seeking QA: Unanswerable Questions and Paragraph Retrieval" (ACL 2021, Long) This is the repository for baseline m

Akari Asai 25 Oct 30, 2022
Codebase for BMVC 2021 paper "Text Based Person Search with Limited Data"

Text Based Person Search with Limited Data This is the codebase for our BMVC 2021 paper. Please bear with me refactoring this codebase after CVPR dead

Xiao Han 33 Nov 24, 2022