GazeScroller - Using Facial Movements to perform Hands-free Gesture on the system

Abstract

As our world is being digitized at a fast rate, almost every person has a device that makes life better. However, a considerable part of society cannot interact with these devices the way others do. One such group is quadriplegic people (people living with paralysis), who number 5.4 million people worldwide*. Our aim here is to help them interact with the digital world. In this project, the facial movements of the user are fed to the system in real time, and a fixed list of operations can be performed on the system using these facial actions. Additionally, we will extend this system to mini-games on the internet such as the Dino Game. Finally, I evaluated the system with five people and found that they responded positively to it. These results suggest that the system could be generalised to a much wider group of users.

Approach

The project captures a live video stream from the system's webcam. It then maps the face to 68 landmark points using the Dlib library. The movements of the points corresponding to the eyes and nose are monitored continuously. The functionalities covered in the project include:
• Detect the blink of one eye to enable/disable scrolling.
• Detect the scroll movement based on the movement of the point on the nose.
In short: a blink toggles scrolling, and head direction controls the scroll.
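The second functionality can be sketched as a small function that compares the current nose position with a resting position. This is a minimal illustration, not the project's actual code: the function name `head_step`, the `rest_y` reference and the `dead_zone` tolerance are assumptions made for the example. The index 30 is the nose tip in Dlib's standard 68-point layout.

```python
# Index of the nose tip in dlib's 68-point landmark layout (0-based).
NOSE_TIP = 30


def head_step(nose_y: float, rest_y: float, dead_zone: float = 15.0) -> int:
    """Classify vertical head movement from the nose landmark.

    nose_y    -- current vertical pixel position of the nose point
    rest_y    -- vertical position recorded when scrolling was enabled (assumed)
    dead_zone -- hypothetical tolerance in pixels that ignores small head jitter

    Returns +1 if the nose moved up, -1 if it moved down, 0 otherwise.
    """
    offset = nose_y - rest_y
    # Image coordinates grow downward, so a smaller y means the head moved up.
    if offset < -dead_zone:
        return 1
    if offset > dead_zone:
        return -1
    return 0
```

The returned step could then be passed as the vertical argument of pynput's mouse `Controller.scroll(dx, dy)`. The sign convention for scrolling should be checked on the target platform.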

Background Study

Blinking is mostly an involuntary action of a human being. Blinks can be spontaneous, reflex or voluntary, and the eye blink rate depends on various factors, including environmental factors and the type of activity.

To separate natural blinks from the intentional one-eye blink used for functionality 1 above, I studied the eye width ratios in an experiment with 5 users, each subject tested 10 times. This analysis shows how the eye width ratio differs between the two eyes when a user blinks only one of them. Secondly, an intentional blink is registered only when the eye ratio stays below a threshold for 3 frames. Together, these procedures help distinguish the intentional one-eye blink from natural blinks. Fig 1 gives the details of the eye ratio and the delta (the difference between the two eye ratios). We take the mean of these values and use them in our code as thresholds.
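As a rough sketch of the calibration step, the thresholds can be taken as the mean of the recorded ratios and deltas. The function name `mean_thresholds` and the sample numbers below are placeholders for illustration only; they are not the measurements from Fig 1.

```python
from statistics import mean


def mean_thresholds(trials):
    """Derive thresholds from recorded one-eye blinks.

    trials -- list of (left_ratio, right_ratio) pairs, each recorded while a
              subject intentionally blinks the left eye (assumed setup)

    Returns (blink_threshold, delta_threshold) as plain means.
    """
    blink_ratios = [left for left, _ in trials]
    deltas = [abs(left - right) for left, right in trials]
    return mean(blink_ratios), mean(deltas)


# Placeholder values, NOT the data behind Fig 1.
sample_trials = [(0.10, 0.30), (0.12, 0.28), (0.08, 0.32)]
blink_threshold, delta_threshold = mean_thresholds(sample_trials)
```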

Technical Tools :

• Dlib - a library used to detect the face in each webcam frame
• Python - the language the code is written in
• landmarksPoints.dat file - this file is used to superimpose landmarks onto the detected face
• pynput - a library to trigger keyboard and mouse events

System Setup :

Using the tools mentioned above, we obtain the user's face in each frame with the landmark points superimposed. The calculations for each frame include:

rightEyeWidthRatio = height of the right eye / width of the right eye
leftEyeWidthRatio = height of the left eye / width of the left eye
delta = abs(leftEyeWidthRatio - rightEyeWidthRatio)

Whenever a user blinks one eye, the following cases are checked:
• Check 1: delta > the delta threshold taken from Fig 1.
• Check 2: leftEyeWidthRatio < the blink threshold value and the frame count is 3.
• If Check 1 and Check 2 are both true, trigger Blink and enable scrolling.

UX Aspects: Trigger notifications in the system when scrolling is toggled.
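The per-frame calculation and the two checks above might look like the following sketch. It is an illustration under assumptions, not the project's implementation: the names `eye_ratio` and `WinkDetector` are hypothetical, the eye height is taken as the average of the two vertical lid distances, and the six eye points are assumed to follow Dlib's ordering (corners at positions 0 and 3).

```python
from math import dist


def eye_ratio(eye):
    """Height/width ratio of one eye from its six (x, y) landmark points.

    Assumes points 0 and 3 are the eye corners, 1 and 2 lie on the upper
    lid, and 5 and 4 lie on the lower lid directly below them.
    """
    width = dist(eye[0], eye[3])
    height = (dist(eye[1], eye[5]) + dist(eye[2], eye[4])) / 2
    return height / width


class WinkDetector:
    """Toggle scrolling when Check 1 and Check 2 hold for consecutive frames."""

    def __init__(self, delta_threshold, blink_threshold, frames_required=3):
        self.delta_threshold = delta_threshold
        self.blink_threshold = blink_threshold
        self.frames_required = frames_required
        self.closed_frames = 0
        self.scrolling = False

    def update(self, left_ratio, right_ratio):
        """Process one frame; return True on the frame where scrolling toggles."""
        delta = abs(left_ratio - right_ratio)
        check1 = delta > self.delta_threshold         # eyes differ: one-eye blink
        check2 = left_ratio < self.blink_threshold    # left eye is closed
        if check1 and check2:
            self.closed_frames += 1
            # Fire exactly once, on the frame that reaches the required count.
            if self.closed_frames == self.frames_required:
                self.scrolling = not self.scrolling
                return True
        else:
            self.closed_frames = 0
        return False
```

Because Check 1 requires the two eyes to differ, a natural blink that closes both eyes leaves delta near zero and does not toggle scrolling. Holding the wink longer than three frames does not toggle again until the eye reopens.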

Discussion & Future Scope:

In the present work I have not put much effort into perfecting the model or the computer vision (CV) side. I have focused on the thresholds and on relating them to the use case described in the abstract. If substantial work went into detecting the exact eye wink with ML models, the system would be much better. False blinks are recorded because no such model is used here. As future scope, this feature could be used to build interactive games for quadriplegic people, which may also improve their psychological well-being.

Conclusion :

All the subjects who tested the system responded positively and felt good about it. Therefore, we can say that the system performs well at scrolling pages using the nose and at capturing an eye blink as a toggle gesture.

Hence, such a model will be beneficial to quadriplegic people and help them interact with the digital world. Since the number of false blinks is low, the system is fit for use. It can be further improved with ML models to give the accuracy needed by quadriplegic people.
