This tool uses Deep Learning to help you draw and write with your hand and webcam.

Last update: Dec 10, 2022

Overview

air-drawing 👆

This tool uses Deep Learning to help you draw and write with your hand and webcam. A Deep Learning model tries to predict whether you want the pencil up or the pencil down.

Try it online: loicmagne.github.io/air-drawing

Technical Details

The pipeline has two steps: detecting the hand, and predicting the drawing. Both steps use Deep Learning.
Hand pose detection is performed with the MediaPipe toolbox.
The drawing prediction uses only the finger position, not the image. The input is a sequence of 2D points (in practice I use the speed and acceleration of the finger rather than its position, which makes the prediction translation-invariant), and the output is a binary classification: 'pencil up' or 'pencil down'. I used a simple bidirectional LSTM architecture. I built a small dataset myself (~50 samples) and annotated it with the tools provided in python-stuff/data-wrangling/. At first I wanted to predict 'pencil up'/'pencil down' in real time, i.e. while the user is drawing. However, this task proved too difficult and gave poor results, which is why I now use a bidirectional LSTM, which reads the whole sequence in both directions before predicting. Details of the deep learning pipeline are in the Jupyter notebook in python-stuff/deep-learning/.
The application is entirely client-side. I deployed the deep learning model by converting the PyTorch model to .onnx and running it with ONNX Runtime, which is very convenient and supports a large number of layers.
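
To illustrate why speed and acceleration make the input translation-invariant, here is a minimal sketch in plain Python. It is not the code from the notebook: the function names (`finite_differences`, `motion_features`) and the use of simple finite differences between consecutive frames are assumptions made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def finite_differences(seq: List[Point]) -> List[Point]:
    """Difference between each pair of consecutive 2D samples."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(seq, seq[1:])]


def motion_features(points: List[Point]) -> List[Tuple[float, float, float, float]]:
    """Turn fingertip positions into (vx, vy, ax, ay) tuples.

    Hypothetical helper: speed is approximated by the first difference of the
    positions and acceleration by the second difference. The result has
    len(points) - 2 entries, one for each frame that has both values.
    """
    velocity = finite_differences(points)
    acceleration = finite_differences(velocity)
    # Drop the first velocity sample so both lists line up frame by frame.
    return [
        (vx, vy, ax, ay)
        for (vx, vy), (ax, ay) in zip(velocity[1:], acceleration)
    ]


if __name__ == "__main__":
    stroke = [(0.0, 0.0), (1.0, 0.0), (3.0, 1.0), (6.0, 3.0)]
    # The same stroke drawn elsewhere in the frame.
    shifted = [(x + 10.0, y - 5.0) for x, y in stroke]
    print(motion_features(stroke) == motion_features(shifted))
```

Because a constant offset cancels out in every difference, the same stroke drawn in a different part of the frame yields the same features, so the model does not need to learn where on screen the hand is.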

Going Forward

Overall, the pipeline still struggles and needs improvement. Ideas for improvement include:

Building a bigger dataset, with more diverse user data.
Processing and smoothing the finger signal, to depend less on camera quality and to improve model generalization.
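
As one possible starting point for the smoothing idea, the sketch below applies a centered moving average to the fingertip track. This is only an assumption about what the smoothing could look like: the project does not specify a filter, and the name `smooth_track` and the default window size are made up for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def smooth_track(points: List[Point], window: int = 5) -> List[Point]:
    """Centered moving average over a sequence of 2D fingertip positions.

    Hypothetical helper: each output point is the mean of the input points
    within window // 2 frames on either side. Near the ends of the track the
    window is truncated, so the output has the same length as the input.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    smoothed = []
    for i in range(len(points)):
        lo = max(0, i - half)
        hi = min(len(points), i + half + 1)
        neighbours = points[lo:hi]
        mean_x = sum(p[0] for p in neighbours) / len(neighbours)
        mean_y = sum(p[1] for p in neighbours) / len(neighbours)
        smoothed.append((mean_x, mean_y))
    return smoothed


if __name__ == "__main__":
    # A still finger with one jittery frame in the middle.
    noisy = [(0.0, 0.0), (0.0, 0.0), (9.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
    print(smooth_track(noisy, window=3))
```

A wider window removes more jitter but also rounds off sharp corners in the drawing, so the window size would need to be tuned against the camera frame rate.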

Owner

lmagne
