Sign language is detected in real time from video sequences. Our approach uses MediaPipe Holistic for keypoint extraction and an LSTM model for prediction.

Last update: Aug 20, 2022

Related tags

Deep Learning Pose-Network

Overview

RealTime Sign Language Detection using Action Recognition

Approach

Real-time sign language is commonly predicted using models whose architecture consists of multiple CNN layers followed by multiple LSTM layers. However, the accuracy of these state-of-the-art models is fairly low. By contrast, this approach, MediaPipe Holistic with an LSTM model, gives much better accuracy and produced better results with very little data. Because the model has fewer parameters, it also trains much faster, which reduces computation time.

Project

This project is divided into two parts:

Keypoint extraction using MediaPipe Holistic
An LSTM model trained on these keypoints to predict sign language in real time from video sequences

Dataset

Data is collected using MediaPipe Holistic for 3 actions:

Hello
Thanks
I Love You

For each action, 30 sequences of 30 frames each have been collected from real-time actions using Computer Vision and MediaPipe Holistic. For each frame, 1662 keypoint values have been extracted:

Face Landmarks - 468*3
Pose Landmarks - 33*4
Left Hand Landmarks - 21*3
Right Hand Landmarks - 21*3
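
The 1662 figure is the sum of the four landmark groups above: 468*3 + 33*4 + 21*3 + 21*3. As a minimal sketch of how one frame could be flattened into a vector of that length, the code below assumes a MediaPipe Holistic result object whose landmark groups expose a `.landmark` list with `x`, `y`, `z` (plus `visibility` for pose). The function name `extract_keypoints`, the group order, and the zero-filling of undetected groups are assumptions made for illustration and are not necessarily what this repository does. A stand-in object replaces the real MediaPipe output so the example runs without a camera.

```python
from types import SimpleNamespace

# (attribute on the Holistic result, landmark count, fields per landmark)
GROUPS = [
    ("face_landmarks", 468, ("x", "y", "z")),
    ("pose_landmarks", 33, ("x", "y", "z", "visibility")),
    ("left_hand_landmarks", 21, ("x", "y", "z")),
    ("right_hand_landmarks", 21, ("x", "y", "z")),
]


def extract_keypoints(results):
    """Flatten one frame's landmarks into a list of 1662 floats.

    Groups that were not detected (attribute is None) are zero-filled so
    that every frame has the same length. Zero-filling is an assumption.
    """
    values = []
    for attr, count, fields in GROUPS:
        group = getattr(results, attr, None)
        if group is None:
            values.extend([0.0] * (count * len(fields)))
        else:
            for lm in group.landmark:
                values.extend(getattr(lm, f) for f in fields)
    return values


# Stand-in for a MediaPipe result: pose detected, other groups missing.
fake_pose = SimpleNamespace(
    landmark=[SimpleNamespace(x=0.5, y=0.5, z=0.0, visibility=1.0) for _ in range(33)]
)
fake_results = SimpleNamespace(
    face_landmarks=None,
    pose_landmarks=fake_pose,
    left_hand_landmarks=None,
    right_hand_landmarks=None,
)

frame_vector = extract_keypoints(fake_results)
print(len(frame_vector))  # 1662
```

Stacking 30 such vectors gives one sequence, so under these assumptions a single training sample has shape 30 x 1662.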

The dataset can be accessed from the Feature_Extraction folder.

Model

The LSTM model is trained on the extracted keypoints from the Feature_Extraction folder and is later used for real-time predictions.

The model weights are saved in the lstm_model.h5 file.

How to Use

Clone the repository using:

  $ git clone https://github.com/rishusiva/Pose-Network

Install the requirements using:

  $ cd Pose-Network/
  $ pip install -r requirements.txt

To predict sign language in real time, run the following from the directory that contains the cloned repository:

  $ cd Pose-Network/Code
  $ python3 realtime_testing.py

Results

Our LSTM model, after training for only 100 epochs, reaches an accuracy of 70%.
It produced an accuracy score of 1.0 on a test set of 5 images.
The trained LSTM model is then used for real-time testing.
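
Real-time testing with a sequence model usually means keeping the most recent frames in a rolling buffer and predicting once the buffer is full. The sketch below illustrates that idea under stated assumptions: a window of 30 frames (matching the dataset), a hypothetical `predict_fn` standing in for the trained model, and an illustrative `threshold` of 0.5. None of these names are taken from `realtime_testing.py`.

```python
from collections import deque

ACTIONS = ["Hello", "Thanks", "I Love You"]
SEQUENCE_LENGTH = 30  # frames per window, matching the dataset


def make_predictor(predict_fn, threshold=0.5):
    """Return a per-frame callback that yields an action label or None.

    predict_fn is a hypothetical stand-in for the trained model: it takes a
    list of SEQUENCE_LENGTH keypoint vectors and returns one probability
    per action.
    """
    window = deque(maxlen=SEQUENCE_LENGTH)  # oldest frame drops automatically

    def on_frame(keypoints):
        window.append(keypoints)
        if len(window) < SEQUENCE_LENGTH:
            return None  # not enough frames collected yet
        probs = predict_fn(list(window))
        best = max(range(len(probs)), key=probs.__getitem__)
        # Only report a label when the model is reasonably confident.
        return ACTIONS[best] if probs[best] >= threshold else None

    return on_frame


# Usage with a dummy model that always favours "Thanks".
on_frame = make_predictor(lambda window: [0.1, 0.8, 0.1])
labels = [on_frame([0.0] * 1662) for _ in range(SEQUENCE_LENGTH)]
print(labels[-2], labels[-1])  # None Thanks
```

The threshold keeps low-confidence predictions from being displayed; its value would need tuning against real footage.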

Prediction Results:

Author

Rishikesh Sivakumar
