Audio2Face - Audio To Face With Python

Last update: Dec 26, 2022

Overview

Audio2Face

Description

This project transforms audio into blendshape weights and uses them to drive the digital human, xiaomei, in a UE project.

Base Module

The framework we use has three parts. In the formant network stage, we perform fixed-function analysis of the input audio clip. In the articulation network, we concatenate an emotional state vector to the output of each convolution layer after the ReLU activation. The fully connected layers at the end expand the 256+E abstract features into blendshape weights.
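The emotion concatenation and the final expansion described above can be illustrated with a minimal NumPy sketch. This is not the project's TensorFlow code: the emotion size `E = 16`, the blendshape count `N_BLENDSHAPES = 10`, the tensor layout, and the random weights are all assumptions chosen only to show the shapes involved.

```python
import numpy as np

E = 16              # hypothetical size of the emotional state vector
N_BLENDSHAPES = 10  # hypothetical number of output blendshape weights

def relu(x):
    return np.maximum(x, 0.0)

def concat_emotion(features, emotion):
    """Append the emotion vector to every time step of a feature map.

    features: (batch, time, channels), emotion: (batch, E)
    returns:  (batch, time, channels + E)
    """
    batch, time, _ = features.shape
    # Repeat the emotion vector along the time axis, then join on the channel axis.
    tiled = np.broadcast_to(emotion[:, None, :], (batch, time, emotion.shape[1]))
    return np.concatenate([features, tiled], axis=-1)

def dense(x, w, b):
    """A plain fully connected layer: x @ w + b."""
    return x @ w + b

rng = np.random.default_rng(0)
conv_out = relu(rng.standard_normal((2, 8, 256)))  # stand-in for a convolution output
emotion = rng.standard_normal((2, E))

augmented = concat_emotion(conv_out, emotion)
print(augmented.shape)  # (2, 8, 272)

# Assume the time axis has already been reduced to a single step.
final_features = augmented[:, 0, :]                  # (batch, 256 + E)
w = rng.standard_normal((256 + E, N_BLENDSHAPES)) * 0.01
b = np.zeros(N_BLENDSHAPES)
blendshape_weights = dense(final_features, w, b)     # (batch, N_BLENDSHAPES)
```

Concatenating after the activation keeps the emotion values unclipped, so every later layer sees the same emotion vector alongside the learned features.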

Usage

This pipeline shows how we use FACEGOOD Audio2Face.

Test video

Prepare data

step1: Record voice and video, and create the animation from the video in Maya. Note: the voice must contain vowels, exaggerated talking, and normal talking. The dialogue should cover as many pronunciations as possible.
step2: Process the voice with LPC to split it into segment frames that correspond to the animation frames in Maya.
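As a rough illustration of step2, the sketch below cuts a mono signal into one chunk per animation frame and computes LPC coefficients for each chunk with the autocorrelation method and the Levinson-Durbin recursion. It is not the code in step1_LPC.py: the sample rate, the frame rate, the LPC order, the Hann window, and the function names are assumptions made for the example.

```python
import numpy as np

def split_into_frames(signal, sample_rate, fps):
    """Cut a mono signal into consecutive chunks, one per animation frame."""
    samples_per_frame = sample_rate // fps
    n_frames = len(signal) // samples_per_frame
    trimmed = signal[: n_frames * samples_per_frame]  # drop the incomplete tail
    return trimmed.reshape(n_frames, samples_per_frame)

def lpc(frame, order):
    """LPC coefficients a[0..order] with a[0] = 1 (autocorrelation method).

    The model is frame[n] ~ -sum(a[k] * frame[n - k] for k in 1..order).
    """
    windowed = frame * np.hanning(len(frame))
    n = len(windowed)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(windowed[: n - k], windowed[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: keep the remaining coefficients at zero
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k                  # updated prediction error
    return a

SAMPLE_RATE = 16000  # hypothetical audio sample rate
FPS = 30             # hypothetical animation frame rate
ORDER = 32           # hypothetical LPC order

rng = np.random.default_rng(0)
audio = rng.standard_normal(SAMPLE_RATE)  # one second of synthetic audio
frames = split_into_frames(audio, SAMPLE_RATE, FPS)
coeffs = np.stack([lpc(f, ORDER) for f in frames])
print(frames.shape, coeffs.shape)  # (30, 533) (30, 33)
```

Each row of `coeffs` then lines up with one animation frame, which is the correspondence the training data needs.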

Input data

Use ExportBsWeights.py to export the weights file from Maya. This produces BS_name.npy and BS_value.npy.

Use step1_LPC.py to process the wav file and get lpc_*.npy. This preprocesses the wav into 2D data.
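Before training, it can help to sanity-check the exported files. The sketch below assumes that BS_name.npy holds a 1D array of blendshape names and BS_value.npy holds a 2D array with one row per animation frame and one column per name. That layout is an assumption, not something the scripts document here, and the helper name and the sample blendshape names are hypothetical. The demo writes small stand-in files to a temporary folder so it runs without the real data.

```python
import os
import tempfile
import numpy as np

def load_blendshape_weights(name_path, value_path):
    """Load names and per-frame values, checking that their shapes agree.

    Assumed layout: names is 1D, values is (frames, len(names)).
    """
    names = np.load(name_path)
    values = np.load(value_path)
    if values.ndim != 2 or values.shape[1] != len(names):
        raise ValueError(
            f"expected values of shape (frames, {len(names)}), got {values.shape}"
        )
    return [str(n) for n in names], values

with tempfile.TemporaryDirectory() as tmp:
    name_path = os.path.join(tmp, "BS_name.npy")
    value_path = os.path.join(tmp, "BS_value.npy")
    # Stand-in data: two hypothetical blendshapes over three frames.
    np.save(name_path, np.array(["jawOpen", "mouthSmile"]))
    np.save(value_path, np.array([[0.0, 0.1], [0.5, 0.2], [1.0, 0.3]]))
    names, values = load_blendshape_weights(name_path, value_path)

print(names, values.shape)
```

With real data, the number of rows in the values array should match the number of LPC frames produced for the same recording.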

Train

We recommend using FACEGOOD Avatary to produce training data. It is fast and accurate. http://www.avatary.com

The training data is stored in dataSet1.

python step14_train.py --epochs 8 --dataSet dataSet1

Test

In the /test folder, we supply a test application named AiSpeech.
We provide a pretrained model, zsmeif.pb.
In the /example/ueExample folder, we provide a packaged UE project that contains a digital human created by FACEGOOD, which can be driven by /AiSpeech/zsmeif.py.

Follow the steps below to use it:

Make sure a microphone is connected to the computer.
Run the script in a terminal.

python zsmeif.py
When the terminal shows the message "run main", run FaceGoodLiveLink.exe, which is located in the /example/ueExample/ folder.
Click and hold the left mouse button on the screen in the UE project, then talk with the AI model and wait for the voice and animation response.

Dependencies

tensorflow-gpu 1.15

python-libs: pyaudio requests websocket websocket-client
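A quick way to confirm the environment before running zsmeif.py is to check that each dependency can be imported. The sketch below is a convenience check, not part of the project. The mapping from package name to import name is our own: websocket-client is imported as `websocket`, and PyAudio as `pyaudio`. The check only tests that a module exists and does not verify the TensorFlow version.

```python
import importlib.util

# Package name as listed above -> module name used in an import statement.
REQUIRED_MODULES = {
    "tensorflow-gpu": "tensorflow",
    "pyaudio": "pyaudio",
    "requests": "requests",
    "websocket-client": "websocket",
}

def find_missing(modules):
    """Return the package names whose import module cannot be found."""
    return [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]

missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed dependencies are importable.")
```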

Data

The testing data, Maya model, and UE4 test project can be downloaded from the link below.

data_all code : n6ty

GoogleDrive

Reference

Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion

Contact

Wechat: FACEGOOD_CHINA
Email：[email protected]
Discord: https://discord.gg/V46y6uTdw8

License

Audio2Face Core is released under the terms of the MIT license. See COPYING for more information or see https://opensource.org/licenses/MIT.

Owner

FACEGOOD
