VisemeNet is an audio-driven animator centric speech animation driving a JALI or standard FACS-based face-rigging from input audio.
The original repo is outdated and difficult to setup the environment for testing the pretrained model. This code is to provide a super-clean inference module based on the original author's repo.
How to freeze graph
This repo does not need bazel-build for "freeze-graph" function
The DETR approach applies the transformer encoder and decoder architecture to object detection and achieves promising performance. In this paper, we handle the critical issue, slow training convergen