The official implementation of VAENAR-TTS, a VAE based non-autoregressive TTS model.

Last update: Oct 28, 2022

Related tags

Overview

VAENAR-TTS

This repo contains code accompanying the paper "VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis".

Samples | Paper | Pretrained Models

Usage

0. Dataset

English: LJSpeech
Mandarin: DataBaker(标贝)

1. Environment setup

conda env create -f environment.yml
conda activate vaenartts-env

2. Data pre-processing

For English using LJSpeech:

CUDA_VISIBLE_DEVICES= python preprocess.py --dataset ljspeech --data_dir /path/to/extracted/LJSpeech-1.1 --save_dir ./ljspeech

For Mandarin using Databaker(标贝):

CUDA_VISIBLE_DEVICES= python preprocess.py --dataset databaker --data_dir /path/to/extracted/biaobei --save_dir ./databaker

3. Training

For English using LJSpeech:

CUDA_VISIBLE_DEVICES=0 TF_FORCE_GPU_ALLOW_GROWTH=true python train.py --dataset ljspeech --log_dir ./lj-log_dir --test_dir ./lj-test_dir --data_dir ./ljspeech/tfrecords/ --model_dir ./lj-model_dir

For Mandarin using Databaker(标贝):

CUDA_VISIBLE_DEVICES=0 TF_FORCE_GPU_ALLOW_GROWTH=true python train.py --dataset databaker --log_dir ./db-log_dir --test_dir ./db-test_dir --data_dir ./databaker/tfrecords/ --model_dir ./db-model_dir

4. Inference (synthesize speech for the whole test set)

For English using LJSpeech:

CUDA_VISIBLE_DEVICES=0 TF_FORCE_GPU_ALLOW_GROWTH=true python inference.py --dataset ljspeech --test_dir ./lj-test-2000 --data_dir ./ljspeech/tfrecords/ --batch_size 16 --write_wavs true --draw_alignments true --ckpt_path ./lj-model_dir/ckpt-2000

For Mandarin using Databaker(标贝):

CUDA_VISIBLE_DEVICES=0 TF_FORCE_GPU_ALLOW_GROWTH=true python inference.py --dataset databaker --test_dir ./db-test-2000 --data_dir ./databaker/tfrecords/ --batch_size 16 --write_wavs true --draw_alignments true --ckpt_path ./db-model_dir/ckpt-2000

The official implementation of VAENAR-TTS, a VAE based non-autoregressive TTS model.

Related tags

Overview

VAENAR-TTS

Samples | Paper | Pretrained Models

Usage

0. Dataset

1. Environment setup

2. Data pre-processing

3. Training

4. Inference (synthesize speech for the whole test set)

Reference

Owner

THUHCSI

Official code of ICCV2021 paper "Residual Attention: A Simple but Effective Method for Multi-Label Recognition"

Pytorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"

Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021)

Supplementary code for TISMIR paper "Sliding-Window Pitch-Class Histograms as a Means of Modeling Musical Form"

Embeddinghub is a database built for machine learning embeddings.

This is the official implementation of 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection, built on SECOND.

Sarus implementation of classical ML models. The models are implemented using the Keras API of tensorflow 2. Vizualization are implemented and can be seen in tensorboard.

Robust Consistent Video Depth Estimation

GAN-generated image detection based on CNNs

Plug and play transformer you can find network structure and official complete code by clicking List

Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language (NeurIPS 2021)

MAg: a simple learning-based patient-level aggregation method for detecting microsatellite instability from whole-slide images

Geometry-Free View Synthesis: Transformers and no 3D Priors

McGill Physics Hackathon 2021: Reaction-Diffusion Models for the Generation of Biological Patterns

An efficient implementation of GPNN

Official implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

YoloAll is a collection of yolo all versions. you you use YoloAll to test yolov3/yolov5/yolox/yolo_fastest

This repo contains the implementation of YOLOv2 in Keras with Tensorflow backend.

Position detection system of mobile robot in the warehouse enviroment

Generating Band-Limited Adversarial Surfaces Using Neural Networks