AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data

Last update: Dec 28, 2022


Overview

AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data [WIP]

Unofficial PyTorch implementation of AdaSpeech 2.

Requirements :

All code is written in Python 3.6.2.

Install PyTorch

Before installing PyTorch, check your CUDA version by running the following command : nvcc --version

pip install torch torchvision

This repo uses PyTorch 1.6.0 for the torch.bucketize function, which is not available in earlier versions of PyTorch.
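For readers unfamiliar with torch.bucketize, here is a minimal pure-Python sketch of what it computes with its default right=False setting: each value is mapped to the index of the bin it falls into. The boundary and pitch values are made up for illustration and are not taken from this repo.

```python
from bisect import bisect_left


def bucketize(values, boundaries):
    """Pure-Python analogue of torch.bucketize(values, boundaries), right=False.

    Returns, for each value v, the index i such that
    boundaries[i-1] < v <= boundaries[i].
    """
    return [bisect_left(boundaries, v) for v in values]


# Hypothetical bin boundaries (Hz) and pitch values, for illustration only.
boundaries = [100.0, 200.0, 300.0]
pitch = [80.0, 100.0, 150.0, 250.0, 400.0]

bin_ids = bucketize(pitch, boundaries)
print(bin_ids)  # [0, 0, 1, 2, 3]
```

In FastSpeech 2 style models, bin indices like these are typically used to look up a pitch or energy embedding. Whether this repo uses them exactly that way is an assumption.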

Installing other requirements :

pip install -r requirements.txt

To use TensorBoard, install tensorboard version 1.14.0 separately, along with the supported tensorflow version (1.14.0).

For Preprocessing :

The filelists folder contains LJSpeech dataset files already processed with MFA (Montreal Forced Aligner), so you don't need to align text with audio (to extract durations) for the LJSpeech dataset. For other datasets, follow the instructions here. For the remaining pre-processing, run the following command :

python nvidia_preprocessing.py -d path_of_wavs
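If you align another dataset yourself, the aligner gives phoneme intervals in seconds, which then need to be converted to per-phoneme frame counts. Below is a rough sketch of that conversion, not this repo's code. The function name is hypothetical, 22050 Hz is the LJSpeech sampling rate, and the hop length of 256 is an assumption, so check hparams.py for the values actually used.

```python
def intervals_to_frames(intervals, sampling_rate=22050, hop_length=256):
    """Convert (start_sec, end_sec) phoneme intervals to frame counts.

    Interval boundaries are rounded to frame indices first, so the
    durations always sum to the frame index of the last boundary.
    """
    durations = []
    for start, end in intervals:
        start_frame = round(start * sampling_rate / hop_length)
        end_frame = round(end * sampling_rate / hop_length)
        durations.append(end_frame - start_frame)
    return durations


# Hypothetical alignment for three phonemes (times in seconds).
intervals = [(0.0, 0.1), (0.1, 0.25), (0.25, 0.5)]
durations = intervals_to_frames(intervals)
print(durations)  # [9, 13, 21]
```

Rounding the boundaries rather than each interval length avoids accumulated drift between the summed durations and the length of the mel spectrogram.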

To find the min and max of F0 and energy, run :

python compute_statistics.py

Update the following in hparams.py with the min and max of F0 and energy :

p_min = Min F0/pitch
p_max = Max F0
e_min = Min energy
e_max = Max energy
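As an illustration of what these four values represent, here is a small sketch that computes a global min and max over per-utterance F0 and energy sequences. It is not the contents of compute_statistics.py: the helper name and all numbers are invented, and special cases such as unvoiced (zero F0) frames are not handled.

```python
def global_min_max(contours):
    """Return (min, max) over all values in a list of per-utterance sequences."""
    values = [v for contour in contours for v in contour]
    return min(values), max(values)


# Made-up per-utterance F0 (Hz) and energy values, for illustration only.
f0_contours = [[110.5, 220.0, 180.2], [95.0, 310.4]]
energy_contours = [[0.02, 1.5], [0.3, 2.7, 0.9]]

p_min, p_max = global_min_max(f0_contours)
e_min, e_max = global_min_max(energy_contours)

# These are the four values to copy into hparams.py.
print(f"p_min = {p_min}")
print(f"p_max = {p_max}")
print(f"e_min = {e_min}")
print(f"e_max = {e_max}")
```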

Training :

[WIP]

Citations :

@misc{chen2021adaspeech,
      title={AdaSpeech: Adaptive Text to Speech for Custom Voice}, 
      author={Mingjian Chen and Xu Tan and Bohan Li and Yanqing Liu and Tao Qin and Sheng Zhao and Tie-Yan Liu},
      year={2021},
      eprint={2103.00993},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}

@misc{yan2021adaspeech,
      title={AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data}, 
      author={Yuzi Yan and Xu Tan and Bohan Li and Tao Qin and Sheng Zhao and Yuan Shen and Tie-Yan Liu},
      year={2021},
      eprint={2104.09715},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}


Owner

Rishikesh (ऋषिकेश)
