Pytorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"

Last update: Dec 23, 2022

Related tags

Overview

GradTTS

Unofficial Pytorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech" (arxiv)

About this repo

This is an unofficial implementation of GradTTS. We created this project based on GlowTTS (https://github.com/jaywalnut310/glow-tts). We replace the GlowDecoder with DiffusionDecoder which follows the settings of the original paper. In addition, we also replace torch.distributed with horovod for convenience and we don't use fp16 now.

Training and inference

Please go to egs/ folder, and see run.sh and inference_waveglow_vocoder.py for example use. Before training, please download and extract the LJ Speech dataset, then rename or create a link to the dataset folder: ln -s /path/to/LJSpeech-1.1/wavs DUMMY. And build Monotonic Alignment Search Code (Cython): cd monotonic_align; python setup.py build_ext --inplace. Before inference, you should download waveglow checkpoint from download_link and put it into the waveglow folder.

Reference Materials

Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech

GlowTTS

Score-Based Generative Modeling through Stochastic Differential Equations

score_sde_pytorch

denoising-diffusion-pytorch

Authors

Heyang Xue(https://github.com/WelkinYang) and Qicong Xie(https://github.com/QicongXie)

Pytorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"

Related tags

Overview

GradTTS

Unofficial Pytorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech" (arxiv)

About this repo

Training and inference

Reference Materials

Authors

Owner

HeyangXue1997

Example repository for custom C++/CUDA operators for TorchScript

Pairwise model for commonlit competition

This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images

This is an official implementation of the CVPR2022 paper "Blind2Unblind: Self-Supervised Image Denoising with Visible Blind Spots".

Official implementation of CVPR2020 paper "Deep Generative Model for Robust Imbalance Classification"

[ICCV21] Self-Calibrating Neural Radiance Fields

Provided is code that demonstrates the training and evaluation of the work presented in the paper: "On the Detection of Digital Face Manipulation" published in CVPR 2020.

LSSY量化交易系统

source code for https://arxiv.org/abs/2005.11248 "Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics"

Code for the ECCV2020 paper "A Differentiable Recurrent Surface for Asynchronous Event-Based Data"

Official implementation of our CVPR2021 paper "OTA: Optimal Transport Assignment for Object Detection" in Pytorch.

Python package for downloading ECMWF reanalysis data and converting it into a time series format.

One Million Scenes for Autonomous Driving

Exact Pareto Optimal solutions for preference based Multi-Objective Optimization

ICCV2021 Oral SA-ConvONet: Sign-Agnostic Optimization of Convolutional Occupancy Networks

Repository for the Bias Benchmark for QA dataset.

FaceAnon - Anonymize people in images and videos using yolov5-crowdhuman

This repository contains the code for the paper in EMNLP 2021: "HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression".

The code for the NeurIPS 2021 paper "A Unified View of cGANs with and without Classifiers".