PyTorch implementation of Tacotron speech synthesis model.

Last update: Dec 09, 2022

Overview

tacotron_pytorch

Inspired by keithito/tacotron. The speech quality is currently not as good as what keithito/tacotron can generate, but the model seems to be basically working. Some generated speech examples, trained on the LJ Speech Dataset, are available here.

If you are comfortable working with TensorFlow, I'd recommend trying https://github.com/keithito/tacotron instead. The reason for rewriting it in PyTorch is that it is easier to debug and extend (multi-speaker architecture, etc.), at least for me.

Requirements

PyTorch
TensorFlow (only needed to run the training script; this could be made optional, but it is required for now)

Installation

git clone --recursive https://github.com/r9y9/tacotron_pytorch
cd tacotron_pytorch
pip install -e . # or python setup.py develop

To run the training script, you also need to install the additional dependencies:

pip install -e ".[train]"

Training

The package relies on keithito/tacotron for text processing, audio preprocessing and audio reconstruction (added as a submodule). Please follow the quick start section at https://github.com/keithito/tacotron and prepare your dataset accordingly.

Once your data is prepared, and assuming it is in "~/tacotron/training" (the default), you can train your model with:

python train.py
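
Before starting a long training run, it can help to confirm that the prepared data is where the script expects it. The following is a minimal sketch, not part of this package: check_training_dir is a hypothetical helper, and the metadata file name "train.txt" (with one example per line) is an assumption about what the keithito/tacotron preprocessing step writes, so adjust it if your output differs.

```python
import os
import tempfile


def check_training_dir(data_root="~/tacotron/training", metadata_name="train.txt"):
    """Return the number of non-empty metadata lines, or raise if the layout looks wrong.

    metadata_name is an assumption; this README does not specify the file name.
    """
    root = os.path.expanduser(data_root)
    if not os.path.isdir(root):
        raise FileNotFoundError("training directory not found: " + root)
    metadata_path = os.path.join(root, metadata_name)
    if not os.path.isfile(metadata_path):
        raise FileNotFoundError("metadata file not found: " + metadata_path)
    # Count one training example per non-empty line.
    with open(metadata_path, encoding="utf-8") as f:
        n_examples = sum(1 for line in f if line.strip())
    if n_examples == 0:
        raise ValueError("metadata file is empty: " + metadata_path)
    return n_examples


if __name__ == "__main__":
    # Demonstrate on a throwaway directory so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "train.txt"), "w", encoding="utf-8") as f:
            f.write("spec-00001.npy|mel-00001.npy|120|hello world\n")
        print(check_training_dir(tmp))
```

Calling check_training_dir() with no arguments checks the default location mentioned above.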

Alignment, predicted spectrogram, target spectrogram, predicted waveform and checkpoint (model and optimizer states) are saved every 1000 global steps in the checkpoints directory. Training progress can be monitored with:

tensorboard --logdir=log

Testing model

Open the notebook in the notebooks directory and change checkpoint_path to point to your model.
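
If you are unsure which file to use for checkpoint_path, a small helper can pick the most recently written checkpoint. This is only a sketch under stated assumptions: latest_checkpoint is a hypothetical function, and both the "checkpoints" directory default and the ".pth" suffix are guesses that may need adjusting to match the files train.py actually produces.

```python
import os


def latest_checkpoint(checkpoint_dir="checkpoints", suffix=".pth"):
    """Return the most recently modified file ending in suffix, or None if there is none.

    The suffix is an assumption; the directory also holds images and audio,
    so filtering by extension keeps those out of the candidates.
    """
    if not os.path.isdir(checkpoint_dir):
        return None
    candidates = [
        os.path.join(checkpoint_dir, name)
        for name in os.listdir(checkpoint_dir)
        if name.endswith(suffix)
    ]
    if not candidates:
        return None
    # Newest modification time wins.
    return max(candidates, key=os.path.getmtime)
```

In the notebook, this could be used as checkpoint_path = latest_checkpoint(), falling back to a hand-written path when it returns None.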


Owner

Ryuichi Yamamoto
