PyTorch implementation of Tacotron

Last update: Dec 02, 2022

Overview

Tacotron-pytorch

A PyTorch implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model.

Requirements

Install Python 3
Install PyTorch == 0.2.0
Install the remaining requirements:
```
pip install -r requirements.txt
```

Data

I used the LJSpeech dataset, which consists of pairs of text scripts and wav files. The complete dataset (13,100 pairs) can be downloaded here. I referred to https://github.com/keithito/tacotron for the preprocessing code.
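
As a rough illustration of what "pairs of text script and wav files" means in practice, the sketch below reads the `metadata.csv` file that ships with LJSpeech, where each line holds a clip id and its transcripts separated by `|`. The function name `load_ljspeech_pairs` is hypothetical and is not part of this repository; the actual loading code lives in data.py and may work differently.

```python
import csv
from pathlib import Path


def load_ljspeech_pairs(data_path):
    """Return (wav_path, text) pairs listed in LJSpeech's metadata.csv.

    Hypothetical helper for illustration; not the repository's loader.
    """
    root = Path(data_path)
    pairs = []
    with open(root / "metadata.csv", encoding="utf-8", newline="") as f:
        # Fields are pipe-separated. Quoting is disabled because the
        # transcripts may contain literal double quotes.
        reader = csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 2:
                continue  # skip blank or malformed lines
            clip_id = row[0]
            text = row[-1]  # the last field is the normalized transcript
            pairs.append((root / "wavs" / f"{clip_id}.wav", text))
    return pairs
```

Calling `load_ljspeech_pairs("/path/to/LJSpeech-1.1")` on the full dataset should give one entry per clip.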

File description

hyperparams.py includes all the hyperparameters that are needed.
data.py loads the training data and preprocesses text into index sequences and wav files into spectrograms. The preprocessing code for text is in the text/ directory.
module.py contains all methods, including CBHG, highway, prenet, and so on.
network.py contains the networks, including the encoder, decoder, and post-processing network.
train.py is for training.
synthesis.py is for generating TTS samples.
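
To make the "text to index" preprocessing step concrete, here is a minimal character-level sketch. The symbol table, the `PAD`/`EOS` markers, and the function names are all assumptions made for illustration; the real symbol set and cleaners are defined in the text/ directory and will differ.

```python
# Hypothetical symbol table; the repository's text/ package defines its own.
PAD = "_"
EOS = "~"
SYMBOLS = [PAD, EOS] + list("abcdefghijklmnopqrstuvwxyz !',.?-")
SYMBOL_TO_ID = {s: i for i, s in enumerate(SYMBOLS)}


def text_to_sequence(text):
    """Map text to a list of integer ids, ending with the EOS id."""
    ids = [
        SYMBOL_TO_ID[ch]
        for ch in text.lower()
        # Drop unknown characters and the reserved PAD/EOS markers.
        if ch in SYMBOL_TO_ID and ch not in (PAD, EOS)
    ]
    ids.append(SYMBOL_TO_ID[EOS])
    return ids


def pad_batch(sequences):
    """Right-pad id lists with the PAD id so they share one length."""
    max_len = max(len(seq) for seq in sequences)
    pad_id = SYMBOL_TO_ID[PAD]
    return [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
```

Padding to a common length is what lets a batch of variable-length sentences be stacked into a single tensor before it is fed to the encoder.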

Training the network

STEP 1. Download and extract the LJSpeech data into any directory you want.
STEP 2. Adjust the hyperparameters in hyperparams.py, especially 'data_path', which is the directory where you extracted the files, and the others if necessary.
STEP 3. Run train.py.
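
Before starting a long training run, it can help to confirm that 'data_path' really points at the extracted dataset. The sketch below assumes the standard LJSpeech layout (a `metadata.csv` file next to a `wavs/` folder); `check_data_path` is a hypothetical helper, not something train.py provides.

```python
from pathlib import Path


def check_data_path(data_path):
    """Return a list of problems found under data_path (empty if none).

    Assumes the standard LJSpeech layout: metadata.csv plus a wavs/ folder.
    """
    root = Path(data_path)
    problems = []
    if not (root / "metadata.csv").is_file():
        problems.append(f"missing {root / 'metadata.csv'}")
    wav_dir = root / "wavs"
    if not wav_dir.is_dir():
        problems.append(f"missing {wav_dir}")
    elif not any(wav_dir.glob("*.wav")):
        problems.append(f"no .wav files in {wav_dir}")
    return problems
```

An empty list means the directory looks usable; otherwise each entry names what is missing.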

Generate TTS wav file

STEP 1. Run synthesis.py. Make sure the restore step is set to the training step of the checkpoint you want to load.
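
If you are unsure which restore steps are available, a small script can list them from the checkpoint files on disk. The file naming used below (`checkpoint_<step>.pth.tar`) is an assumption for illustration only; check how train.py names its checkpoints and adjust the pattern accordingly.

```python
import re
from pathlib import Path

# Hypothetical naming scheme; adjust the pattern to match your checkpoint files.
CHECKPOINT_PATTERN = re.compile(r"checkpoint_(\d+)\.pth\.tar$")


def available_steps(checkpoint_dir):
    """Return the sorted training steps that have a saved checkpoint."""
    steps = []
    for path in Path(checkpoint_dir).iterdir():
        match = CHECKPOINT_PATTERN.match(path.name)
        if match:
            steps.append(int(match.group(1)))
    return sorted(steps)
```

The last element of the returned list is the most recent checkpoint, which is usually the one to restore for synthesis.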

Samples

You can check the generated samples in the 'samples/' directory. The model was trained for only 60K steps, so the performance is not good yet.

Reference

Keith Ito: https://github.com/keithito/tacotron

Comments

Any comments on the code are always welcome.

Owner

soobin seo