A PyTorch Implementation of ClariNet

Last update: Sep 15, 2022

Overview

ClariNet

A PyTorch implementation of the ClariNet vocoder (Mel Spectrogram --> Waveform)

Requirements

PyTorch 0.4.1, Python 3.6, and Librosa

Examples

Step 1. Download Dataset

LJSpeech : https://keithito.com/LJ-Speech-Dataset/

Step 2. Preprocessing (Preparing Mel Spectrogram)

python preprocessing.py --in_dir ljspeech --out_dir DATASETS/ljspeech
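
For readers new to mel spectrograms, the sketch below shows how frequency bands are spaced on the mel scale. It is an illustration only, not the code in preprocessing.py: the HTK-style formula, the 80 bands, and the function names are assumptions made for this example. The 11025 Hz upper limit is half of LJSpeech's 22050 Hz sampling rate.

```python
import math


def hz_to_mel(freq_hz):
    """Convert Hz to mel with the HTK-style formula (an assumption here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(num_mels, fmin_hz, fmax_hz):
    """Center frequencies (Hz) of num_mels bands evenly spaced in mel."""
    lo, hi = hz_to_mel(fmin_hz), hz_to_mel(fmax_hz)
    step = (hi - lo) / (num_mels + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(num_mels)]


# 80 bands is a hypothetical choice; 11025 Hz is the Nyquist frequency
# for 22050 Hz audio such as LJSpeech.
centers = mel_band_centers(num_mels=80, fmin_hz=0.0, fmax_hz=11025.0)
print("lowest band center (Hz):", round(centers[0], 1))
print("highest band center (Hz):", round(centers[-1], 1))
```

The bands are narrow at low frequencies and wide at high frequencies, which is why a mel spectrogram is a compact conditioning input for a vocoder.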

Step 3. Train Gaussian Autoregressive WaveNet (Teacher)

python train.py --model_name wavenet_gaussian --batch_size 8 --num_blocks 2 --num_layers 10
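
The --num_blocks and --num_layers flags determine how much past audio the teacher can see. The sketch below estimates the receptive field under the usual WaveNet convention, which is assumed here rather than taken from this repository: dilations double in each layer (1, 2, 4, ...) and reset at the start of each block. The kernel size is also an assumption and is left as a parameter.

```python
def receptive_field(num_blocks, num_layers, kernel_size=2):
    """Receptive field (in samples) of a stack of dilated causal convolutions.

    Assumes dilation 2**i in layer i of every block, the common WaveNet
    layout; the actual model in this repository may differ.
    """
    dilations = [2 ** i for _ in range(num_blocks) for i in range(num_layers)]
    # Each layer extends the field by (kernel_size - 1) * dilation samples.
    return (kernel_size - 1) * sum(dilations) + 1


rf = receptive_field(num_blocks=2, num_layers=10)
print("receptive field (samples):", rf)
print("receptive field (ms at 22050 Hz):", round(1000.0 * rf / 22050.0, 1))
```

With a hypothetical kernel size of 3, the same settings roughly double the receptive field, so it is worth checking the model definition before relying on these numbers.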

Step 4. Synthesize (Teacher)

--load_step CHECKPOINT : the global training step of the pre-trained teacher model to load (also shown in the trained weight file)

python synthesize.py --model_name wavenet_gaussian --num_blocks 2 --num_layers 10 --load_step 10000 --num_samples 5

Step 5. Train Gaussian Inverse Autoregressive Flow (Student)

--teacher_name (YOUR TEACHER MODEL'S NAME)

--teacher_load_step CHECKPOINT : the global training step of the pre-trained teacher model to load (also shown in the trained weight file)

--KL_type qp : Reverse KL divergence KL(q||p), or --KL_type pq : Forward KL divergence KL(p||q)

python train_student.py --model_name wavenet_gaussian_student --teacher_name wavenet_gaussian --teacher_load_step 10000 --batch_size 2 --num_blocks_t 2 --num_layers_t 10 --num_layers_s 10 --KL_type qp
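
The two --KL_type options are not interchangeable, because KL divergence is asymmetric. The sketch below uses the standard closed-form KL divergence between two univariate Gaussians to show the difference; it is not the loss code from train_student.py, and the example means and standard deviations are made up for illustration. Here q stands for the student distribution and p for the teacher distribution.

```python
import math


def kl_gaussian(mu_a, sigma_a, mu_b, sigma_b):
    """Closed-form KL(N(mu_a, sigma_a^2) || N(mu_b, sigma_b^2))."""
    return (
        math.log(sigma_b / sigma_a)
        + (sigma_a ** 2 + (mu_a - mu_b) ** 2) / (2.0 * sigma_b ** 2)
        - 0.5
    )


# Hypothetical per-sample predictions: a narrow student, a wide teacher.
mu_q, sigma_q = 0.0, 0.5  # student (q)
mu_p, sigma_p = 0.0, 2.0  # teacher (p)

kl_qp = kl_gaussian(mu_q, sigma_q, mu_p, sigma_p)  # reverse KL, KL(q||p)
kl_pq = kl_gaussian(mu_p, sigma_p, mu_q, sigma_q)  # forward KL, KL(p||q)
print("KL(q||p):", round(kl_qp, 4))
print("KL(p||q):", round(kl_pq, 4))
```

In this example the forward KL is much larger than the reverse KL, since it penalizes the student heavily for assigning low probability to regions the teacher covers.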

Step 6. Synthesize (Student)

--model_name (YOUR STUDENT MODEL'S NAME)

--load_step CHECKPOINT : the global training step of the pre-trained student model to load (also shown in the trained weight file)

--teacher_name (YOUR TEACHER MODEL'S NAME)

--teacher_load_step CHECKPOINT : the global training step of the pre-trained teacher model to load (also shown in the trained weight file)

python synthesize_student.py --model_name wavenet_gaussian_student --load_step 10000 --teacher_name wavenet_gaussian --teacher_load_step 10000 --num_blocks_t 2 --num_layers_t 10 --num_layers_s 10 --num_samples 5
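
The student is an inverse autoregressive flow, so every output sample depends only on the input noise and never on earlier output samples. The toy sketch below illustrates that property with a single affine flow step. The toy_shift_and_scale function is a hypothetical stand-in for the student network and has nothing to do with the trained model.

```python
import math
import random


def toy_shift_and_scale(z, t):
    """Hypothetical stand-in for the student network; reads only z[:t]."""
    context = sum(z[:t]) / t if t > 0 else 0.0
    mu = 0.1 * context
    sigma = math.exp(-0.5 + 0.1 * math.tanh(context))
    return mu, sigma


def iaf_transform(z):
    """Apply x_t = z_t * sigma_t(z_<t) + mu_t(z_<t) for every t.

    No step reads an earlier x, so all steps could run in parallel;
    the Python loop is sequential only for readability.
    """
    x = []
    for t in range(len(z)):
        mu, sigma = toy_shift_and_scale(z, t)
        x.append(z[t] * sigma + mu)
    return x


rng = random.Random(0)  # fixed seed so the sketch is reproducible
z = [rng.gauss(0.0, 1.0) for _ in range(8)]  # white-noise input
x = iaf_transform(z)
print("generated", len(x), "samples from", len(z), "noise values")
```

By contrast, the autoregressive teacher must generate one sample at a time, because each sample is conditioned on the samples generated before it.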

References

WaveNet vocoder : https://github.com/r9y9/wavenet_vocoder
ClariNet : https://arxiv.org/abs/1807.07281

Owner

Sungwon Kim
