Jax/Flax implementation of Variational-DiffWave.

Last update: Dec 16, 2022

Overview

jax-variational-diffwave

Jax/Flax implementation of Variational-DiffWave (Zhifeng Kong et al., 2020; Diederik P. Kingma et al., 2021).

DiffWave with Continuous-time Variational Diffusion Models.
DiffWave: A Versatile Diffusion Model for Audio Synthesis, Zhifeng Kong et al., 2020. [arXiv:2009.09761]
Variational Diffusion Models, Diederik P. Kingma et al., 2021. [arXiv:2107.00630]
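
As a rough illustration of the continuous-time formulation in Variational Diffusion Models, the sketch below draws a noised sample z_t = alpha_t * x + sigma_t * eps for a variance-preserving process, with alpha_t^2 = sigmoid(-gamma(t)) and sigma_t^2 = sigmoid(gamma(t)). It is not taken from this repository: the function names gamma_linear and diffuse, the fixed linear schedule, its endpoint values, and the dummy audio shape are all assumptions made for the example.

```python
import jax
import jax.numpy as jnp


def gamma_linear(t, gamma_min=-13.3, gamma_max=5.0):
    """Hypothetical fixed linear schedule for gamma(t) = -log SNR(t), t in [0, 1]."""
    return gamma_min + (gamma_max - gamma_min) * t


def diffuse(key, x, t):
    """Sample z_t ~ N(alpha_t * x, sigma_t^2 * I) for a variance-preserving process."""
    # One gamma per example, broadcast over the time axis: [B] -> [B, 1].
    gamma = gamma_linear(t)[:, None]
    alpha = jnp.sqrt(jax.nn.sigmoid(-gamma))  # alpha_t^2 = sigmoid(-gamma)
    sigma = jnp.sqrt(jax.nn.sigmoid(gamma))   # sigma_t^2 = sigmoid(gamma)
    eps = jax.random.normal(key, x.shape)
    return alpha * x + sigma * eps, eps


key_x, key_t, key_eps = jax.random.split(jax.random.PRNGKey(0), 3)
# Dummy audio batch [B, T]; real training data would come from the dataset.
x = jax.random.uniform(key_x, (2, 16000), minval=-1.0, maxval=1.0)
# Continuous diffusion times, one per example.
t = jax.random.uniform(key_t, (2,))
z_t, eps = diffuse(key_eps, x, t)
```

A denoising network would then be trained to predict eps from z_t, the diffusion time, and the mel-spectrogram condition; the network itself is omitted here.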

Requirements

Tested in a Python 3.7.9 conda environment; dependencies are listed in requirements.txt.

Usage

To train the model, run train.py.
Checkpoints are written to the path set in TrainConfig.ckpt, and TensorBoard summaries to the path set in TrainConfig.log.

python train.py --data-dir /datasets/ljspeech --from-raw
tensorboard --logdir ./log/

To resume training from a previous checkpoint, pass --load-epoch together with the saved config, as in the command below.

python train.py --load-epoch 10 --config ./ckpt/l1.json

[WIP] To synthesize the test set, run synth.py.

python synth.py

[WIP] Pretrained checkpoints are released on the releases page.

To use a pretrained model, download the files and unzip them.
Check out the git repository at the matching commit tag; the following is a sample script.

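import json

import jax

# Config and VLBDiffWaveApp come from this repository; import them before running.
with open('l1.json') as f:
    config = Config.load(json.load(f))

# build the model and restore the pretrained weights
diffwave = VLBDiffWaveApp(config.model)
diffwave.restore('./l1/l1_99.ckpt')

# mel: [B, T, mel]
# sample audio with 50 steps and a fixed random key
audio, _ = diffwave(mel, timesteps=50, key=jax.random.PRNGKey(0))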
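
For intuition about the timesteps argument, the sketch below builds the discretization used for sampling in Variational Diffusion Models, where step i of T moves from time t = i / T to s = (i - 1) / T. Whether this repository discretizes time in exactly this way is an assumption; time_pairs is a hypothetical helper written for this example and is not part of the codebase.

```python
import jax.numpy as jnp


def time_pairs(timesteps):
    """Return (t, s) with t = i / T and s = (i - 1) / T for i = T, ..., 1."""
    i = jnp.arange(timesteps, 0, -1)
    return i / timesteps, (i - 1) / timesteps


# 50 sampling steps, walking from t = 1 (pure noise) down to s = 0 (data).
t, s = time_pairs(50)
```

Each (t, s) pair corresponds to one denoising update, so a larger timesteps value gives a finer grid at the cost of more network evaluations.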

Owner

YoungJoong Kim
