The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution.

Last update: Jan 03, 2023

Overview

WSRGlow

The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution. Audio samples can be found here.

Feel free to create issues or send an email to [email protected] if you have problems running the code.

Before running the code, install the dependencies by running pip install -r requirements.txt.

The configuration for the model architecture and training scheme is saved in config.yaml. You can override individual attributes by adding the --hparams flag when running a command. The general way to run a Python script is

python $SRC$ --config $CONFIG$ --hparams $KEY1$=$VALUE1$,$KEY2$=$VALUE2$,...

See hparams.py for more details.
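
To make the KEY=VALUE override syntax concrete, here is a minimal sketch of how such a string could be applied on top of a loaded config. This is not the code in hparams.py, which may parse values differently; the helper name apply_overrides and the example keys lr and batch_size are hypothetical.

```python
import ast


def apply_overrides(config, overrides):
    """Return a copy of `config` with KEY=VALUE pairs from `overrides` applied.

    Hypothetical helper: the real parsing lives in hparams.py and may differ.
    """
    result = dict(config)
    if not overrides:
        return result
    for pair in overrides.split(","):
        key, sep, raw = pair.partition("=")
        if not sep:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        key = key.strip()
        raw = raw.strip()
        try:
            # Interpret Python literals such as numbers.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything else (for example a path) stays a plain string.
            value = raw
        result[key] = value
    return result


# Example keys are made up for illustration.
base = {"lr": 0.001, "batch_size": 8, "binary_data_path": "data/binary"}
print(apply_overrides(base, "lr=0.0005,batch_size=16"))
```

Keys that are not overridden keep the value loaded from the config file.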

To prepare data

Before training, you need to binarize the data. Put the raw wav files in hparams['raw_data_path']; the binarized data will be written to hparams['binary_data_path'].

Specifically, for the VCTK corpus, the file structure should look like this:

.
|--data
    |--raw
        |--VCTK-Corpus
            |--wav48
                |--$WAVS
|--checkpoints
    |--wsrglow

where the model checkpoints are in checkpoints/wsrglow.
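
Before binarizing, it can help to confirm that the raw files are where the layout above expects them. The following is a minimal sketch, assuming raw_data_path points at data/raw as the tree suggests; find_wavs is a hypothetical helper and is not part of the repository.

```python
from pathlib import Path


def find_wavs(raw_data_path):
    """Return all .wav files under <raw_data_path>/VCTK-Corpus/wav48, sorted."""
    wav_root = Path(raw_data_path) / "VCTK-Corpus" / "wav48"
    if not wav_root.is_dir():
        raise FileNotFoundError(f"expected directory not found: {wav_root}")
    # Search recursively so wav files in subfolders are included.
    return sorted(wav_root.rglob("*.wav"))


if __name__ == "__main__":
    try:
        # "data/raw" is an assumption; use the value of raw_data_path in your config.
        wavs = find_wavs("data/raw")
        print(f"found {len(wavs)} wav files")
    except FileNotFoundError as err:
        print(err)
```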

The command to binarize is

python binarizer.py --config config.yaml

To modify the architecture of the model

The current WSRGlow model in model.py is designed for x4 super-resolution and takes waveform, spectrogram and phase information as input.
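
As a rough illustration of what "x4" means, the output waveform has four times as many samples as the input. The sketch below only shows that bookkeeping. It uses naive keep-every-4th-sample decimation as a stand-in for whatever resampling the repository actually performs, and the function names are hypothetical.

```python
SCALE = 4  # "x4" super-resolution, as stated for the current model


def naive_decimate(samples, scale=SCALE):
    """Keep every `scale`-th sample (no anti-aliasing filter; illustration only)."""
    return samples[::scale]


def target_length(low_res_length, scale=SCALE):
    """Number of samples a x`scale` model would produce for a given input length."""
    return low_res_length * scale


high_res = list(range(16))           # stand-in for 16 samples of a waveform
low_res = naive_decimate(high_res)   # 4 samples remain
print(low_res, target_length(len(low_res)))
```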

To train

Run python train.py --config config.yaml on a GPU.

To infer

Edit infer.py to set the path of the checkpoint you want to load and the paths of the wav files you want to use as input, then run python infer.py --config config.yaml on a GPU.

Owner

Kexun Zhang
