A PyTorch implementation of MBNET: MOS PREDICTION FOR SYNTHESIZED SPEECH WITH MEAN-BIAS NETWORK

Last update: Dec 28, 2022

Overview

Pytorch-MBNet

Training

To train a new model, run train.py with the following arguments:

--data_path: The path of the directory containing all .wav files of VCC-2018 and the train/dev/test split files (the files in ./data).
--save_dir: The path of the directory to save the trained models. Please create the directory before training.
--total_steps: The total number of training steps.
--valid_steps: Run validation every valid_steps training updates.
--log_steps: Write to the TensorBoard log every log_steps training updates.
--update_freq: The number of steps over which gradients are accumulated. The default value is 1 (no accumulation).
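
As a rough illustration of how these arguments interact, the sketch below declares them with argparse and lists the steps at which logging and validation would fire. It is not the repository's train.py: the function names build_parser and plan_events, the argument types, the ./ckpt path, and every default except --update_freq=1 are assumptions made for the example. It also assumes that one training step is one optimizer update.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the documented flags. The types and all
    # defaults other than --update_freq=1 are assumptions for this sketch.
    parser = argparse.ArgumentParser(description="Sketch of train.py arguments")
    parser.add_argument("--data_path", required=True)
    parser.add_argument("--save_dir", required=True)
    parser.add_argument("--total_steps", type=int, default=1000)
    parser.add_argument("--valid_steps", type=int, default=100)
    parser.add_argument("--log_steps", type=int, default=50)
    parser.add_argument("--update_freq", type=int, default=1)
    return parser


def plan_events(total_steps, valid_steps, log_steps):
    """Return {step: [event, ...]} for steps that trigger logging or validation."""
    events = {}
    for step in range(1, total_steps + 1):
        fired = []
        if step % log_steps == 0:
            fired.append("log")
        if step % valid_steps == 0:
            fired.append("validate")
        if fired:
            events[step] = fired
    return events


# Parse an example command line (paths are placeholders).
args = build_parser().parse_args(
    ["--data_path", "./data", "--save_dir", "./ckpt",
     "--total_steps", "200", "--update_freq", "4"]
)
events = plan_events(args.total_steps, args.valid_steps, args.log_steps)

# Assumption: each step is one optimizer update, so with gradient
# accumulation it consumes update_freq batches.
batches_seen = args.total_steps * args.update_freq
```

Under these assumptions, raising --update_freq increases the number of batches consumed per update without changing when validation or logging happens.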

Testing

To test on VCC-2018, run test.py with the following arguments:

--model_path: The path to the saved model.
--idtable_path: The path to the "judge id-number" mapping table file used during training.
--step: The step value under which results are written to the TensorBoard log. It can be the same as the number of training steps.
--split: The data split (valid or test) to evaluate on.
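
The "judge id-number" table has to be the one produced during training, presumably so that each judge maps to the same number at test time. The sketch below shows one way such a table could be built, saved, and reloaded. The helper build_idtable, the judge ids, and the JSON file format are all assumptions for illustration; the repository may store the table differently.

```python
import json
import os
import tempfile


def build_idtable(judge_ids):
    # Assign each distinct judge id an integer index, in first-seen order.
    table = {}
    for judge in judge_ids:
        if judge not in table:
            table[judge] = len(table)
    return table


# Hypothetical judge ids as they might appear in the training annotations.
train_judges = ["J07", "J02", "J07", "J11"]
idtable = build_idtable(train_judges)

# Save at training time (JSON is an assumption about the file format).
path = os.path.join(tempfile.mkdtemp(), "idtable.json")
with open(path, "w") as f:
    json.dump(idtable, f)

# Reload at test time so every judge maps to the same index as in training.
with open(path) as f:
    loaded = json.load(f)
```

Rebuilding the table from the test data instead could assign different numbers to the same judges, which is why the saved file is passed in explicitly.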

Inference

After training on the VCC data, the model can be used for inference on other data. The input arguments are --data_path, --model_path, and --save_dir, which have meanings similar to those above. Note that the bias-net is not used, because this code assumes that the ground-truth judge ids are unavailable at inference time.
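
The sketch below illustrates why the bias-net drops out at inference time. It assumes that a judge-level score is modeled as the mean score plus a judge-specific bias. The functions mean_net, bias_net, and predict_score are toy stand-ins written for this example, not the repository's model classes, and the feature values and bias table are made up.

```python
def mean_net(features):
    # Toy stand-in for the mean-net: maps utterance features to a mean score.
    return sum(features) / len(features)


def bias_net(features, judge_id, judge_bias):
    # Toy stand-in for the bias-net: a per-judge offset looked up by id.
    return judge_bias[judge_id]


def predict_score(features, judge_id=None, judge_bias=None):
    """Return the mean score, plus the judge bias only when a judge id is given."""
    score = mean_net(features)
    if judge_id is not None:
        score += bias_net(features, judge_id, judge_bias)
    return score


utterance = [3.0, 4.0, 3.5]        # hypothetical utterance features
judge_bias = {0: 0.25, 1: -0.5}    # hypothetical per-judge offsets

# With a known judge id (as in training), the bias term is applied.
training_style = predict_score(utterance, judge_id=1, judge_bias=judge_bias)

# Without judge ids (as in inference on new data), only the mean score is used.
inference_style = predict_score(utterance)
```

With no judge id available, the prediction reduces to the mean-net output alone.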

The pre-trained model can be found in ./pre_trained.
