An unofficial implementation of the paper "AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss".

Last update: Jun 16, 2022

Overview

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

This is an unofficial implementation of AutoVC, based on the official implementation.

The repository is still under construction, so some details may be missing or incomplete.

Preprocessing

python preprocess.py <data_path> <save_path> <encoder_path> [--seg_len seg] [--n_workers workers]
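
The usage line above takes three required positional arguments and two optional flags. As a rough illustration only, the sketch below shows how such a command line could be parsed with `argparse`. It is not taken from the repository: the integer types for `--seg_len` and `--n_workers`, the `None` defaults, and the example paths (`raw_wavs`, `features`, `encoder.pt`) are all assumptions made for this example, and the real `preprocess.py` may differ.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the usage line; not the repository's code."""
    parser = argparse.ArgumentParser(prog="preprocess.py")
    # Required positional arguments, in the order shown in the usage line.
    parser.add_argument("data_path", type=str)
    parser.add_argument("save_path", type=str)
    parser.add_argument("encoder_path", type=str)
    # Optional flags; the int type and None default are assumptions.
    parser.add_argument("--seg_len", type=int, default=None)
    parser.add_argument("--n_workers", type=int, default=None)
    return parser


# Example invocation with made-up paths and values.
args = build_parser().parse_args(
    ["raw_wavs", "features", "encoder.pt", "--seg_len", "128", "--n_workers", "4"]
)
print(args.data_path, args.save_path, args.encoder_path, args.seg_len, args.n_workers)
```

Passing an explicit argument list to `parse_args` keeps the sketch runnable without a shell; on the command line the same values would follow `python preprocess.py`.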

Training

python train.py <config> <data_path> <save_path> [--n_steps steps] [--save_steps save] [--log_steps log] [--batch_size batch] [--seg_len seg]
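
The step-related flags above are not documented here. Assuming they carry their conventional meaning (`--n_steps` as the total number of optimisation steps, `--log_steps` as the logging interval, `--save_steps` as the checkpoint interval), the interaction can be sketched as follows. The function `plan_events` is hypothetical and exists only for this illustration; the actual `train.py` may schedule logging and saving differently.

```python
def plan_events(n_steps, save_steps, log_steps):
    """Return (step, action) pairs for a run of n_steps optimisation steps.

    Assumed semantics (not confirmed by the repository): log every
    log_steps steps and save a checkpoint every save_steps steps.
    """
    events = []
    for step in range(1, n_steps + 1):
        # One optimisation step on a batch would happen here.
        if step % log_steps == 0:
            events.append((step, "log"))
        if step % save_steps == 0:
            events.append((step, "save"))
    return events


# Small made-up values so the schedule is easy to read.
events = plan_events(n_steps=10, save_steps=5, log_steps=2)
for step, action in events:
    print(step, action)
```

With these made-up values the run would log at steps 2, 4, 6, 8 and 10 and save at steps 5 and 10.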

Reference

Please cite the paper if you find it useful.

@InProceedings{pmlr-v97-qian19c,
  title = {{A}uto{VC}: Zero-Shot Voice Style Transfer with Only Autoencoder Loss},
  author = {Qian, Kaizhi and Zhang, Yang and Chang, Shiyu and Yang, Xuesong and Hasegawa-Johnson, Mark},
  pages = {5210--5219},
  year = {2019},
  editor = {Kamalika Chaudhuri and Ruslan Salakhutdinov},
  volume = {97},
  series = {Proceedings of Machine Learning Research},
  address = {Long Beach, California, USA},
  month = {09--15 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v97/qian19c/qian19c.pdf},
  url = {http://proceedings.mlr.press/v97/qian19c.html}
}

Owner

Chien-yu Huang
