Finetune SSL models for MOS prediction

Overview

Finetune SSL models for MOS prediction

This is code for our paper under review for ICASSP 2022:

"Generalization Ability of MOS Prediction Networks" Erica Cooper, Wen-Chin Huang, Tomoki Toda, Junichi Yamagishi https://arxiv.org/abs/2110.02635

Please cite this preprint if you use this code.

Dependencies:

  • Fairseq toolkit: https://github.com/pytorch/fairseq Make sure you can import fairseq in Python.
  • torch, numpy, scipy, torchaudio
  • I have exported my conda environment for this project to environment.yml
  • You also need to download a pretrained wav2vec2 model checkpoint. These can be obtained here: https://github.com/pytorch/fairseq/tree/main/examples/wav2vec Please choose wav2vec_small.pt, w2v_large_lv_fsh_swbd_cv.pt, or xlsr_53_56k.pt.
  • You also need to have a MOS dataset. Datasets for the MOS prediction challenge will be released once the challenge starts. TODO update with a link.

How to use

  • Modify the paths in mos_fairseq.py to point to your own data and SSL checkpoints.
  • Run python mos_fairseq.py to finetune an SSL model on the data.
  • Modify variables in predict.py to point to your favorite checkpoint.
  • run predict.py to run inference using that checkpoint.

Acknowledgments

This study is supported by JST CREST grants JP- MJCR18A6, JPMJCR20D3, and JPMJCR19A3, and by MEXT KAKENHI grants 21K11951 and 21K19808. Thanks to the organizers of the Blizzard Challenge and Voice Conversion Challenge, and to Zhenhua Ling, Zhihang Xie, and Zhizheng Wu for answering our questions about past challenges. Thanks also to the Fairseq team for making their code and models available.

License

BSD 3-Clause License

Copyright (c) 2021, Yamagishi Laboratory, National Institute of Informatics All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

  • Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Owner
Yamagishi and Echizen Laboratories, National Institute of Informatics
Yamagishi and Echizen Laboratories, National Institute of Informatics, Japan
Yamagishi and Echizen Laboratories, National Institute of Informatics
Tom-the-AI - A compound artificial intelligence software for Linux systems.

Tom the AI (version 0.82) WARNING: This software is not yet ready to use, I'm still setting up the GitHub repository. Should be ready in a few days. T

2 Apr 28, 2022
Hierarchical Few-Shot Generative Models

Hierarchical Few-Shot Generative Models Giorgio Giannone, Ole Winther This repo contains code and experiments for the paper Hierarchical Few-Shot Gene

Giorgio Giannone 6 Dec 12, 2022
Implementation of Self-supervised Graph-level Representation Learning with Local and Global Structure (ICML 2021).

Self-supervised Graph-level Representation Learning with Local and Global Structure Introduction This project is an implementation of ``Self-supervise

MilaGraph 50 Dec 09, 2022
Semi-Supervised Semantic Segmentation with Pixel-Level Contrastive Learning from a Class-wise Memory Bank

This repository provides the official code for replicating experiments from the paper: Semi-Supervised Semantic Segmentation with Pixel-Level Contrast

Iñigo Alonso Ruiz 58 Dec 15, 2022
Towards Improving Embedding Based Models of Social Network Alignment via Pseudo Anchors

PSML paper: Towards Improving Embedding Based Models of Social Network Alignment via Pseudo Anchors PSML_IONE,PSML_ABNE,PSML_DEEPLINK,PSML_SNNA: numpy

13 Nov 27, 2022
Semi-SDP Semi-supervised parser for semantic dependency parsing.

Semi-SDP Semi-supervised parser for semantic dependency parsing. This repo contains the code used for the semi-supervised semantic dependency parser i

12 Sep 17, 2021
Pytorch implementation of forward and inverse Haar Wavelets 2D

Pytorch implementation of forward and inverse Haar Wavelets 2D

Sergei Belousov 9 Oct 30, 2022
Jremesh-tools - Blender addon for quad remeshing

JRemesh Tools Blender 2.8 - 3.x addon for quad remeshing. Currently it is a wrap

Jayanam 89 Dec 30, 2022
We utilize deep reinforcement learning to obtain favorable trajectories for visual-inertial system calibration.

Unified Data Collection for Visual-Inertial Calibration via Deep Reinforcement Learning Update: The lastest code will be updated in this branch. Pleas

ETHZ ASL 27 Dec 29, 2022
A curated list of long-tailed recognition resources.

Awesome Long-tailed Recognition A curated list of long-tailed recognition and related resources. Please feel free to pull requests or open an issue to

Zhiwei ZHANG 542 Jan 01, 2023
Code for the paper Learning the Predictability of the Future

Learning the Predictability of the Future Code from the paper Learning the Predictability of the Future. Website of the project in hyperfuture.cs.colu

Computer Vision Lab at Columbia University 139 Nov 18, 2022
PenguinSpeciesPredictionML - Basic model to predict Penguin species based on beak size and sex.

Penguin Species Prediction (ML) 🐧 👨🏽‍💻 What? 💻 This project is a basic model using sklearn methods to predict Penguin species based on beak size

Tucker Paron 0 Jan 08, 2022
MAVE: : A Product Dataset for Multi-source Attribute Value Extraction

MAVE: : A Product Dataset for Multi-source Attribute Value Extraction The dataset contains 3 million attribute-value annotations across 1257 unique ca

Google Research Datasets 89 Jan 08, 2023
SAMO: Streaming Architecture Mapping Optimisation

SAMO: Streaming Architecture Mapping Optimiser The SAMO framework provides a method of optimising the mapping of a Convolutional Neural Network model

Alexander Montgomerie-Corcoran 20 Dec 10, 2022
DeepHyper: Scalable Asynchronous Neural Architecture and Hyperparameter Search for Deep Neural Networks

What is DeepHyper? DeepHyper is a software package that uses learning, optimization, and parallel computing to automate the design and development of

DeepHyper Team 214 Jan 08, 2023
Open standard for machine learning interoperability

Open Neural Network Exchange (ONNX) is an open ecosystem that empowers AI developers to choose the right tools as their project evolves. ONNX provides

Open Neural Network Exchange 13.9k Dec 30, 2022
Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals.

Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals This repo contains the Pytorch implementation of our paper: Unsupervised Seman

Wouter Van Gansbeke 335 Dec 28, 2022
A annotation of yolov5-5.0

代码版本:0714 commit #4000 $ git clone https://github.com/ultralytics/yolov5 $ cd yolov5 $ git checkout 720aaa65c8873c0d87df09e3c1c14f3581d4ea61 这个代码只是注释版

Laughing 229 Dec 17, 2022
Deep Multi-Magnification Network for multi-class tissue segmentation of whole slide images

Deep Multi-Magnification Network This repository provides training and inference codes for Deep Multi-Magnification Network published here. Deep Multi

Computational Pathology 12 Aug 06, 2022
A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more

Alpha Zero General (any game, any framework!) A simplified, highly flexible, commented and (hopefully) easy to understand implementation of self-play

Surag Nair 3.1k Jan 05, 2023