Unified Pre-training for Self-Supervised Learning and Supervised Learning for ASR

Last update: Jan 09, 2023

UniSpeech

The UniSpeech family:

UniSpeech (ICML 2021): Unified Pre-training for Self-Supervised Learning and Supervised Learning for ASR

UniSpeech-SAT (ICASSP 2022 Submission): Universal Speech Representation Learning with Speaker Aware Pre-Training

Pre-trained models

We strongly suggest using our UniSpeech-SAT model for speaker-related tasks, since it performs strongly on a range of speaker-related benchmarks.

Model	Dataset	Download
UniSpeech Base	1500 hrs CommonVoice	download
UniSpeech Large	1500 hrs CommonVoice	download
UniSpeech-SAT Base	960 hrs LibriSpeech	download
UniSpeech-SAT Base+	60k hrs Libri-Light + 10k hrs GigaSpeech + 24k hrs VoxPopuli	download
UniSpeech-SAT Large	60k hrs Libri-Light + 10k hrs GigaSpeech + 24k hrs VoxPopuli	download

License

This project is licensed under the license found in the LICENSE file in the root directory of this source tree. Portions of the source code are based on the FAIRSEQ project.

Microsoft Open Source Code of Conduct

Contact Information

For help or issues using UniSpeech models, please submit a GitHub issue.

For other communications related to UniSpeech, please contact Yu Wu ([email protected]).


Owner: Microsoft
