Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding

Last update: Sep 21, 2022

Related tags

Overview

Wav2Vec2CTC With KenLM

Using KenLM ARPA language model with beam search to decode audio files and show the most probable transcription.

Assuming you've already installed HuggingFace transformers library, you need also to install the ctcdecode library:

git clone --recursive https://github.com/parlance/ctcdecode.git
cd ctcdecode && pip install .

Then, you need to change the language model path from inside the script wav2vec2_kenlm.py:

lm_path = "YOUR ARPA LANGUAGE MODEL PATH"

You may download a pretrained ARPA English Language model from this link.

Finally, run the script and see the result:

python wav2vec2_kenlm.py

Acknowledgments

This project uses the functionalities of different open-source projects that are mentioned below.

Owner

farisalasmary

GitHub Repository

내부 작업용 django + vue(vuetify) boilerplate. 짠 하면 돌아감.

Pocket Galaxy 아주 간단한 개인용, 혹은 내부용 툴을 만들어야하는데 이왕이면 웹이 편하죠? 그럴때를 위해 만들어둔 django와 vue(vuetify)로 이뤄진 boilerplate 입니다. 각 폴더에 있는 설명서대로 실행을 시키면 일단 당장 뭔가가 돌아갑니

16 Dec 03, 2021

voice2json is a collection of command-line tools for offline speech/intent recognition on Linux

Command-line tools for speech and intent recognition on Linux

988 Jan 04, 2023

Official PyTorch implementation of SegFormer

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers Figure 1: Performance of SegFormer-B0 to SegFormer-B5. Project page

1.4k Dec 29, 2022

NeuralQA: A Usable Library for Question Answering on Large Datasets with BERT

NeuralQA: A Usable Library for (Extractive) Question Answering on Large Datasets with BERT Still in alpha, lots of changes anticipated. View demo on n

220 Dec 11, 2022

A high-level yet extensible library for fast language model tuning via automatic prompt search

ruPrompts ruPrompts is a high-level yet extensible library for fast language model tuning via automatic prompt search, featuring integration with Hugg

37 Dec 07, 2022

Fast, DB Backed pretrained word embeddings for natural language processing.

Embeddings Embeddings is a python package that provides pretrained word embeddings for natural language processing and machine learning. Instead of lo

212 Nov 21, 2022

TextFlint is a multilingual robustness evaluation platform for natural language processing tasks,

TextFlint is a multilingual robustness evaluation platform for natural language processing tasks, which unifies general text transformation, task-specific transformation, adversarial attack, sub-popu

587 Dec 20, 2022

Installation, test and evaluation of Scribosermo speech-to-text engine

Scribosermo STT Setup Scribosermo is a LGPL licensed, open-source speech recognition engine to "Train fast Speech-to-Text networks in different langua

3 Jun 20, 2022

SummerTime - Text Summarization Toolkit for Non-experts

A library to help users choose appropriate summarization tools based on their specific tasks or needs. Includes models, evaluation metrics, and datasets.

213 Jan 04, 2023

Traditional Chinese Text Recognition Dataset: Synthetic Dataset and Labeled Data

Traditional Chinese Text Recognition Dataset: Synthetic Dataset and Labeled Data Authors: Yi-Chang Chen, Yu-Chuan Chang, Yen-Cheng Chang and Yi-Ren Ye

5 Dec 15, 2022

Code Generation using a large neural network called GPT-J

CodeGenX is a Code Generation system powered by Artificial Intelligence! It is delivered to you in the form of a Visual Studio Code Extension and is Free and Open-source!

389 Dec 31, 2022

Bnagla hand written document digiiztion

Bnagla hand written document digiiztion This repo addresses the problem of digiizing hand written documents in Bangla. Documents have definite fields

1 Dec 10, 2021

Persian Bert For Long-Range Sequences

ParsBigBird: Persian Bert For Long-Range Sequences The Bert and ParsBert algorithms can handle texts with token lengths of up to 512, however, many ta

63 Dec 14, 2022

CrossNER: Evaluating Cross-Domain Named Entity Recognition (AAAI-2021)

CrossNER is a fully-labeled collected of named entity recognition (NER) data spanning over five diverse domains (Politics, Natural Science, Music, Literature, and Artificial Intelligence) with specia

89 Nov 10, 2022

本插件是pcrjjc插件的重置版，可以独立于后端api运行

pcrjjc2 本插件是pcrjjc重置版，不需要使用其他后端api，但是需要自行配置客户端本项目基于AGPL v3协议开源，由于项目特殊性，禁止基于本项目的任何商业行为配置方法环境需求：.net framework 4.5及以上 jre8 别忘了装jre8 别忘了装jre8 别忘了装jre8

132 Dec 26, 2022

Exploring dimension-reduced embeddings

sleepwalk Exploring dimension-reduced embeddings This is the code repository. See here for the Sleepwalk web page. License and disclaimer This program

91 Nov 29, 2022

🍊 PAUSE (Positive and Annealed Unlabeled Sentence Embedding), accepted by EMNLP'2021 🌴

PAUSE: Positive and Annealed Unlabeled Sentence Embedding Sentence embedding refers to a set of effective and versatile techniques for converting raw

21 Dec 15, 2022

ETM - R package for Topic Modelling in Embedding Spaces

ETM - R package for Topic Modelling in Embedding Spaces This repository contains an R package called topicmodels.etm which is an implementation of ETM

37 Nov 06, 2022

Reproducing the Linear Multihead Attention introduced in Linformer paper (Linformer: Self-Attention with Linear Complexity)

Linear Multihead Attention (Linformer) PyTorch Implementation of reproducing the Linear Multihead Attention introduced in Linformer paper (Linformer:

58 Dec 23, 2022

SimCTG - A Contrastive Framework for Neural Text Generation

A Contrastive Framework for Neural Text Generation Authors: Yixuan Su, Tian Lan,

345 Jan 03, 2023