Transformers Wav2Vec2 + Parlance's CTCDecodeTransformers Wav2Vec2 + Parlance's CTCDecode

Overview

🤗 Transformers Wav2Vec2 + Parlance's CTCDecode

Introduction

This repo shows how 🤗 Transformers can be used in combination with Parlance's ctcdecode & KenLM ngram as a simple way to boost word error rate (WER).

Included is a file to create an ngram with KenLM as well as a simple evaluation script to compare the results of using Wav2Vec2 with ctcdecode + KenLM vs. without using any language model.

Note: The scripts are written to be used on GPU. If you want to use a CPU instead, simply remove all .to("cuda") occurances in eval.py.

Installation

In a first step, one should install KenLM. For Ubuntu, it should be enough to follow the installation steps described here. The installed kenlm folder should be move into this repo for ./create_ngram.py to function correctly. Alternatively, one can also link the lmplz binary file to a lmplz bash command to directly run lmplz instead of ./kenlm/build/bin/lmplz.

Next, some Python dependencies should be installed. Assuming PyTorch is installed, it should be sufficient to run pip install -r requirements.txt.

Run evaluation

Create ngram

In a first step on should create a ngram. E.g. for polish the command would be:

./create_ngram.py --language polish --path_to_ngram polish.arpa

After the language model is created, one should open the file. one should add a The file should have a structure which looks more or less as follows:

\data\        
ngram 1=86586
ngram 2=546387
ngram 3=796581           
ngram 4=843999             
ngram 5=850874              
                                                  
\1-grams:
-5.7532206      
   
       0
0       
         -0.06677356                                                                            
-3.4645514      drugi   -0.2088903
...

   

Now it is very important also add a token to the n-gram so that it can be correctly loaded. You can simple copy the line:

0 -0.06677356

and change to . When doing this you should also inclease ngram by 1. The new ngram should look as follows:

\data\
ngram 1=86587
ngram 2=546387
ngram 3=796581
ngram 4=843999
ngram 5=850874

\1-grams:
-5.7532206      
    
        0
0       
          -0.06677356
0            -0.06677356
-3.4645514      drugi   -0.2088903
...

    

Now the ngram can be correctly used with pyctcdecode

Run eval

Having created the ngram, one can run:

./eval.py --language polish --path_to_ngram polish.arpa

To compare Wav2Vec2 + LM vs. Wav2Vec2 + No LM on polish.

Results

==================================================polish==================================================
polish - No LM - | WER: 0.3069742867206763 | CER: 0.06054530156286364 | Time: 32.37423086166382
polish - With LM - | WER: 0.39526828695550076 | CER: 0.17596985266474516 | Time: 62.017329692840576

I didn't obtain any good results even when trying out a variety of different settings for alpha and beta. Sadly there aren't many examples, tutorials or docs on parlance/ctcdecode so it's hard to find the reason for the problem.

Also tried it out for other languages like Portuguese and Spanish, but no luck there either.

Owner
Patrick von Platen
Patrick von Platen
An ActivityWatch watcher to pose questions to the user and record her answers.

aw-watcher-ask An ActivityWatch watcher to pose questions to the user and record her answers. This watcher uses Zenity to present dialog boxes to the

Bernardo Chrispim Baron 33 Dec 03, 2022
Utilize Korean BERT model in sentence-transformers library

ko-sentence-transformers 이 프로젝트는 KoBERT 모델을 sentence-transformers 에서 보다 쉽게 사용하기 위해 만들어졌습니다. Ko-Sentence-BERT-SKTBERT 프로젝트에서는 KoBERT 모델을 sentence-trans

Junghyun 40 Dec 20, 2022
Calibre recipe to convert latest issue of Analyse & Kritik into an ebook

Calibre Recipe für "Analyse & Kritik" Dies ist ein "Recipe" für die Konvertierung der aktuellen Ausgabe der Zeitung Analyse & Kritik in ein Ebook. Es

Henning 3 Jan 04, 2022
PRAnCER is a web platform that enables the rapid annotation of medical terms within clinical notes.

PRAnCER (Platform enabling Rapid Annotation for Clinical Entity Recognition) is a web platform that enables the rapid annotation of medical terms within clinical notes. A user can highlight spans of

Sontag Lab 39 Nov 14, 2022
A simple chatbot based on chatterbot that you can use for anything has basic features

Chatbotium A simple chatbot based on chatterbot that you can use for anything has basic features. I have some errors Read the paragraph below: Known b

Herman 1 Feb 16, 2022
Extracting Summary Knowledge Graphs from Long Documents

GraphSum This repo contains the data and code for the G2G model in the paper: Extracting Summary Knowledge Graphs from Long Documents. The other basel

Zeqiu (Ellen) Wu 10 Oct 21, 2022
PyTorch Implementation of the paper Single Image Texture Translation for Data Augmentation

SITT The repo contains official PyTorch Implementation of the paper Single Image Texture Translation for Data Augmentation. Authors: Boyi Li Yin Cui T

Boyi Li 52 Jan 05, 2023
A Lightweight NLP Data Loader for All Deep Learning Frameworks in Python

LineFlow: Framework-Agnostic NLP Data Loader in Python LineFlow is a simple text dataset loader for NLP deep learning tasks. LineFlow was designed to

TofuNLP 177 Jan 04, 2023
Code for CodeT5: a new code-aware pre-trained encoder-decoder model.

CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation This is the official PyTorch implementation

Salesforce 564 Jan 08, 2023
Summarization module based on KoBART

KoBART-summarization Install KoBART pip install git+https://github.com/SKT-AI/KoBART#egg=kobart Requirements pytorch==1.7.0 transformers==4.0.0 pytor

seujung hwan, Jung 148 Dec 28, 2022
Need: Image Search With Python

Need: Image Search The problem is that a user needs to search for a specific ima

Surya Komandooru 1 Dec 30, 2021
A simple word search made in python

Word Search Puzzle A simple word search made in python Usage $ python3 main.py -h usage: main.py [-h] [-c] [-f FILE] Generates a word s

Magoninho 16 Mar 10, 2022
Sinkhorn Transformer - Practical implementation of Sparse Sinkhorn Attention

Sinkhorn Transformer This is a reproduction of the work outlined in Sparse Sinkhorn Attention, with additional enhancements. It includes a parameteriz

Phil Wang 217 Nov 25, 2022
Gold standard corpus annotated with verb-preverb connections for Hungarian.

Hungarian Preverb Corpus A gold standard corpus manually annotated with verb-preverb connections for Hungarian. corpus The corpus consist of the follo

RIL Lexical Knowledge Representation Research Group 3 Jan 27, 2022
Official code of our work, Unified Pre-training for Program Understanding and Generation [NAACL 2021].

PLBART Code pre-release of our work, Unified Pre-training for Program Understanding and Generation accepted at NAACL 2021. Note. A detailed documentat

Wasi Ahmad 138 Dec 30, 2022
hashily is a Python module that provides a variety of text decoding and encoding operations.

hashily is a python module that performs a variety of text decoding and encoding functions. It also various functions for encrypting and decrypting text using various ciphers.

DevMysT 5 Jul 17, 2022
🤗 The largest hub of ready-to-use NLP datasets for ML models with fast, easy-to-use and efficient data manipulation tools

🤗 The largest hub of ready-to-use NLP datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Hugging Face 15k Jan 02, 2023
Sequence-to-Sequence Framework in PyTorch

nmtpytorch allows training of various end-to-end neural architectures including but not limited to neural machine translation, image captioning and au

LIUM 395 Nov 21, 2022
Sapiens is a human antibody language model based on BERT.

Sapiens: Human antibody language model ____ _ / ___| __ _ _ __ (_) ___ _ __ ___ \___ \ / _` | '_ \| |/ _ \ '

Merck Sharp & Dohme Corp. a subsidiary of Merck & Co., Inc. 13 Nov 20, 2022
Yomichad - a Japanese pop-up dictionary that can display readings and English definitions of Japanese words

Yomichad is a Japanese pop-up dictionary that can display readings and English definitions of Japanese words, kanji, and optionally named entities. It is similar to yomichan, 10ten, and rikaikun in s

Jonas Belouadi 7 Nov 07, 2022