Persian Kaldi profile for Rhasspy built from open speech data

Last update: Aug 08, 2022

Related tags

Miscellaneous fa_kaldi-rhasspy

Overview

Persian Kaldi Profile

A Rhasspy profile for Persian (fa).

Installation

Get started by first installing Vosk:

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate
pip3 install --upgrade pip
pip3 install --upgrade wheel setuptools

# Install Vosk
pip3 install vosk

Next, download the model and extract it:

wget 'https://github.com/rhasspy/fa_kaldi-rhasspy/releases/download/v1.0/vosk-model-small-fa-rhasspy-0.15.zip'
unzip vosk-model-small-fa-rhasspy-0.15.zip

Finally, run the transcribe.py Python program with the model and an audio file:

python3 transcribe.py vosk-model-small-fa-rhasspy-0.15 welcome.wav

{"result": [{"conf": 1.0, "end": 0.48, "start": 0.06, "word": "خوش"}, {"conf": 1.0, "end": 1.11, "start": 0.48, "word": "آمدید"}], "text": "خوش آمدید"}

For each audio file given to transcribe.py, a line of JSON will be printed in the output with the transcription details.

You might also like...

Service for working with open data of the State Duma of the Russian Federation

Сервис для работы с открытыми данными Госдумы РФ Исходные данные из API Госдумы РФ извлекаются с помощью Apache Nifi и приземляются в хранилище Clickh

2 Feb 14, 2022

Driving lessons made simpler. Custom scheduling API built with Python.

NOTE This is a mirror of a GitLab repository. Dryvo Dryvo is a unique solution for the driving lessons industry. Our aim is to save the teacher’s time

595 Dec 5, 2022

Ikaros is a free financial library built in pure python that can be used to get information for single stocks, generate signals and build prortfolios

64 Sep 28, 2022

This repository contains Python Projects for Beginners as well as for Intermediate Developers built by Contributors.

Python Projects {Open Source} Introduction The repository was built with a tree-like structure in mind, it contains collections of Python Projects. Mo

115 Apr 30, 2022

This is a repository for voting software built using Choice Coin on the Algorand Network.

A repository for voting systems using Choice Coin.

633 Dec 23, 2022

Here, I have discuss the three methods of list reversion. The three methods are built-in method, slicing method and position changing method.

Three-different-method-for-list-reversion Here, I have discuss the three methods of list reversion. The three methods are built-in method, slicing met

4 Sep 24, 2021

Dot Browser is a privacy-conscious web browser with smarts built-in for protection against trackers and advertisments online.

🌍 Take back your privacy with Dot Browser, the privacy-conscious web browser that protects you from being tracked and monitored online.

1k Jan 7, 2023

Built with Python programming language and QT library and Guess the number in three easy, medium and hard rolls

guess-the-numbers Built with Python programming language and QT library and Guess the number in three easy, medium and hard rolls Number guessing game

5 Oct 9, 2021

Built with Python programming language and QT library and Guess the number in three easy, medium and hard rolls

password-generator Built with Python programming language and QT library and Guess the number in three easy, medium and hard rolls Password generator

3 Oct 9, 2021

Comments

PySoundFile failed. Trying audioread instead.
I just tried to run this command: python3 transcribe.py vosk-model-small-fa-rhasspy-0.15 MyFile.mp3

and got this error:

/your/path/.venv/lib/python3.9/site-packages/librosa/util/decorators.py:88: UserWarning: PySoundFile failed. Trying audioread instead. return f(*args, **kwargs)

Thank you so much
opened by GameO7er 1
ModuleNotFoundError: No module named 'librosa'
I got this error when I just did follow your instruction in the Readme.md line by line. So I thought maybe this help others for running the script successfully.

Traceback (most recent call last): File "/home/gameover/Projects/Python/Rhaspy/transcribe.py", line 8, in <module> import librosa ModuleNotFoundError: No module named 'librosa'

Thank you so much.
opened by GameO7er 1
ModuleNotFoundError: No module named 'numpy'
I got this error when I just did follow your instruction in the Readme.md line by line. So I thought maybe this help others for running the script successfully.

Traceback (most recent call last): File "/home/gameover/Projects/Python/Rhaspy/transcribe.py", line 8, in <module> import librosa ModuleNotFoundError: No module named 'numpy'

Thank you so much.
opened by GameO7er 1

Error using recipes

Hello, Thanks for you great work for sharing this useful repo. I tried to use your recipes to train Persian data. In run.sh file, an error ocurred while adapting lm.arpa and creating G.fst:

creating G.fst...
arpa2fst -
LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:94) Reading \data\ section.
LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \2-grams: section.
LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \3-grams: section.
FATAL: FstCompiler: Bad number of columns, source = standard input, line = 28129
ERROR: FstHeader::Read: Bad FST header: standard input

full run.sh output is:

Runtime configuration is: nJobs 12, nDecodeJobs 12. If this is not what you want, edit cmd.sh
Starting at stage 0, train_stage -10

Prepare phoneme data for Kaldi

utils/prepare_lang.sh data/local/dict <unk> data/local/lang data/lang
Checking data/local/dict/silence_phones.txt ...
--> reading data/local/dict/silence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/silence_phones.txt is OK

Checking data/local/dict/optional_silence.txt ...
--> reading data/local/dict/optional_silence.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/optional_silence.txt is OK

Checking data/local/dict/nonsilence_phones.txt ...
--> reading data/local/dict/nonsilence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/nonsilence_phones.txt is OK

Checking disjoint: silence_phones.txt, nonsilence_phones.txt
--> disjoint property is OK.

Checking data/local/dict/lexicon.txt
--> reading data/local/dict/lexicon.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/lexicon.txt is OK

Checking data/local/dict/extra_questions.txt ...
--> reading data/local/dict/extra_questions.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/extra_questions.txt is OK
--> SUCCESS [validating dictionary directory data/local/dict]

**Creating data/local/dict/lexiconp.txt from data/local/dict/lexicon.txt
fstaddselfloops data/lang/phones/wdisambig_phones.int data/lang/phones/wdisambig_words.int
prepare_lang.sh: validating output directory
utils/validate_lang.pl data/lang
Checking existence of separator file
separator file data/lang/subword_separator.txt is empty or does not exist, deal in word case.
Checking data/lang/phones.txt ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang/phones.txt is OK

Checking words.txt: #0 ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang/words.txt is OK

Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
--> silence.txt and nonsilence.txt are disjoint
--> silence.txt and disambig.txt are disjoint
--> disambig.txt and nonsilence.txt are disjoint
--> disjoint property is OK

Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
--> found no unexplainable phones in phones.txt

Checking data/lang/phones/context_indep.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 15 entry/entries in data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.{txt, int, csl} are OK

Checking data/lang/phones/nonsilence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 116 entry/entries in data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.{txt, int, csl} are OK

Checking data/lang/phones/silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 15 entry/entries in data/lang/phones/silence.txt
--> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.{txt, int, csl} are OK

Checking data/lang/phones/optional_silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.{txt, int, csl} are OK

Checking data/lang/phones/disambig.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 14 entry/entries in data/lang/phones/disambig.txt
--> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.{txt, int, csl} are OK

Checking data/lang/phones/roots.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 32 entry/entries in data/lang/phones/roots.txt
--> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt
--> data/lang/phones/roots.{txt, int} are OK

Checking data/lang/phones/sets.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 32 entry/entries in data/lang/phones/sets.txt
--> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt
--> data/lang/phones/sets.{txt, int} are OK

Checking data/lang/phones/extra_questions.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 11 entry/entries in data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.int corresponds to data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.{txt, int} are OK

Checking data/lang/phones/word_boundary.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 131 entry/entries in data/lang/phones/word_boundary.txt
--> data/lang/phones/word_boundary.int corresponds to data/lang/phones/word_boundary.txt
--> data/lang/phones/word_boundary.{txt, int} are OK

Checking optional_silence.txt ...
--> reading data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.txt is OK

Checking disambiguation symbols: #0 and #1
--> data/lang/phones/disambig.txt has "#0" and "#1"
--> data/lang/phones/disambig.txt is OK

Checking topo ...

Checking word_boundary.txt: silence.txt, nonsilence.txt, disambig.txt ...
--> data/lang/phones/word_boundary.txt doesn't include disambiguation symbols
--> data/lang/phones/word_boundary.txt is the union of nonsilence.txt and silence.txt
--> data/lang/phones/word_boundary.txt is OK

Checking word-level disambiguation symbols...
--> data/lang/phones/wdisambig.txt exists (newer prepare_lang.sh)
Checking word_boundary.int and disambig.int
--> generating a 35 word/subword sequence
--> resulting phone sequence from L.fst corresponds to the word sequence
--> L.fst is OK
--> generating a 45 word/subword sequence
--> resulting phone sequence from L_disambig.fst corresponds to the word sequence
--> L_disambig.fst is OK

Checking data/lang/oov.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/oov.txt
--> data/lang/oov.int corresponds to data/lang/oov.txt
--> data/lang/oov.{txt, int} are OK

--> data/lang/L.fst is olabel sorted
--> data/lang/L_disambig.fst is olabel sorted
--> SUCCESS [validating lang directory data/lang]

adapt our LM for kaldi...


creating G.fst...
arpa2fst -
LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:94) Reading \data\ section.
LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \2-grams: section.
LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \3-grams: section.
FATAL: FstCompiler: Bad number of columns, source = standard input, line = 28129
ERROR: FstHeader::Read: Bad FST header: standard input

make mfcc

fix_data_dir.sh: kept all 12394 utterances.
fix_data_dir.sh: old files are kept in data/train/.backup
mkdir: cannot create directory 'data/train/wav.scp': File exists
steps/make_mfcc.sh --cmd utils/run.pl --nj 12 data/train exp/make_mfcc_chain/train mfcc_chain
utils/validate_data_dir.sh: Successfully validated data-directory data/train
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.

can you please help me fix this issue? thanks

opened by MahdiEsrafili 0

Persian Kaldi profile for Rhasspy built from open speech data

Related tags

Overview

Persian Kaldi Profile

Installation

You might also like...

Service for working with open data of the State Duma of the Russian Federation

Driving lessons made simpler. Custom scheduling API built with Python.

Ikaros is a free financial library built in pure python that can be used to get information for single stocks, generate signals and build prortfolios

This repository contains Python Projects for Beginners as well as for Intermediate Developers built by Contributors.

This is a repository for voting software built using Choice Coin on the Algorand Network.

Here, I have discuss the three methods of list reversion. The three methods are built-in method, slicing method and position changing method.

Dot Browser is a privacy-conscious web browser with smarts built-in for protection against trackers and advertisments online.

Built with Python programming language and QT library and Guess the number in three easy, medium and hard rolls

Built with Python programming language and QT library and Guess the number in three easy, medium and hard rolls

Comments

PySoundFile failed. Trying audioread instead.

ModuleNotFoundError: No module named 'librosa'

ModuleNotFoundError: No module named 'numpy'

Error using recipes

Releases(v1.0)

v1.0(Oct 9, 2021)

Owner

Rhasspy

Cute study buddy that helps you study with the Pomodoro technique!

Small scripts to learn about GNOME internals

A small scale relica of bank management system using the MySQL queries in the python language.

Why write code when you can import it directly from GitHub Copilot?

Blender addon, import and update mixamo animation

Mommas-cookbook - A Repository About Mom's Recipes

This repository contains Python games that I've worked on. You'll learn how to create python games with AI. I try to focus on creating board games without GUI in Jupyter-notebook.

LiteX-Acorn-Baseboard is a baseboard developed around the SQRL's Acorn board (or Nite/LiteFury) expanding their possibilities

Syntax highlighting for yarn.lock and bun.lockb files

Um pequeno painel de consulta

用于红队成员初步快速攻击的全自动化工具。

A Python module for decorators, wrappers and monkey patching.

Blender addons - A collection of Blender tools I've written for myself over the years.

In this project, we are going to display the battery notification and the time left for the battery to drain out using the battery capacity value.

A python script made for personal use to monitor for sports card restocks on target.com since they are sold out often

Plugin to manage site, circuit and device diagrams and documents in Netbox

Demodulate and error correct FIS-B and ADS-B signals on 978 MHz.

Checkers Project Built Using Python

用于导出墨墨背单词的词库，并生成适用于 List 背单词，不背单词，欧陆词典等的自定义词库

Structural basis for solubility in protein expression systems