Python interface to the WebRTC Voice Activity Detector

Related tags

Audiopy-webrtcvad
Overview
https://travis-ci.org/wiseman/py-webrtcvad.svg?branch=master

py-webrtcvad

This is a python interface to the WebRTC Voice Activity Detector (VAD). It is compatible with Python 2 and Python 3.

A VAD classifies a piece of audio data as being voiced or unvoiced. It can be useful for telephony and speech recognition.

The VAD that Google developed for the WebRTC project is reportedly one of the best available, being fast, modern and free.

How to use it

  1. Install the webrtcvad module:

    pip install webrtcvad
    
  2. Create a Vad object:

    import webrtcvad
    vad = webrtcvad.Vad()
    
  3. Optionally, set its aggressiveness mode, which is an integer between 0 and 3. 0 is the least aggressive about filtering out non-speech, 3 is the most aggressive. (You can also set the mode when you create the VAD, e.g. vad = webrtcvad.Vad(3)):

    vad.set_mode(1)
    
  4. Give it a short segment ("frame") of audio. The WebRTC VAD only accepts 16-bit mono PCM audio, sampled at 8000, 16000, 32000 or 48000 Hz. A frame must be either 10, 20, or 30 ms in duration:

    # Run the VAD on 10 ms of silence. The result should be False.
    sample_rate = 16000
    frame_duration = 10  # ms
    frame = b'\x00\x00' * int(sample_rate * frame_duration / 1000)
    print 'Contains speech: %s' % (vad.is_speech(frame, sample_rate)
    

See example.py for a more detailed example that will process a .wav file, find the voiced segments, and write each one as a separate .wav.

How to run unit tests

To run unit tests:

pip install -e ".[dev]"
python setup.py test

History

2.0.10

Fixed memory leak. Thank you, bond005!

2.0.9

Improved example code. Added WebRTC license.

2.0.8

Fixed Windows compilation errors. Thank you, xiongyihui!
Owner
John Wiseman
John Wiseman
Full LAKH MIDI dataset converted to MuseNet MIDI output format (9 instruments + drums)

LAKH MuseNet MIDI Dataset Full LAKH MIDI dataset converted to MuseNet MIDI output format (9 instruments + drums) Bonus: Choir on Channel 10 Please CC

Alex 6 Nov 20, 2022
Generating a structured library of .wav samples with Python.

sample-library Scripts for generating a structured sample library with Python Requires Docker about Samples are written to wave files in lib/. Differe

Ben Mangold 1 Nov 11, 2021
Simple discord bot by @merive 🤖

Parzibot Powerful and Useful Discord Bot on Python. The source code of the bot is available to everyone. Parzibot uses English language. This is free

merive_ 3 Dec 28, 2022
Suyash More 111 Jan 07, 2023
LibXtract is a simple, portable, lightweight library of audio feature extraction functions.

LibXtract LibXtract is a simple, portable, lightweight library of audio feature extraction functions. The purpose of the library is to provide a relat

Jamie Bullock 215 Nov 16, 2022
SinGlow: Generative Flow for SVS tasks in Tensorflow 2

SinGlow is a part of my Singing voice synthesis system. It can extract features of sound, particularly songs and musics. Then we can use these features (or perfect encoding) for feature migrating tas

Haobo Yang 8 Aug 22, 2022
A python program for visualizing MIDI files, and displaying them in a spiral layout

SpiralMusic_python A python program for visualizing MIDI files, and displaying them in a spiral layout For a hardware version using Teensy & LED displ

Gavin 6 Nov 23, 2022
An 8D music player made to enjoy Halloween this year!🤘

HAPPY HALLOWEEN buddy! Split Player Hello There! Welcome to SplitPlayer... Supposed To Be A 8DPlayer.... You Decide.... It can play the ordinary audio

Akshat Kumar Singh 1 Nov 04, 2021
gentle forced aligner

Gentle Robust yet lenient forced-aligner built on Kaldi. A tool for aligning speech with text. Getting Started There are three ways to install Gentle.

1.2k Dec 30, 2022
Sequencer: Deep LSTM for Image Classification

Sequencer: Deep LSTM for Image Classification Created by Yuki Tatsunami Masato Taki This repository contains implementation for Sequencer. Abstract In

Yuki Tatsunami 111 Dec 16, 2022
Telegram Voice-Chat Bot Written In Python Using Pyrogram.

Telegram Voice-Chat Bot Telegram Voice-Chat Bot To Play Music From Various Sources In Your Group Support All linux based os. Windows Mac Diagram Requi

TheHamkerCat 314 Dec 29, 2022
Use python MIDI to write some simple music

Use Python MIDI to write songs

小宝 1 Nov 19, 2021
Synthesia but open source, made in python and free

PyPiano Synthesia but open source, made in python and free Requirements are in requirements.txt If you struggle with installation of pyaudio, run : pi

DaCapo 11 Nov 06, 2022
Noinoi music is smoothly playing music on voice chat of telegram.

NOINOI MUSIC BOT ✨ Features Music & Video stream support MultiChat support Playlist & Queue support Skip, Pause, Resume, Stop feature Music & Video do

2 Feb 13, 2022
Marsyas - Music Analysis, Retrieval and Synthesis for Audio Signals

Welcome to MARSYAS. MARSYAS is a software framework for rapid prototyping of audio applications, with flexibility and extensibility as primary concer

Marsyas Developers Group 364 Oct 31, 2022
Reading list for research topics in sound event detection

Sound event detection aims at processing the continuous acoustic signal and converting it into symbolic descriptions of the corresponding sound events present at the auditory scene.

Soham 64 Jan 05, 2023
Mina - A Telegram Music Bot 5 mandatory Assistant written in Python using Pyrogram and Py-Tgcalls

Mina - A Telegram Music Bot 5 mandatory Assistant written in Python using Pyrogram and Py-Tgcalls

3 Feb 07, 2022
a library for audio and music analysis

aubio aubio is a library to label music and sounds. It listens to audio signals and attempts to detect events. For instance, when a drum is hit, at wh

aubio 2.9k Dec 30, 2022
Music player - endlessly plays your music

Music player First, if you wonder about what is supposed to be a music player or what makes a music player different from a simple media player, read

Albert Zeyer 482 Dec 19, 2022
A python library for working with praat, textgrids, time aligned audio transcripts, and audio files.

praatIO Questions? Comments? Feedback? A library for working with praat, time aligned audio transcripts, and audio files that comes with batteries inc

Tim 224 Dec 19, 2022