Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications

Last update: Jan 02, 2023

Overview

A Python library for audio feature extraction, classification, segmentation and applications

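This document contains general information; the complete documentation is in the library's wiki. For a more general introduction to audio data handling, read the article "Audio Handling Basics: Process Audio Files In Command-Line or Python".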

News

[2020-09-12] Read this hackernoon article titled "Intro to audio analysis" for an intro to theory and practice of audio feature extraction, classification and segmentation.
Special issue in Pattern Recognition in Multimedia Signal Analysis, Deadline 30 November 2020
[2019-11-19] Major lib refactoring. Please report any issues or inconsistencies in the documentation.
Check out paura, a Python script for real-time recording and analysis of audio data
[2018-08-12] pyAudioAnalysis now ported to Python 3

General

pyAudioAnalysis is a Python library covering a wide range of audio analysis tasks. Through pyAudioAnalysis you can:

Extract audio features and representations (e.g. MFCCs, spectrogram, chromagram)
Train, tune the parameters of, and evaluate classifiers of audio segments
Classify unknown sounds
Detect audio events and exclude silence periods from long recordings
Perform supervised segmentation (joint segmentation - classification)
Perform unsupervised segmentation (e.g. speaker diarization) and extract audio thumbnails
Train and use audio regression models (example application: emotion recognition)
Apply dimensionality reduction to visualize audio data and content similarities
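
To make the idea of short-term feature extraction concrete, here is a minimal pure-Python sketch. It is not the library's implementation: the function name `short_term_features`, the 50 ms window, and the 25 ms step are assumptions chosen for illustration. It splits a signal into overlapping frames and computes two simple features per frame, the mean energy and the zero-crossing rate.

```python
import math


def short_term_features(signal, sample_rate, window_sec=0.050, step_sec=0.025):
    """Return a list of (energy, zero_crossing_rate) tuples, one per frame.

    Hypothetical helper for illustration; not part of pyAudioAnalysis.
    """
    window = int(round(window_sec * sample_rate))  # frame length in samples
    step = int(round(step_sec * sample_rate))      # hop between frames in samples
    features = []
    for start in range(0, len(signal) - window + 1, step):
        frame = signal[start:start + window]
        # Mean energy of the frame
        energy = sum(s * s for s in frame) / window
        # Fraction of consecutive sample pairs whose sign differs
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        features.append((energy, crossings / (window - 1)))
    return features


# One second of a synthetic 440 Hz tone sampled at 8 kHz (illustrative input only)
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
frames = short_term_features(tone, sample_rate)
print(len(frames), frames[0])
```

Each frame yields one small feature vector; a classifier is then trained on statistics of such vectors rather than on raw samples.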

Installation

Clone the source of this library: git clone https://github.com/tyiannak/pyAudioAnalysis.git
Install dependencies: pip install -r ./requirements.txt
Install using pip: pip install -e .
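
After the steps above, a quick way to confirm that the package is visible to the current interpreter is to look it up without importing it. This is a small sketch using only the standard library; the helper name `is_installed` is hypothetical, and the module name `pyAudioAnalysis` is taken from the import statement used in this document.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("pyAudioAnalysis"):
    print("pyAudioAnalysis is importable")
else:
    print("pyAudioAnalysis was not found; re-run the installation steps")
```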

An audio classification example

More examples and detailed tutorials can be found in the wiki.

pyAudioAnalysis provides easy-to-call wrappers to execute audio analysis tasks. For example, the code below first trains an audio segment classifier from a set of WAV files stored in folders (each folder representing a different class), and then uses the trained classifier to classify an unknown WAV file:

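from pyAudioAnalysis import audioTrainTest as aT
# Train an SVM on the two class folders (1.0 s mid-term window and step) and save the model as "svmSMtemp"
aT.extract_features_and_train(["classifierData/music", "classifierData/speech"], 1.0, 1.0, aT.shortTermWindow, aT.shortTermStep, "svm", "svmSMtemp", False)
# Classify an unknown WAV file with the saved model
aT.file_classification("data/doremi.wav", "svmSMtemp", "svm")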

Result: (0.0, array([ 0.90156761, 0.09843239]), ['music', 'speech'])
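
The result is a tuple holding the index of the winning class, the per-class probabilities, and the class names. As a sketch of how such a tuple might be turned into a readable label, the snippet below uses a hypothetical helper, `describe_result`, and a plain list standing in for the NumPy array so that it runs without the library installed.

```python
def describe_result(result):
    """Map a (class_id, probabilities, class_names) tuple to (label, probability).

    Hypothetical helper for illustration; not part of pyAudioAnalysis.
    """
    class_id, probabilities, class_names = result
    index = int(class_id)  # the class index is returned as a float, e.g. 0.0
    return class_names[index], float(probabilities[index])


# Values copied from the result shown above; a list stands in for the NumPy array
result = (0.0, [0.90156761, 0.09843239], ["music", "speech"])
label, probability = describe_result(result)
print(f"{label} ({probability:.2f})")  # prints "music (0.90)"
```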

In addition, command-line support is provided for all functionalities. For example, the following command extracts the spectrogram of an audio signal stored in a WAV file:

python audioAnalysis.py fileSpectrogram -i data/doremi.wav
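
When the same command has to be launched from another Python script, one option is to build the argument list and hand it to `subprocess`. The sketch below only assembles the command quoted above; the helper name `spectrogram_command` is hypothetical, and it assumes that `audioAnalysis.py` is in the current working directory.

```python
import sys


def spectrogram_command(wav_path, script="audioAnalysis.py"):
    """Build the argument list for the fileSpectrogram command.

    Hypothetical helper; the script location is an assumption.
    """
    # sys.executable keeps the call on the interpreter that is already running
    return [sys.executable, script, "fileSpectrogram", "-i", wav_path]


command = spectrogram_command("data/doremi.wav")
print(" ".join(command))

# To actually run it (from the directory containing audioAnalysis.py):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when file paths contain spaces.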

Further reading

Audio Handling Basics: Process Audio Files In Command-Line or Python. Covers how to handle audio files from the command line, plus some basic programming for audio signal processing. Start here if you are new to audio.
Intro to Audio Analysis: Recognizing Sounds Using Machine Learning. Goes a bit deeper than the previous article, providing a complete introduction to the theory and practice of audio feature extraction, classification and segmentation (includes many Python examples).
The library's wiki
How to Use Machine Learning to Color Your Lighting Based on Music Mood. An interesting use case that uses this library to train a real-time music mood estimator.

A more general and theoretical description of the adopted methods (along with several experiments on particular use cases) is presented in the publication cited below. Please use the following citation when citing pyAudioAnalysis in your research work:

@article{giannakopoulos2015pyaudioanalysis,
  title={pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis},
  author={Giannakopoulos, Theodoros},
  journal={PloS one},
  volume={10},
  number={12},
  year={2015},
  publisher={Public Library of Science}
}

For Matlab-related audio analysis material check this book.

Author

Theodoros Giannakopoulos, Principal Researcher of Multimodal Machine Learning at the Multimedia Analysis Group of the Computational Intelligence Lab (MagCIL) of the Institute of Informatics and Telecommunications, of the National Center for Scientific Research "Demokritos"

