A lightweight yet powerful audio-to-MIDI converter with pitch bend detection

Overview

Basic Pitch Logo

License PyPI - Python Version Supported Platforms

Basic Pitch is a Python library for Automatic Music Transcription (AMT), using lightweight neural network developed by Spotify's Audio Intelligence Lab. It's small, easy-to-use, and pip install-able.

Basic Pitch may be simple, but it's is far from "basic"! basic-pitch is efficient and easy to use, and its multipitch support, its ability to generalize across instruments, and its note accuracy competes with much larger and more resource-hungry AMT systems.

Provide a compatible audio file and basic-pitch will generate a MIDI file, complete with pitch bends. Basic pitch is instrument-agnostic and supports polyphonic instruments, so you can freely enjoy transcription of all your favorite music, no matter what instrument is used. Basic pitch works best on one instrument at a time.

Research Paper

This library was released in conjunction with Spotify's publication at ICASSP 2022. You can read more about this research in the paper, A Lightweight Instrument-Agnostic Model for Polyphonic Note Transcription and Multipitch Estimation.

If you use this library in academic research, consider citing it:

@inproceedings{2022_BittnerBRME_LightweightNoteTranscription_ICASSP,
  author= {Bittner, Rachel M. and Bosch, Juan Jos\'e and Rubinstein, David and Meseguer-Brocal, Gabriel and Ewert, Sebastian},
  title= {A Lightweight Instrument-Agnostic Model for Polyphonic Note Transcription and Multipitch Estimation},
  booktitle= {Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
  address= {Singapore},
  year= 2022,
}

Demo

If, for whatever reason, you're not yet completely inspired, or you're just like so totally over the general vibe and stuff, checkout our snappy demo website, basicpitch.io, to experiment with our model on whatever music audio you provide!

Installation

basic-pitch is available via PyPI. To install the current release:

pip install basic-pitch

To update Basic Pitch to the latest version, add --upgrade to the above command.

Compatible Environments:

  • MacOS, Windows and Ubuntu operating systems
  • Python versions 3.7, 3.8, 3.9

Usage

Model Prediction

Command Line Tool

This library offers a command line tool interface. A basic prediction command will generate and save a MIDI file transcription of audio at the <input-audio-path> to the <output-directory>:

basic-pitch <output-directory> <input-audio-path>

To process more than one audio file at a time:

basic-pitch <output-directory> <input-audio-path-1> <input-audio-path-2> <input-audio-path-3>

Optionally, you may append any of the following flags to your prediction command to save additional formats of the prediction output to the <output-directory>:

  • --sonify-midi to additionally save a .wav audio rendering of the MIDI file
  • --save-model-outputs to additionally save raw model outputs as an NPZ file
  • --save-note-events to additionally save the predicted note events as a CSV file

To discover more parameter control, run:

basic-pitch --help

Programmatic

predict()

Import basic-pitch into your own Python code and run the predict functions directly, providing an <input-audio-path> and returning the model's prediction results:

from basic_pitch.inference import predict
from basic_pitch import ICASSP_2022_MODEL_PATH

model_output, midi_data, note_activations = predict(<input-audio-path>)
  • <minimum-frequency> & <maximum-frequency> (floats) set the maximum and minimum allowed note frequency, in Hz, returned by the model. Pitch events with frequencies outside of this range will be excluded from the prediction results.
  • model_output is the raw model inference output
  • midi_data is the transcribed MIDI data derived from the model_output
  • note_events is a list of note events derived from the model_output

predict() in a loop

To run prediction within a loop, you'll want to load the model yourself and provide predict() with the loaded model object itself to be used for repeated prediction calls, in order to avoid redundant and sluggish model loading.

import tensorflow as tf

from basic_pitch.inference import predict
from basic_pitch import ICASSP_2022_MODEL_PATH

basic_pitch_model = tf.saved_model.load(str(ICASSP_2022_MODEL_PATH))

for x in range():
    ...
    model_output, midi_data, note_activations = predict(
        <loop-x-input-audio-path>,
        basic_pitch_model,
    )
    ...

predict_and_save()

If you would like basic-pitch orchestrate the generation and saving of our various supported output file types, you may use predict_and_save instead of using predict directly:

from basic_pitch.inference import predict_and_save

predict_and_save(
    <input-audio-path-list>,
    <output-directory>,
    <save-midi>,
    <sonify-midi>,
    <save-model-outputs>,
    <save-note-events>,
)

where:

  • <input-audio-path-list> & <output-directory>
    • directory paths for basic-pitch to read from/write to.
  • <save-midi>
    • bool to control generating and saving a MIDI file to the <output-directory>
  • <sonify-midi>
    • bool to control saving a WAV audio rendering of the MIDI file to the <output-directory>
  • <save-model-outputs>
    • bool to control saving the raw model output as a NPZ file to the <output-directory>
  • <save-note-events>
    • bool to control saving predicted note events as a CSV file <output-directory>

Model Input

Supported Audio Codecs

basic-pitch accepts all sound files that are compatible with its version of librosa, including:

  • .mp3
  • .ogg
  • .wav
  • .flac
  • .m4a

Mono Channel Audio Only

While you may use stereo audio as an input to our model, at prediction time, the channels of the input will be down-mixed to mono, and then analyzed and transcribed.

File Size/Audio Length

This model can process any size or length of audio, but processing of larger/longer audio files could be limited by your machine's available disk space. To process these files, we recommend streaming the audio of the file, processing windows of audio at a time.

Sample Rate

Input audio maybe be of any sample rate, however, all audio will be resampled to 22050 Hz before processing.

Contributing

Contributions to basic-pitch are welcomed! See CONTRIBUTING.md for details.

Copyright and License

basic-pitch is Copyright 2022 Spotify AB.

This software is licensed under the Apache License, Version 2.0 (the "Apache License"). You may choose either license to govern your use of this software only upon the condition that you accept all of the terms of either the Apache License.

You may obtain a copy of the Apache License at:

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the Apache License or the GPL License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the Apache License for the specific language governing permissions and limitations under the Apache License.

Generating a structured library of .wav samples with Python.

sample-library Scripts for generating a structured sample library with Python Requires Docker about Samples are written to wave files in lib/. Differe

Ben Mangold 1 Nov 11, 2021
Mousai is a simple application that can identify song like Shazam

Mousai is a simple application that can identify song like Shazam. It saves the artist, album, and title of the identified song in a JSON file.

Dave Patrick 662 Jan 07, 2023
Python library for handling audio datasets.

AUDIOMATE Audiomate is a library for easy access to audio datasets. It provides the datastructures for accessing/loading different datasets in a gener

Matthias 121 Nov 27, 2022
Telegram Voice-Chat Bot Written In Python Using Pyrogram.

Telegram Voice-Chat Bot Telegram Voice-Chat Bot To Play Music From Various Sources In Your Group Support All linux based os. Windows Mac Diagram Requi

TheHamkerCat 314 Dec 29, 2022
Noinoi music is smoothly playing music on voice chat of telegram.

NOINOI MUSIC BOT ✨ Features Music & Video stream support MultiChat support Playlist & Queue support Skip, Pause, Resume, Stop feature Music & Video do

2 Feb 13, 2022
Python interface to the WebRTC Voice Activity Detector

py-webrtcvad This is a python interface to the WebRTC Voice Activity Detector (VAD). It is compatible with Python 2 and Python 3. A VAD classifies a p

John Wiseman 1.5k Dec 22, 2022
A voice assistant which can handle your everyday task and allows you to book items from your favourite store!

Voicely Table of Contents About The Project Built With Getting Started Prerequisites Installation Usage Roadmap Contributing License Contact Acknowled

Awantika Nigam 2 Nov 17, 2021
A python package for calculating the PESQ.

PyPESQ (WIP) Pypesq is a python wrapper for the PESQ score calculation C routine. It only can be used in evaluation purpose. INSTALL pip install https

Jingdong Li 269 Dec 18, 2022
ᴀ ʙᴏᴛ ᴛʜᴀᴛ ᴄᴀɴ ᴘʟᴀʏ ᴍᴜꜱɪᴄ ɪɴ ᴛᴇʟᴇɢʀᴀᴍ ɢʀᴏᴜᴘ ᴏɴ ᴠᴏɪᴄᴇ ᴄᴀʟʟ

GJ516 LOVER'S ııllıllı ♥️ ➤⃝Gᴊ516_ᴍᴜꜱɪᴄ_ʙᴏᴛ ♥️ ıllıllı ᴀ ʙᴏᴛ ᴛʜᴀᴛ ᴄᴀɴ ᴘʟᴀʏ ᴍᴜꜱɪᴄ ɪɴ ᴛᴇʟᴇɢʀᴀᴍ ɢʀᴏᴜᴘ ᴏɴ ᴠᴏɪᴄᴇ ᴄᴀʟʟ Requirements 📝 FFmpeg NodeJS nodesou

1 Nov 22, 2021
A collection of python scripts for extracting and analyzing acoustics from audio files.

pyAcoustics A collection of python scripts for extracting and analyzing acoustics from audio files. Contents 1 Common Use Cases 2 Major revisions 3 Fe

Tim 74 Dec 26, 2022
Vixtify - Python Controlled Music Player

Strumm Sound Playlist : Click me to listen Welcome to GitHub Pages You can use the editor on GitHub to maintain and preview the content for your websi

Vicky Kumar 2 Feb 03, 2022
extract unpack asset file (form unreal engine 4 pak) with extenstion *.uexp which contain awb/acb (cri/cpk like) sound or music resource

Uexp2Awb extract unpack asset file (form unreal engine 4 pak) with extenstion .uexp which contain awb/acb (cri/cpk like) sound or music resource. i ju

max 6 Jun 22, 2022
Nayeli: cool telegram groups vc music project

Nayeli-music Nayeli 🥀 is cool telegram 🍎 groups vc music project 🎋 . Nayeli-music Nayeli Deployment 🎋 📲 Esy deploy 🐾️ Source Owner ♥️ ❄️ He is s

Kasun bandara 2 Dec 20, 2021
A Python wrapper around the Soundcloud API

soundcloud-python A friendly wrapper around the Soundcloud API. Installation To install soundcloud-python, simply: pip install soundcloud Or if you'r

SoundCloud 84 Dec 31, 2022
Bot duniya Music Player

Bot duniya Music Player Requirements 📝 FFmpeg (Latest) NodeJS nodesource.com (NodeJS 17+) Python (3.10+) PyTgCalls (Lastest) 2nd Telegram Account (ne

Aman Vishwakarma 16 Oct 21, 2022
❤️ Hi There Im Cozmo Music Bot A next gen powerful telegram group Music bot for get your Songs and music @Venuja_Sadew

🎵 Cozmo MUSIC 🎵 Cozmo Music is a Music powerfull bot for playing music on telegram voice chat groups. Requirements FFmpeg NodeJS nodesource.com Pyth

Venuja Sadew 3 Jan 08, 2022
A python program to cut longer MP3 files (i.e. recordings of several songs) into the individual tracks.

I'm writing a python script to cut longer MP3 files (i.e. recordings of several songs) into the individual tracks called ReCut. So far there are two

Dönerspiess 1 Oct 27, 2021
A simple voice detection system which can be applied practically for designing a device with capability to detect a baby’s cry and automatically turning on music

Auto-Baby-Cry-Detection-with-Music-Player A simple voice detection system which can be applied practically for designing a device with capability to d

2 Dec 15, 2021
Analyze, visualize and process sound field data recorded by spherical microphone arrays.

Sound Field Analysis toolbox for Python The sound_field_analysis toolbox (short: sfa) is a Python port of the Sound Field Analysis Toolbox (SOFiA) too

Division of Applied Acoustics at Chalmers University of Technology 69 Nov 23, 2022
Open Sound Strip, Sequence or Record in Audacity

Audacity Tools For Blender Sound editing in Blender Video Sequence Editor with Audacity integrated. Send/receive the full edited sequence or single st

64 Dec 31, 2022