MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling

Related tags

Audiomidi-ddsp
Overview
logo

MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling

Demos | Blog Post | Colab Notebook | Paper | Hugging Face Spaces

MIDI-DDSP is a hierarchical audio generation model for synthesizing MIDI expanded from DDSP.

Links

Install MIDI-DDSP

You could install MIDI-DDSP via pip, which allows you to use the cool Command-line MIDI synthesis to synthesize your MIDI.

To install MIDI-DDSP via pip, simply run:

pip install midi-ddsp

Train MIDI-DDSP

To train MIDI-DDSP, please first install midi-ddsp and clone the MIDI-DDSP repository:

git clone https://github.com/magenta/midi-ddsp.git

For dataset, please download the tfrecord files for the URMP dataset in here to the data folder in your cloned repository using the following commands:

cd midi-ddsp # enter the project directory
mkdir ./data # create a data folder
gsutil cp gs://magentadata/datasets/urmp/urmp_20210324/* ./data # download tfrecords to directory

Please check here for how to install and use gsutil.

Finally, you can run the script train_midi_ddsp.sh to train the exact same model we used in the paper:

sh ./train_midi_ddsp.sh

The current codebase does not support training with arbitrary dataset, but we will hopefully update that in the near future.

Side note:

If one download the dataset to a different location, please change the data_dir parameter in train_midi_ddsp.sh.

The training of MIDI-DDSP takes approximately 18 hours on a single RTX 8000. The training code for now does not support multi-GPU training. We recommend using a GPU with more than 24G of memory when training Synthesis Generator in batch size of 16. For a GPU with less memory, please consider using a smaller batch size and change the batch size in train_midi_ddsp.sh.

Try to play with MIDI-DDSP yourself!

Please try out MIDI-DDSP in Colab notebooks!

In this notebook, you will try to use MIDI-DDSP to synthesis a monophonic MIDI file, adjust note expressions, make pitch bend by adjusting synthesis parameters, and synthesize quartet from Bach chorales.

We have trained MIDI-DDSP on the URMP dataset which support synthesizing 13 instruments: violin, viola, cello, double bass, flute, oboe, clarinet, saxophone, bassoon, trumpet, horn, trombone, tuba. You could find how to download and use our pre-trained model below:

Command-line MIDI synthesis

On can use the MIDI-DDSP as a command-line MIDI synthesizer just like FluidSynth.

To use command-line synthesis to synthesize a midi file, please first download the model weights by running:

midi_ddsp_download_model_weights

To synthesize a midi file simply run the following command:

midi_ddsp_synthesize --midi_path <path-to-midi>

For a starter, you can try to synthesize the example midi file in this repository:

midi_ddsp_synthesize --midi_path ./midi_example/ode_to_joy.mid

The command line also enables synthesize a folder of midi files. For more advance use (synthesize a folder, using FluidSynth for instruments not supported, etc.), please see synthesize_midi.py --help.

If you have a trouble downloading the model weights, please manually download from here, and specify the synthesis_generator_weight_path and expression_generator_weight_path by yourself when using the command line. You can also specify your other model weights if you want to use your own trained model.

Python Usage

After installing midi-ddsp, you could import midi-ddsp in python and synthesize MIDI in your code.

Minimal Example

Here is a simple example to use MIDI-DDSP to synthesize a midi file:

from midi_ddsp import synthesize_midi, load_pretrained_model

midi_file = 'ode_to_joy.mid'
# Load pre-trained model
synthesis_generator, expression_generator = load_pretrained_model()
# Synthesize MIDI
output = synthesize_midi(synthesis_generator, expression_generator, midi_file)
# The synthesized audio
synthesized_audio = output['mix_audio']

Advance Usage

Here is an advance example to synthesize the ode_to_joy.mid, change the note expression controls, and adjust the synthesis parameters:

import numpy as np
import tensorflow as tf
from midi_ddsp.utils.midi_synthesis_utils import synthesize_mono_midi, conditioning_df_to_audio
from midi_ddsp.utils.inference_utils import get_process_group
from midi_ddsp.midi_ddsp_synthesize import load_pretrained_model
from midi_ddsp.data_handling.instrument_name_utils import INST_NAME_TO_ID_DICT

# -----MIDI Synthesis-----
midi_file = 'ode_to_joy.mid'
# Load pre-trained model
synthesis_generator, expression_generator = load_pretrained_model()
# Synthesize with violin:
instrument_name = 'violin'
instrument_id = INST_NAME_TO_ID_DICT[instrument_name]
# Run model prediction
midi_audio, midi_control_params, midi_synth_params, conditioning_df = synthesize_mono_midi(synthesis_generator,
                                                                                           expression_generator,
                                                                                           midi_file, instrument_id,
                                                                                           output_dir=None)

synthesized_audio = midi_audio  # The synthesized audio

# -----Adjust note expression controls and re-synthesize-----

# Make all notes weak vibrato:
conditioning_df_changed = conditioning_df.copy()
note_vibrato = conditioning_df_changed['vibrato_extend'].value
conditioning_df_changed['vibrato_extend'] = np.ones_like(conditioning_df['vibrato_extend'].values) * 0.1
# Re-synthesize
midi_audio_changed, midi_control_params_changed, midi_synth_params_changed = conditioning_df_to_audio(
  synthesis_generator, conditioning_df_changed, tf.constant([instrument_id]))

synthesized_audio_changed = midi_audio_changed  # The synthesized audio

# There are 6 note expression controls in conditioning_df that you could change:
# 'amplitude_mean', 'amplitude_std', 'vibrato_extend', 'brightness', 'attack_level', 'amplitudes_max_pos'.
# Please refer to https://colab.research.google.com/github/magenta/midi-ddsp/blob/main/midi_ddsp/colab/MIDI_DDSP_Demo.ipynb#scrollTo=XfPPrdPu5sSy for the effect of each control. 

# -----Adjust synthesis parameters and re-synthesize-----

# The original synthesis parameters:
f0_ori = midi_synth_params['f0_hz']
amps_ori = midi_synth_params['amplitudes']
noise_ori = midi_synth_params['noise_magnitudes']
hd_ori = midi_synth_params['harmonic_distribution']

# TODO: make your change of the synthesis parameters here:
f0_changed = f0_ori
amps_changed = amps_ori
noise_changed = noise_ori
hd_changed = hd_ori

# Resynthesis the audio using DDSP
processor_group = get_process_group(midi_synth_params['amplitudes'].shape[1], use_angular_cumsum=True)
midi_audio_changed = processor_group({'amplitudes': amps_changed,
                                      'harmonic_distribution': hd_changed,
                                      'noise_magnitudes': noise_changed,
                                      'f0_hz': f0_changed, },
                                     verbose=False)
midi_audio_changed = synthesis_generator.reverb_module(midi_audio_changed, reverb_number=instrument_id, training=False)

synthesized_audio_changed = midi_audio_changed  # The synthesized audio
Comments
  • ImportError and AttributeError

    ImportError and AttributeError

    Hi! Very interesting work! I’m trying to run MIDI_DDSP_Demo.ipynb, and I encountered some errors.

    1. ImportError: cannot import name 'LD_RANGE' occurred in from ddsp.spectral_ops import F0_RANGE, LD_RANGE(see here). -> According to DDSP, I think 'DB_RANGE' is correct, not 'LD_RANGE' .

    2. AttributeError: module 'ddsp.spectral_ops' has no attribute 'amplitude_to_db' occurred where ddsp.spectral_ops.amplitude_to_db is used (see here). -> According to DDSP, I suppose it is 'ddsp.core', not 'ddsp.spectral_ops'.

    opened by MasayaKawamura 3
  • Question on the installation

    Question on the installation

    Hi,

    When I opened up a new colab notebook and tried to pip install the midi-ddsp package, it took over 40 minutes and the installation can not be completed. I didn't experience this until this week.

    I've tried pip install midi-ddsp and pip install git+https://github.com/magenta/midi-ddsp and both gave me the same results.

    It seemed like pip would spend a lot of time trying to find which version of etils was compatible.

    opened by tiianhk 2
  • Error when using

    Error when using "pip install midi-ddsp"

    Hello everyone,

    I'm trying to use midi-ddsp to synthesize a few .midi files. In order to achieve it, I create a virtual environment with:

    python3 -m venv .venv source .venv/bin/activate

    note: python version == 3.10.6

    After creating it I run: "pip install midi-ddsp" . Getting this error message:

    Collecting ddsp Using cached ddsp-1.9.0-py2.py3-none-any.whl (200 kB) Using cached ddsp-1.7.1-py2.py3-none-any.whl (199 kB) Using cached ddsp-1.7.0-py2.py3-none-any.whl (197 kB) Using cached ddsp-1.6.5-py2.py3-none-any.whl (194 kB) Using cached ddsp-1.6.3-py2.py3-none-any.whl (194 kB) Using cached ddsp-1.6.2-py2.py3-none-any.whl (194 kB) Using cached ddsp-1.6.0-py2.py3-none-any.whl (194 kB) Using cached ddsp-1.4.0-py2.py3-none-any.whl (192 kB) Using cached ddsp-1.3.1-py2.py3-none-any.whl (192 kB) Using cached ddsp-1.3.0-py2.py3-none-any.whl (183 kB) Using cached ddsp-1.2.0-py2.py3-none-any.whl (179 kB) Using cached ddsp-1.1.0-py2.py3-none-any.whl (175 kB) Using cached ddsp-1.0.1-py2.py3-none-any.whl (170 kB) Using cached ddsp-1.0.0-py2.py3-none-any.whl (168 kB) Using cached ddsp-0.14.0-py2.py3-none-any.whl (143 kB) Using cached ddsp-0.13.1-py2.py3-none-any.whl (129 kB) Using cached ddsp-0.13.0-py2.py3-none-any.whl (129 kB) Using cached ddsp-0.12.0-py2.py3-none-any.whl (127 kB) Using cached ddsp-0.10.0-py2.py3-none-any.whl (109 kB) Using cached ddsp-0.9.0-py2.py3-none-any.whl (109 kB) Using cached ddsp-0.8.0-py2.py3-none-any.whl (108 kB) Using cached ddsp-0.7.0-py2.py3-none-any.whl (107 kB) Using cached ddsp-0.5.1-py2.py3-none-any.whl (101 kB) Using cached ddsp-0.5.0-py2.py3-none-any.whl (101 kB) Using cached ddsp-0.4.0-py2.py3-none-any.whl (97 kB) Using cached ddsp-0.2.4-py2.py3-none-any.whl (89 kB) Using cached ddsp-0.2.3-py2.py3-none-any.whl (89 kB) Using cached ddsp-0.2.2-py2.py3-none-any.whl (89 kB) Using cached ddsp-0.2.0-py2.py3-none-any.whl (88 kB) Using cached ddsp-0.1.0-py3-none-any.whl (88 kB) Using cached ddsp-0.0.10-py3-none-any.whl (88 kB) Using cached ddsp-0.0.9-py3-none-any.whl (86 kB) Using cached ddsp-0.0.8-py3-none-any.whl (86 kB) Using cached ddsp-0.0.7-py3-none-any.whl (85 kB) Using cached ddsp-0.0.6-py2.py3-none-any.whl (91 kB) Using cached ddsp-0.0.5-py2.py3-none-any.whl (91 kB) Using cached ddsp-0.0.4-py2.py3-none-any.whl (83 kB) Using cached ddsp-0.0.3-py2.py3-none-any.whl (81 kB) Using cached ddsp-0.0.1-py2.py3-none-any.whl (75 kB) Using cached ddsp-0.0.0-py2.py3-none-any.whl (75 kB) INFO: pip is looking at multiple versions of midi-ddsp to determine which version is compatible with other requirements. This could take a while. Collecting midi-ddsp Using cached midi_ddsp-0.1.3-py3-none-any.whl (56 kB) Using cached midi_ddsp-0.1.1-py3-none-any.whl (56 kB) Using cached midi_ddsp-0.1.0-py3-none-any.whl (53 kB) ERROR: Cannot install midi-ddsp because these package versions have conflicting dependencies.

    The conflict is caused by: ddsp 3.4.4 depends on tensorflow ddsp 3.4.3 depends on tensorflow ddsp 3.4.1 depends on tensorflow ddsp 3.4.0 depends on tensorflow ddsp 3.3.6 depends on tensorflow ddsp 3.3.4 depends on tensorflow ddsp 3.3.2 depends on tensorflow ddsp 3.3.0 depends on tensorflow ddsp 3.2.1 depends on tensorflow ddsp 3.2.0 depends on tensorflow ddsp 3.1.0 depends on tensorflow ddsp 1.9.0 depends on tensorflow ddsp 1.7.1 depends on tensorflow ddsp 1.7.0 depends on tensorflow ddsp 1.6.5 depends on tensorflow ddsp 1.6.3 depends on tensorflow ddsp 1.6.2 depends on tensorflow ddsp 1.6.0 depends on tensorflow ddsp 1.4.0 depends on tensorflow ddsp 1.3.1 depends on tensorflow ddsp 1.3.0 depends on tensorflow ddsp 1.2.0 depends on tensorflow ddsp 1.1.0 depends on tensorflow ddsp 1.0.1 depends on tensorflow ddsp 1.0.0 depends on tensorflow ddsp 0.14.0 depends on tensorflow ddsp 0.13.1 depends on tensorflow ddsp 0.13.0 depends on tensorflow ddsp 0.12.0 depends on tensorflow ddsp 0.10.0 depends on tensorflow ddsp 0.9.0 depends on tensorflow ddsp 0.8.0 depends on tensorflow ddsp 0.7.0 depends on tensorflow ddsp 0.5.1 depends on tensorflow ddsp 0.5.0 depends on tensorflow ddsp 0.4.0 depends on tensorflow ddsp 0.2.4 depends on tensorflow ddsp 0.2.3 depends on tensorflow ddsp 0.2.2 depends on tensorflow ddsp 0.2.0 depends on tensorflow ddsp 0.1.0 depends on tensorflow ddsp 0.0.10 depends on tensorflow ddsp 0.0.9 depends on tensorflow ddsp 0.0.8 depends on tensorflow ddsp 0.0.7 depends on tensorflow ddsp 0.0.6 depends on tensorflow ddsp 0.0.5 depends on tensorflow ddsp 0.0.4 depends on tensorflow ddsp 0.0.3 depends on tensorflow ddsp 0.0.1 depends on tensorflow ddsp 0.0.0 depends on tensorflow>=2.1.0

    To fix this you could try to:

    1. loosen the range of package versions you've specified
    2. remove package versions to allow pip attempt to solve the dependency conflict

    ERROR: Resolution Impossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

    ======== I am currently working from an M1 mac so I would be pleased if you could help me reaching a solution for this problem.

    Thank you in advance, Juan Carlos

    opened by JuanCarlosMartinezSevilla 2
  • Why is vibrato_rate not used?

    Why is vibrato_rate not used?

    In my opinion, vibrato_rate (peak frequency) is more plausible than vibrato_extend (peak amplitude) to represent pitch pulsating. Why is vibrato_rate not used?

    opened by bfs18 2
  • Docker image with all the necessary packages

    Docker image with all the necessary packages

    Hi again!

    I'm still trying to execute your code with the "pip install midi-ddsp".

    I'm using docker from a tensorflow/tensorflow:latest image and running the pip command. I get this error:

    "E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered"

    I ask you guys in order to know if you work with docker... if I could be able to have access to the image you use so all versions are the same and to avoid these type of errors.

    Thank you, Best regards, Juan Carlos

    opened by JuanCarlosMartinezSevilla 0
  • How to distinguish whether it is cresc or decresc in fluctuation?

    How to distinguish whether it is cresc or decresc in fluctuation?

    Hi! I am interested in your project. Currently, I am doing similar stuff, but i have some questions. (1) I know that Fluctuation stands for how much does that note cresc or decresc. Peak means which position got the maximum energy. However, i am wondering how to know if that note is cresc or decresc. Let's assume our peak position is 0.4, and fluc is 0.3. Then, is it decrease 0.3 or increase 0.3? How to distinguish it? (2) I know that all the expressive control values are normalized between 0 and 1, but what's the unit measure in these expressive controls?

    Thanks for answering

    opened by pillow8781 2
  • What is ``input`` in the def call()?

    What is ``input`` in the def call()?

    Hi, I am looking inside the code. I've seen a lot of methods about def call(self, inputs) in your code, especially looking at this one.

      def call(self, inputs):
        synth_params = self.get_synth_params(inputs)
    

    However, I couldn't find out what's the calculation of inputs, there are some clues I've found. In those codes, inputs is respond to the data in get_fake_data_synthesis_generator, then what are the data and units you input to get_fake_data_synthesis_generator? Frames? Amplitude or anything else?

    Thanks!

    opened by Megan8821 4
  • How to solve the error when installing midi-ddsp?

    How to solve the error when installing midi-ddsp?

    Hi, I am new in training model, and I got some problems in here. Does anyone how to solve the error of error: subprocess-exited-with -error and error: metadata-generation-failed in the terminal?

    opened by Megan8821 4
  • tfrecord features

    tfrecord features

    Hello, I'm very interested in your amazing work. Took a deep look at the urmp tfrecord datasets, I found something a bit confusing for me. Could you be so kind to help?

    1. I checked some urmp_tfrecords data, I found that some of them contain the following features: {"audio", "f0_confidence", "f0_hz", "f0_time", "id", "instrument_id", "loudness_db", "note_active_frame_indices", "note_active_velocities", "note_offsets", "note_onsets", "orig_f0_hz", "orig_f0_time", "power_db", "recording_id", "sequence"}. However, some of them don't include {"orig_f0_hz", "orig_f0_time"} in their tfrecord data. Why is this so and does such an inconsistency influence the model training?
    2. I want to include piano music when I train my own model. To this end, I think I need to generate tfrecords that have the same content as the urmp ones you used in your model. I plan to use maestro dataset. Could you be so kind to indicate if there's a tfrecord data generation code that we can take as a reference? Like the one you used to generate tfrecords for the midi-ddsp model?
    3. What is the difference between "batched" and "unbatched" dataset?

    Thank you very much for your help in advance.

    opened by gladys0313 1
  • Can i test with custom data?

    Can i test with custom data?

    Hi, i am interested in this exciting project and i am trying to test this with our custom dataset and reproduce the format of original data. But there are some difficulties and questions below.

    1. Is there no way to use custom datasets at all?
    1. Is there any code to calculate elements of dataset below?
    • I want to know how to get "note_active_velocities", "note_active_frame_indices", "power_db", "note_onsets", "note_offsets" but there is no any code on repository.

    Thank you for reading!

    opened by 589hero 6
  • Question on Figure

    Question on Figure

    image

    quick question on this figure in the blog post: i know coconet is its own model that will generate subsequent melodies given the input midi file. however, should i decide to train midi ddsp, will the training of coconet also be a part of this? or should i expect a monophonic midi melody as input and the generated audio as output.

    thanks for all the help and this awesome project

    opened by theadamsabra 7
Releases(v0.2.5)
Owner
Magenta
An open source research project exploring the role of machine learning as a tool in the creative process.
Magenta
Python tools for the corpus analysis of popular music.

CATCHY Corpus Analysis Tools for Computational Hook discovery Python tools for the corpus analysis of popular music recordings. The tools can be used

Jan VB 20 Aug 20, 2022
Python CD-DA ripper preferring accuracy over speed

Whipper Whipper is a Python 3 (3.6+) CD-DA ripper based on the morituri project (CDDA ripper for *nix systems aiming for accuracy over speed). It star

671 Jan 04, 2023
Extract the songs from your osu! libary into proper mp3 form, complete with metadata and album art!

osu-Extract Extract the songs from your osu! libary into proper mp3 form, complete with metadata and album art! Requirements python3 mutagen pillow Us

William Carter 2 Mar 09, 2022
Code to work with wave files!

Code to work with wave files!

Mohammad Dori 3 Jul 15, 2022
Supysonic is a Python implementation of the Subsonic server API.

Supysonic Supysonic is a Python implementation of the Subsonic server API. Current supported features are: browsing (by folders or tags) streaming of

Alban 228 Nov 19, 2022
Audio processor to map oracle notes in the VoG raid in Destiny 2 to call outs.

vog_oracles Audio processor to map oracle notes in the VoG raid in Destiny 2 to call outs. Huge thanks to mzucker on GitHub for the note detection cod

19 Sep 29, 2022
Audio features extraction

Yaafe Yet Another Audio Feature Extractor Build status Branch master : Branch dev : Anaconda : Install Conda Yaafe can be easily install with conda. T

Yaafe 231 Dec 26, 2022
Mopidy is an extensible music server written in Python

Mopidy Mopidy is an extensible music server written in Python. Mopidy plays music from local disk, Spotify, SoundCloud, Google Play Music, and more. Y

Mopidy 7.6k Jan 05, 2023
:notes: Cross-platform music player

Exaile Exaile is a music player with a simple interface and powerful music management capabilities. Features include automatic fetching of album art,

Exaile 327 Dec 19, 2022
Python library for audio and music analysis

librosa A python package for music and audio analysis. Documentation See https://librosa.org/doc/ for a complete reference manual and introductory tut

librosa 5.6k Jan 06, 2023
Frescobaldi LilyPond Editor

README for Frescobaldi Homepage: http://www.frescobaldi.org/ Main author: Wilbert Berendsen Frescobaldi is a LilyPond sheet music text editor. It aims

Frescobaldi 600 Dec 29, 2022
Code for csig audio deepfake detection

FMFCC Audio Deepfake Detection Solution This repo provides an solution for the 多媒体伪造取证大赛. Our solution achieve the 1st in the Audio Deepfake Detection

BokingChen 9 Jun 04, 2022
An 8D music player made to enjoy Halloween this year!🤘

HAPPY HALLOWEEN buddy! Split Player Hello There! Welcome to SplitPlayer... Supposed To Be A 8DPlayer.... You Decide.... It can play the ordinary audio

Akshat Kumar Singh 1 Nov 04, 2021
Simple, hackable offline speech to text - using the VOSK-API.

Nerd Dictation Offline Speech to Text for Desktop Linux. This is a utility that provides simple access speech to text for using in Linux without being

Campbell Barton 844 Jan 07, 2023
This is an AI that runs in the terminal. It is a voice assistant that can do common activities and can also help in your coding doubts like

This is an AI that runs in the terminal. It is a voice assistant that can do common activities and can also help in your coding doubts like

OneBit 1 Nov 05, 2021
[Singing Log] Let your program learn to sing!

[Singing Log] Let your program learn to sing! You must have thought this was changelog when you saw the English title, but it's not, it's chànggēlog. What it does is allow your program to print logs

黄巍 22 Sep 03, 2022
Scrap electronic music charts into CSV files

musiccharts A small python script to scrap (electronic) music charts into directories with csv files. Installation Download MusicCharts.exe Run MusicC

Dustin Scharf 1 May 11, 2022
Xbot-Music - Bot Play Music and Video in Voice Chat Group Telegram

XBOT-MUSIC A Telegram Music+video Bot written in Python using Pyrogram and Py-Tg

Fariz 2 Jan 20, 2022
Python I/O for STEM audio files

stempeg = stems + ffmpeg Python package to read and write STEM audio files. Technically, stems are audio containers that combine multiple audio stream

Fabian-Robert Stöter 72 Dec 23, 2022
🎵 A repository for manually annotating files to create labeled acoustic datasets for machine learning.

🎵 A repository for manually annotating files to create labeled acoustic datasets for machine learning.

Jim Schwoebel 28 Dec 22, 2022