FIGARO: Generating Symbolic Music with Fine-Grained Artistic Control

Related tags

Deep Learningfigaro
Overview

FIGARO: Generating Symbolic Music with Fine-Grained Artistic Control

by Dimitri von Rütte, Luca Biggio, Yannic Kilcher, Thomas Hofmann

Getting started

Prerequisites:

  • Python 3.9
  • Conda

Setup

  1. Clone this repository to your disk
  2. Install required packages (see requirements.txt). With Conda:
conda create --name figaro python=3.9
conda activate figaro
pip install -r requirements.txt

Preparing the Data

To train models and to generate new samples, we use the Lakh MIDI dataset (altough any collection of MIDI files can be used).

  1. Download (size: 1.6GB) and extract the archive file:
wget http://hog.ee.columbia.edu/craffel/lmd/lmd_full.tar.gz
tar -xzf lmd_full.tar.gz
  1. You may wish to remove the archive file now: rm lmd_full.tar.gz

Download Pre-Trained Models

If you don't wish to train your own models, you can download our pre-trained models.

  1. Download (size: 2.3GB) and extract the archive file:
wget -O checkpoints.zip https://polybox.ethz.ch/index.php/s/a0HUHzKuPPefWkW/download
unzip checkpoints.zip
  1. You may wish to remove the archive file now: rm checkpoints.zip

Training

Training arguments such as model type, batch size, model params are passed to the training scripts via environment variables.

Available model types are:

  • vq-vae: VQ-VAE model used for the learned desription
  • figaro: FIGARO with both the expert and learned description
  • figaro-expert: FIGARO with only the expert description
  • figaro-learned: FIGARO with only the learned description
  • figaro-no-inst: FIGARO (expert) without instruments
  • figaro-no-chord: FIGARO (expert) without chords
  • figaro-no-meta: FIGARO (expert) without style (meta) information
  • baseline: Unconditional decoder-only baseline following Huang et al. (2018)

Example invocation of the training script is given by the following command:

MODEL=figaro-expert python src/train.py

For models using the learned description (figaro and figaro-learned), a pre-trained VQ-VAE checkpoint needs to be provided as well:

MODEL=figaro VAE_CHECKPOINT=./checkpoints/vq-vae.ckpt python src/train.py

Generation

To generate samples, make sure you have a trained checkpoint prepared (either download one or train it yourself). For this script, make sure that the dataset is prepared according to Preparing the Data. This is needed to extract descriptions, based on which new samples can be generated.

An example invocation of the generation script is given by the following command:

MODEL=figaro-expert CHECKPOINT=./checkpoints/figaro-expert.ckpt python src/generate.py

For models using the learned description (figaro and figaro-learned), a pre-trained VQ-VAE checkpoint needs to be provided as well:

MODEL=figaro CHECKPOINT=./checkpoints/figaro.ckpt VAE_CHECKPOINT=./checkpoints/vq-vae.ckpt python src/generate.py

Evaluation

We provide the evaluation scripts used to calculate the desription metrics on some set of generated samples. Refer to the previous section for how to generate samples yourself.

Example usage:

SAMPLE_DIR=./samples/figaro-expert python src/evaluate.py

Parameters

The following environment variables are available for controlling hyperparameters beyond their default value.

Training (train.py)

Model

Variable Description Default value
MODEL Model architecture to be trained
D_MODEL Hidden size of the model 512
CONTEXT_SIZE Number of tokens in the context to be passed to the auto-encoder 256
D_LATENT [VQ-VAE] Dimensionality of the latent space 1024
N_CODES [VQ-VAE] Codebook size 2048
N_GROUPS [VQ-VAE] Number of groups to split the latent vector into before discretization 16

Optimization

Variable Description Default value
EPOCHS Max. number of training epochs 16
MAX_TRAINING_STEPS Max. number of training iterations 100,000
BATCH_SIZE Number of samples in each batch 128
TARGET_BATCH_SIZE Number of samples in each backward step, gradients will be accumulated over TARGET_BATCH_SIZE//BATCH_SIZE batches 256
WARMUP_STEPS Number of learning rate warmup steps 4000
LEARNING_RATE Initial learning rate, will be decayed after constant warmup of WARMUP_STEPS steps 1e-4

Others

Variable Description Default value
CHECKPOINT Path to checkpoint from which to resume training
VAE_CHECKPOINT Path to VQ-VAE checkpoint to be used for the learned description
ROOT_DIR The folder containing MIDI files to train on ./lmd_full
OUTPUT_DIR Folder for saving checkpoints ./results
LOGGING_DIR Folder for saving logs ./logs
N_WORKERS Number of workers to be used for the dataloader available CPUs

Generation (generate.py)

Variable Description Default value
MODEL Specify which model will be loaded
CHECKPOINT Path to the checkpoint for the specified model
VAE_CHECKPOINT Path to the VQ-VAE checkpoint to be used for the learned description (if applicable)
ROOT_DIR Folder containing MIDI files to extract descriptions from ./lmd_full
OUTPUT_DIR Folder to save generated MIDI samples to ./samples
MAX_ITER Max. number of tokens that should be generated 16,000
MAX_BARS Max. number of bars that should be generated 32
MAKE_MEDLEYS Set to True if descriptions should be combined into medleys. False
N_MEDLEY_PIECES Number of pieces to be combined into one 2
N_MEDLEY_BARS Number of bars to take from each piece 16
VERBOSE Logging level, set to 0 for silent execution 2

Evaluation (evaluate.py)

Variable Description Default value
SAMPLE_DIR Folder containing generated samples which should be evaluated ./samples
OUT_FILE CSV file to which a detailed log of all metrics will be saved to ./metrics.csv
MAX_SAMPLES Limit the number of samples to be used for computing evaluation metrics 1024
Owner
Dimitri
Dimitri
Simple and Effective Few-Shot Named Entity Recognition with Structured Nearest Neighbor Learning

structshot Code and data for paper "Simple and Effective Few-Shot Named Entity Recognition with Structured Nearest Neighbor Learning", Yi Yang and Arz

ASAPP Research 47 Dec 27, 2022
A Re-implementation of the paper "A Deep Learning Framework for Character Motion Synthesis and Editing"

What is This This is a simple re-implementation of the paper "A Deep Learning Framework for Character Motion Synthesis and Editing"(1). Only Sections

102 Dec 14, 2022
DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference

DeeBERT This is the code base for the paper DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference. Code in this repository is also available

Castorini 132 Nov 14, 2022
Fully convolutional networks for semantic segmentation

FCN-semantic-segmentation Simple end-to-end semantic segmentation using fully convolutional networks [1]. Takes a pretrained 34-layer ResNet [2], remo

Kai Arulkumaran 186 Dec 25, 2022
Competitive Programming Club, Clinify's Official repository for CP problems hosting by club members.

Clinify-CPC_Programs This repository holds the record of the competitive programming club where the competitive coding aspirants are thriving hard and

Clinify Open Sauce 4 Aug 22, 2022
Refactoring dalle-pytorch and taming-transformers for TPU VM

Text-to-Image Translation (DALL-E) for TPU in Pytorch Refactoring Taming Transformers and DALLE-pytorch for TPU VM with Pytorch Lightning Requirements

Kim, Taehoon 61 Nov 07, 2022
A project studying the influence of communication in multi-objective normal-form games

Communication in Multi-Objective Normal-Form Games This repo consists of five different types of agents that we have used in our study of communicatio

Willem Röpke 0 Dec 17, 2021
This is an official implementation of the High-Resolution Transformer for Dense Prediction.

High-Resolution Transformer for Dense Prediction Introduction This is the official implementation of High-Resolution Transformer (HRT). We present a H

HRNet 403 Dec 13, 2022
Orthogonal Over-Parameterized Training

The inductive bias of a neural network is largely determined by the architecture and the training algorithm. To achieve good generalization, how to effectively train a neural network is of great impo

Weiyang Liu 11 Apr 18, 2022
Code for Referring Image Segmentation via Cross-Modal Progressive Comprehension, CVPR2020.

CMPC-Refseg Code of our CVPR 2020 paper Referring Image Segmentation via Cross-Modal Progressive Comprehension. Shaofei Huang*, Tianrui Hui*, Si Liu,

spyflying 55 Dec 01, 2022
Self-Supervised Document-to-Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference

Self-Supervised Document Similarity Ranking (SDR) via Contextualized Language Models and Hierarchical Inference This repo is the implementation for SD

Microsoft 36 Nov 28, 2022
This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection', CVPR 2019.

Code-and-Dataset-for-CapSal This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detec

lu zhang 48 Aug 19, 2022
Code and data for the EMNLP 2021 paper "Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts". Coming soon!

ToxiChat Code and data for the EMNLP 2021 paper "Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts". Install depen

Ashutosh Baheti 11 Jan 01, 2023
Yoga - Yoga asana classifier for python

Yoga Asana Classifier Description Hi welcome to my new deep learning project "Yo

Programminghut 35 Dec 12, 2022
Medical Insurance Cost Prediction using Machine earning

Medical-Insurance-Cost-Prediction-using-Machine-learning - Here in this project, I will use regression analysis to predict medical insurance cost for people in different regions, and based on several

1 Dec 27, 2021
PCAM: Product of Cross-Attention Matrices for Rigid Registration of Point Clouds

PCAM: Product of Cross-Attention Matrices for Rigid Registration of Point Clouds PCAM: Product of Cross-Attention Matrices for Rigid Registration of P

valeo.ai 24 May 31, 2022
Code for our paper Domain Adaptive Semantic Segmentation with Self-Supervised Depth Estimation

CorDA Code for our paper Domain Adaptive Semantic Segmentation with Self-Supervised Depth Estimation Prerequisite Please create and activate the follo

Qin Wang 60 Nov 30, 2022
The implementation of ICASSP 2020 paper "Pixel-level self-paced learning for super-resolution"

Pixel-level Self-Paced Learning for Super-Resolution This is an official implementaion of the paper Pixel-level Self-Paced Learning for Super-Resoluti

Elon Lin 41 Dec 15, 2022
Code for "Single-view robot pose and joint angle estimation via render & compare", CVPR 2021 (Oral).

Single-view robot pose and joint angle estimation via render & compare Yann Labbé, Justin Carpentier, Mathieu Aubry, Josef Sivic CVPR: Conference on C

Yann Labbé 51 Oct 14, 2022
OCR Streamlit App is used to extract text from images using python's easyocr, pytorch and streamlit packages

OCR-Streamlit-App OCR Streamlit App is used to extract text from images using python's easyocr, pytorch and streamlit packages OCR app gets an image a

Siva Prakash 5 Apr 05, 2022