Efficient neural networks for analog audio effect modeling

Overview

micro-TCN

Efficient neural networks for audio effect modeling.

| Paper | Demo | Plugin |

Setup

Install the requirements.

python3 -m venv env/
source env/bin/activate
pip install -r requirements.txt

Then install auraloss.

pip install git+https://github.com/csteinmetz1/auraloss

Pre-trained models

You can download the pre-trained models here. Then unzip as below.

mkdir lightning_logs
mv models.zip lightning_logs/
cd lightning_logs/
unzip models.zip 

Use the compy.py script in order to process audio files. Below is an example of how to run the TCN-300-C pre-trained model on GPU. This will process all the files in the audio/ directory with the limit mode engaged and a peak reduction of 42.

python comp.py -i audio/ --limit 1 --peak_red 42 --gpu

If you want to hear the output of a different model, you can pass the --model_id flag. To view the available pre-trained models (once you have downloaded them) run the following.

python comp.py --list_models

Found 13 models in ./lightning_logs/bulk
1-uTCN-300__causal__4-10-13__fraction-0.01-bs32
10-LSTM-32__1-32__fraction-1.0-bs32
11-uTCN-300__causal__3-60-5__fraction-1.0-bs32
13-uTCN-300__noncausal__30-2-15__fraction-1.0-bs32
14-uTCN-324-16__noncausal__10-2-15__fraction-1.0-bs32
2-uTCN-100__causal__4-10-5__fraction-1.0-bs32
3-uTCN-300__causal__4-10-13__fraction-1.0-bs32
4-uTCN-1000__causal__5-10-5__fraction-1.0-bs32
5-uTCN-100__noncausal__4-10-5__fraction-1.0-bs32
6-uTCN-300__noncausal__4-10-13__fraction-1.0-bs32
7-uTCN-1000__noncausal__5-10-5__fraction-1.0-bs32
8-TCN-300__noncausal__10-2-15__fraction-1.0-bs32
9-uTCN-300__causal__4-10-13__fraction-0.1-bs32

We also provide versions of the pre-trained models that have been converted to TorchScript for use in C++ here.

Evaluation

You will first need to download the SignalTrain dataset (~20GB) as well as the pre-trained models above. With this, you can then run the same evaluation pipeline used for reporting the metrics in the paper. If you would like to do this on GPU, perform the following command.

python test.py \
--root_dir /path/to/SignalTrain_LA2A_Dataset_1.1 \
--half \
--preload \
--eval_subset test \
--save_dir test_audio \

In this case, not only will the metrics be printed to terminal, we will also save out all of the processed audio from the test set to disk in the test_audio/ directory. If you would like to run the tests across the entire dataset you can specific a different string after the --eval_subset flag, as either train, val, or full.

Training

If would like to re-train the models in the paper, you can run the training script which will train all the models one by one.

python train.py \ 
--root_dir /path/to/SignalTrain_LA2A_Dataset_1.1 \
--precision 16 \
--preload \
--gpus 1 \

Plugin

We provide plugin builds (AV/VST3) for macOS. You can also build the plugin for your platform. This will require the traced models, which you can download here. First, you will need download and extract libtorch. Check the PyTorch site to find the correct version.

wget https://download.pytorch.org/libtorch/cpu/libtorch-macos-1.7.1.zip
unzip libtorch-macos-1.7.1.zip

Now move this into the realtime/ directory .

mv libtorch realtime/

We provide a ncomp.jucer file and a CMakeLists.txt that was created using FRUT. You will likely need to compile and run FRUT on this .jucer file in order to create a valid CMakeLists.txt. To do so, follow the instructions on compiling FRUT. Then convert the .jucer file. You will have to update the paths here to reflect the location of FRUT.

cd realtime/plugin/
../../FRUT/prefix/FRUT/bin/Jucer2CMake reprojucer ncomp.jucer ../../FRUT/prefix/FRUT/cmake/Reprojucer.cmake

Now you can finally build the plugin using CMake with the build.sh script. BUT, you will have to first update the path to libtorch in the build.sh script.

rm -rf build
mkdir build
cd build
cmake .. -G Xcode -DCMAKE_PREFIX_PATH=/absolute/path/to/libtorch ..
cmake --build .

Citation

If you use any of this code in your work, please consider citing us.

    @article{steinmetz2021efficient,
            title={Efficient Neural Networks for Real-time Analog Audio Effect Modeling},
            author={Steinmetz, Christian J. and Reiss, Joshua D.},
            journal={arXiv:2102.06200},
            year={2021}}
Owner
Christian Steinmetz
Building tools for musicians and audio engineers (often with machine learning). PhD Student at Queen Mary University of London.
Christian Steinmetz
Autonomous Robots Kalman Filters

Autonomous Robots Kalman Filters The Kalman Filter is an easy topic. However, ma

20 Jul 18, 2022
[CVPR 2020] 3D Photography using Context-aware Layered Depth Inpainting

[CVPR 2020] 3D Photography using Context-aware Layered Depth Inpainting [Paper] [Project Website] [Google Colab] We propose a method for converting a

Virginia Tech Vision and Learning Lab 6.2k Jan 01, 2023
PyTorch implementation of the method described in the paper VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop.

VoiceLoop PyTorch implementation of the method described in the paper VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop. VoiceLoop is a n

Meta Archive 873 Dec 15, 2022
Learn about Spice.ai with in-depth samples

Samples Learn about Spice.ai with in-depth samples ServerOps - Learn when to run server maintainance during periods of low load Gardener - Intelligent

Spice.ai 16 Mar 23, 2022
An example showing how to use jax to train resnet50 on multi-node multi-GPU

jax-multi-gpu-resnet50-example This repo shows how to use jax for multi-node multi-GPU training. The example is adapted from the resnet50 example in d

Yangzihao Wang 20 Jul 04, 2022
Yolov5-lite - Minimal PyTorch implementation of YOLOv5

Yolov5-Lite: Minimal YOLOv5 + Deep Sort Overview This repo is a shortened versio

Kadir Nar 57 Nov 28, 2022
Collect super-resolution related papers, data, repositories

Collect super-resolution related papers, data, repositories

WangChaofeng 1.7k Jan 03, 2023
A very impractical 3D rendering engine that runs in the python terminal.

Terminal-3D-Render A very impractical 3D rendering engine that runs in the python terminal. do NOT try to run this program using the standard python I

23 Dec 31, 2022
RoMA: Robust Model Adaptation for Offline Model-based Optimization

RoMA: Robust Model Adaptation for Offline Model-based Optimization Implementation of RoMA: Robust Model Adaptation for Offline Model-based Optimizatio

9 Oct 31, 2022
Python suite to construct benchmark machine learning datasets from the MIMIC-III clinical database.

MIMIC-III Benchmarks Python suite to construct benchmark machine learning datasets from the MIMIC-III clinical database. Currently, the benchmark data

Chengxi Zang 6 Jan 02, 2023
Code and data of the ACL 2021 paper: Few-Shot Text Ranking with Meta Adapted Synthetic Weak Supervision

MetaAdaptRank This repository provides the implementation of meta-learning to reweight synthetic weak supervision data described in the paper Few-Shot

THUNLP 5 Jun 16, 2022
Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers

Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers Results results on COCO val Backbone Method Lr Schd PQ Config Download

155 Dec 20, 2022
Data-driven reduced order modeling for nonlinear dynamical systems

SSMLearn Data-driven Reduced Order Models for Nonlinear Dynamical Systems This package perform data-driven identification of reduced order model based

Haller Group, Nonlinear Dynamics 27 Dec 13, 2022
PyTorch implementation of "PatchGame: Learning to Signal Mid-level Patches in Referential Games" to appear in NeurIPS 2021

PatchGame: Learning to Signal Mid-level Patches in Referential Games This repository is the official implementation of the paper - "PatchGame: Learnin

Kamal Gupta 22 Mar 16, 2022
UniLM AI - Large-scale Self-supervised Pre-training across Tasks, Languages, and Modalities

Pre-trained (foundation) models across tasks (understanding, generation and translation), languages (100+ languages), and modalities (language, image, audio, vision + language, audio + language, etc.

Microsoft 7.6k Jan 01, 2023
[内测中]前向式Python环境快捷封装工具,快速将Python打包为EXE并添加CUDA、NoAVX等支持。

QPT - Quick packaging tool 快捷封装工具 GitHub主页 | Gitee主页 QPT是一款可以“模拟”开发环境的多功能封装工具,最短只需一行命令即可将普通的Python脚本打包成EXE可执行程序,并选择性添加CUDA和NoAVX的支持,尽可能兼容更多的用户环境。 感觉还可

QPT Family 545 Dec 28, 2022
An algorithmic trading bot that learns and adapts to new data and evolving markets using Financial Python Programming and Machine Learning.

ALgorithmic_Trading_with_ML An algorithmic trading bot that learns and adapts to new data and evolving markets using Financial Python Programming and

1 Mar 14, 2022
This repository is for the preprint "A generative nonparametric Bayesian model for whole genomes"

BEAR Overview This repository contains code associated with the preprint A generative nonparametric Bayesian model for whole genomes (2021), which pro

Debora Marks Lab 10 Sep 18, 2022
A program that uses computer vision to detect hand gestures, used for controlling movie players.

HandGestureDetection This program uses a Haar Cascade algorithm to detect the presence of your hand, and then passes it on to a self-created and self-

2 Nov 22, 2022
In this project, we'll be making our own screen recorder in Python using some libraries.

Screen Recorder in Python Project Description: In this project, we'll be making our own screen recorder in Python using some libraries. Requirements:

Hassan Shahzad 4 Jan 24, 2022