Optimizing synthesizer parameters using gradient approximation

NASH 2021 Hackathon!

These are some experiments I conducted during NASH 2021, the Neural Audio Synthesis Hackathon that took place on the 18th & 19th of December.

Over the weekend I explored implementing gradient approximation for torchsynth, so that synthesizers can be included in deep learning models and training without requiring the full synth to be differentiable. The approach uses simultaneous perturbation stochastic approximation (SPSA) to estimate gradients for the synthesizer parameters. This technique was used by Marco A. Martínez Ramírez et al. in their work on Differentiable Signal Processing With Black-Box Audio Effects.
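
To make the idea concrete, here is a minimal NumPy sketch of an SPSA gradient estimate driving a black-box synth. It is not the code from this repo and does not use the torchsynth API: `render_toy_synth`, `audio_loss`, the two fixed oscillator frequencies, and the step sizes are all hypothetical choices made for illustration. The estimator perturbs every parameter at once with a random vector of ±1 values and needs only two renders per step, regardless of how many parameters there are.

```python
import numpy as np

SAMPLE_RATE = 16000
NUM_SAMPLES = 1600  # 0.1 s of audio


def render_toy_synth(params):
    """Hypothetical black-box synth: two fixed-pitch sines mixed by two gains."""
    t = np.arange(NUM_SAMPLES) / SAMPLE_RATE
    gain_a, gain_b = params
    return gain_a * np.sin(2 * np.pi * 220.0 * t) + gain_b * np.sin(2 * np.pi * 440.0 * t)


TARGET = np.array([0.8, 0.3])          # gains we pretend not to know
target_audio = render_toy_synth(TARGET)


def audio_loss(params):
    """Mean squared error between the rendered audio and the target audio."""
    return float(np.mean((render_toy_synth(params) - target_audio) ** 2))


def spsa_gradient(loss_fn, params, epsilon=1e-3, rng=None):
    """Estimate the gradient of loss_fn at params from two loss evaluations."""
    rng = np.random.default_rng() if rng is None else rng
    # Rademacher perturbation: each entry is -1 or +1 with equal probability.
    delta = rng.choice([-1.0, 1.0], size=params.shape)
    loss_plus = loss_fn(params + epsilon * delta)
    loss_minus = loss_fn(params - epsilon * delta)
    # Dividing by delta equals multiplying by it, because every entry is +/-1.
    return (loss_plus - loss_minus) / (2.0 * epsilon) * delta


rng = np.random.default_rng(0)
params = np.array([0.1, 0.1])
initial_loss = audio_loss(params)

# Plain gradient descent, with the SPSA estimate standing in for the true gradient.
for _ in range(200):
    params = params - 0.2 * spsa_gradient(audio_loss, params, rng=rng)

final_loss = audio_loss(params)
print("initial loss:", initial_loss)
print("final loss:", final_loss)
print("estimated gains:", params)
```

In a deep learning setting, an estimate like this would typically be returned from the backward pass of a custom autograd function wrapping the synth, so that upstream network weights can still be trained with ordinary backpropagation.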

I was able to start optimizing a few parameters of a simple synthesizer, but ran into issues as soon as oscillator tuning or FM was introduced. Audio loss functions are known to behave poorly when the sounds being compared differ in pitch (Turian and Henry, 2020), so this is not surprising.
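
A simplified way to see why pitch is hard is sketched below. This is an illustration only: it uses a plain waveform MSE rather than the audio distances examined in the cited paper, and the sample rate, duration, and frequency grid are assumptions chosen so that every candidate sine completes a whole number of cycles. Under those assumptions the loss is the same for every wrong pitch, so it gives an optimizer no hint about which direction to move the tuning.

```python
import numpy as np

SAMPLE_RATE = 16000
NUM_SAMPLES = 1600  # 0.1 s, so any multiple of 10 Hz fits a whole number of cycles


def sine(freq_hz):
    """Render a unit-amplitude sine at the given frequency."""
    t = np.arange(NUM_SAMPLES) / SAMPLE_RATE
    return np.sin(2.0 * np.pi * freq_hz * t)


def waveform_mse(freq_hz, target_hz=440.0):
    """Mean squared error between a candidate sine and the target sine."""
    return float(np.mean((sine(freq_hz) - sine(target_hz)) ** 2))


# Sweep the oscillator frequency from 220 Hz to 660 Hz in 10 Hz steps.
candidate_freqs = np.arange(220.0, 661.0, 10.0)
losses = np.array([waveform_mse(f) for f in candidate_freqs])

for freq, loss in zip(candidate_freqs, losses):
    print(f"{freq:6.1f} Hz  loss = {loss:.6f}")
```

Off this frequency grid the loss is no longer perfectly flat, but it still does not decrease steadily toward the target pitch, which is the kind of landscape where both exact and approximated gradients struggle.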

Nonetheless, techniques like SPSA seem promising for incorporating traditional DSP synthesis into neural nets and deep learning!

Fun weekend puttering around with this! Thank you to Ben Hayes for organizing the event.

Owner

Jordie Shier
