Visualizer using audio and semantic analysis to explore BigGAN (Brock et al., 2018) latent space.

Last update: Nov 21, 2022

Overview

BigGAN Audio Visualizer

Description

This visualizer explores the BigGAN (Brock et al., 2018) latent space by using the pitch and tempo of an audio file to generate, and interpolate between, the noise and class vectors fed to the model. Classes are chosen manually or, optionally, by semantic similarity computed on BERT encodings of a lyrics corpus.
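
The interpolation step described above can be illustrated with a minimal sketch. It is not the code in `visualize.py`: the helper names `lerp` and `interpolate_keyframes` are hypothetical, the audio analysis is left out entirely, and plain linear interpolation is assumed purely for illustration.

```python
from typing import List, Sequence


def lerp(a: Sequence[float], b: Sequence[float], t: float) -> List[float]:
    """Linearly interpolate between two equal-length vectors, with t in [0, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def interpolate_keyframes(
    keyframes: Sequence[Sequence[float]], steps: int
) -> List[List[float]]:
    """Expand keyframe vectors into one vector per video frame.

    Each consecutive pair of keyframes is bridged by `steps` frames,
    and the final keyframe is appended at the end.
    """
    if not keyframes:
        raise ValueError("at least one keyframe is required")
    if steps < 1:
        raise ValueError("steps must be at least 1")
    frames: List[List[float]] = []
    for a, b in zip(keyframes, keyframes[1:]):
        for i in range(steps):
            frames.append(lerp(a, b, i / steps))
    frames.append(list(keyframes[-1]))
    return frames


# Toy 2-dimensional "class vectors"; real BigGAN class vectors are much larger.
frames = interpolate_keyframes([[0.0, 0.0], [1.0, 2.0]], steps=4)
```

In a visualizer of this kind, the number of steps between keyframes would plausibly be driven by the audio rather than fixed, so that the vectors move faster when the music changes faster.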

Usage:

usage: visualize.py [-h] -s SONG [--resolution {128,256,512}] [-d DURATION]
               [-ps [200-295]] [-ts [0.05-0.8]]
               [--classes CLASSES [CLASSES ...]] [-n NUM_CLASSES]
               [--jitter [0-1]] [--frame_length i*2^6] [--truncation [0.1-1]]
               [--smooth_factor [10-30]] [--batch_size BATCH_SIZE]
               [-o OUTPUT_FILE] [--use_last_vectors] [--use_last_classes]
               [-l LYRICS]
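
The bracketed ranges in the usage string suggest that some arguments are validated when parsed. The sketch below shows one way such checks could be written with `argparse`. It covers only a subset of the flags, the helpers `frame_length` and `bounded_float` are hypothetical, and reading `i*2^6` as "a positive multiple of 64" is an assumption.

```python
import argparse
from typing import Callable


def frame_length(value: str) -> int:
    """Parse --frame_length, assumed here to be a positive multiple of 64."""
    n = int(value)
    if n <= 0 or n % 64 != 0:
        raise argparse.ArgumentTypeError(
            "frame_length must be a positive multiple of 64"
        )
    return n


def bounded_float(lo: float, hi: float) -> Callable[[str], float]:
    """Return a parser that accepts floats in the closed interval [lo, hi]."""
    def parse(value: str) -> float:
        x = float(value)
        if not lo <= x <= hi:
            raise argparse.ArgumentTypeError(f"value must be in [{lo}, {hi}]")
        return x
    return parse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a few of the flags listed in the usage string."""
    parser = argparse.ArgumentParser(prog="visualize.py")
    parser.add_argument("-s", "--song", required=True)
    parser.add_argument("--resolution", type=int, choices=[128, 256, 512], default=512)
    parser.add_argument(
        "-ts", "--tempo_sensitivity", type=bounded_float(0.05, 0.8), default=0.25
    )
    parser.add_argument("--frame_length", type=frame_length, default=512)
    return parser


args = build_parser().parse_args(["-s", "input/romantic.mp3", "--frame_length", "256"])
```

With checks like these, an out-of-range value is rejected at parse time with a usage error instead of failing later during generation.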

Arguments

short	long	default	range	help
`-h`	`--help`			show this help message and exit
`-s`	`--song`	`input/romantic.mp3`		path to input audio file
	`--resolution`	`512`	`{128,256,512}`	output video resolution
`-d`	`--duration`	`None`		output video duration
`-ps`	`--pitch_sensitivity`	`220`	`[200-295]`	controls the sensitivity of the class vector to changes in pitch
`-ts`	`--tempo_sensitivity`	`0.25`	`[0.05-0.8]`	controls the sensitivity of the noise vector to changes in volume and tempo
	`--classes`	`None`		manually specify [--num_classes] ImageNet classes
`-n`	`--num_classes`	`12`		number of unique classes to use
	`--jitter`	`0.5`	`[0-1]`	controls jitter of the noise vector to reduce repetition
	`--frame_length`	`512`	`i*2^6`	number of audio frames per video frame in the output
	`--truncation`	`1`	`[0.1-1]`	BigGAN truncation parameter; controls the complexity of structure within frames
	`--smooth_factor`	`20`	`[10-30]`	controls interpolation between class vectors to smooth rapid fluctuations
	`--batch_size`	`30`		BigGAN batch_size
`-o`	`--output_file`			name of output file stored in output/, defaults to [--song] path base_name
	`--use_last_vectors`	`False`		set flag to use previous saved class/noise vectors
	`--use_last_classes`	`False`		set flag to use previous classes
`-l`	`--lyrics`	`None`		path to lyrics file; setting [--lyrics LYRICS] computes classes by semantic similarity under BERT encodings
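
As a rough illustration of what choosing classes by semantic similarity can mean, the sketch below ranks candidate classes by cosine similarity to a lyrics embedding and keeps the top `n`. It is not the project's implementation: `top_classes` is a hypothetical helper, cosine similarity is an assumed metric, and the two-dimensional vectors are toy placeholders for real BERT encodings.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_classes(
    lyrics_vec: Sequence[float],
    class_vecs: Dict[str, Sequence[float]],
    n: int,
) -> List[str]:
    """Return the n class names whose embeddings are most similar to the lyrics."""
    ranked = sorted(
        class_vecs.items(),
        key=lambda item: cosine_similarity(lyrics_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:n]]


# Toy embeddings standing in for BERT encodings of class names and lyrics.
class_vecs = {"ocean": [1.0, 0.0], "volcano": [0.0, 1.0], "beach": [0.9, 0.1]}
chosen = top_classes([1.0, 0.0], class_vecs, n=2)
```

Here `n` plays the role of `--num_classes`: the lyrics embedding is compared against every candidate class, and only the closest matches are passed on to the generator.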

Owner

Rush Kapoor
