Code base for the paper "Scalable One-Pass Optimisation of High-Dimensional Weight-Update Hyperparameters by Implicit Differentiation"

Overview

This repository contains code for the paper Scalable One-Pass Optimisation of High-Dimensional Weight-Update Hyperparameters by Implicit Differentiation.

Installation

Our dependencies are fully specified in Pipfile, which can be supplied to pipenv to install the environment. One failsafe approach is to install pipenv in a fresh virtual environment, then run pipenv install in this directory. Note the Pipfile specifies our Python 3.9 development environment; most experiments were run in an identical environment under Python 3.7 instead.

Difficulties with CUDA versions meant we had to manually install PyTorch and Torchvision rather than use pipenv --- the corresponding lines in Pipfile may need adjustment for your use case. Alternatively, use the list of dependencies as a guide to what to install yourself with pip, or use the full dump of our development environment in final_requirements.txt.

Datasets may not be bundled with the repository, but are expected to be found at locations specified in datasets.py, preprocessed into single PyTorch tensors of all the input and output data (generally data/<dataset>/data.pt and data/<dataset>/targets.pt).

Configuration

Training code is controlled with YAML configuration files, as per the examples in configs/. Generally one file is required to specify the dataset, and a second to specify the algorithm, using the obvious naming convention. Brief help text is available on the command line, but the meanings of each option should be reasonably self-explanatory.

For Ours (WD+LR), use the file Ours_LR.yaml; for Ours (WD+LR+M), use the file Ours_LR_Momentum.yaml; for Ours (WD+HDLR+M), use the file Ours_HDLR_Momentum.yaml. For Long/Medium/Full Diff-through-Opt, we provide separate configuration files for the UCI cases and the Fashion-MNIST cases.

We provide two additional helper configurations. Random_Validation.yaml copies Random.yaml, but uses the entire validation set to compute the validation loss at each logging step. This allows for stricter analysis of the best-performing run at particular time steps, for instance while constructing Random (3-batched). Random_Validation_BayesOpt.yaml only forces the use of the entire dataset for the very last validation loss computation, so that Bayesian Optimisation runs can access reliable performance metrics without adversely affecting runtime.

The configurations provided match those necessary to replicate the main experiments in our paper (in Section 4: Experiments). Other trials, such as those in the Appendix, will require these configurations to be modified as we describe in the paper. Note especially that our three short-horizon bias studies all require different modifications to the LongDiffThroughOpt_*.yaml configurations.

Running

Individual runs are commenced by executing train.py and passing the desired configuration files with the -c flag. For example, to run the default Fashion-MNIST experiments using Diff-through-Opt, use:

$ python train.py -c ./configs/fashion_mnist.yaml ./configs/DiffThroughOpt.yaml

Bayesian Optimisation runs are started in a similar way, but with a call to bayesopt.py rather than train.py.

For executing multiple runs in parallel, parallel_exec.py may be useful: modify the main function call at the bottom of the file as required, then call this file instead of train.py at the command line. The number of parallel workers may be specified by num_workers. Any configurations passed at the command line are used as a base, to which modifications may be added by override_generator. The latter should either be a function which generates one override dictionary per call (in which case num_repetitions sets the number of overrides to generate), or a function which returns a generator over configurations (in which case set num_repetitions = None). Each configuration override is run once for each of algorithms, whose configurations are read automatically from the corresponding files and should not be explicitly passed at the command line. Finally, main_function may be used to switch between parallel calls to train.py and bayesopt.py as required.

For blank-slate replications, the most useful override generators will be natural_sgd_generator, which generates a full SGD initialisation in the ranges we use, and iteration_id, which should be used with Bayesian Optimisation runs to name each parallel run using a counter. Other generators may be useful if you wish to supplement existing results with additional algorithms etc.

PennTreebank and CIFAR-10 were executed on clusters running SLURM; the corresponding subfolders contain configuration scripts for these experiments, and submit.sh handles the actual job submission.

Analysis

By default, runs are logged in Tensorboard format to the ./runs directory, where Tensorboard may be used to inspect the results. If desired, a descriptive name can be appended to a particular execution using the -n switch on the command line. Runs can optionally be written to a dedicated subfolder specified with the -g switch, and the base folder for logging can be changed with the -l switch.

If more precise analysis is desired, pass the directory containing the desired results to util.get_tags(), which will return a dictionary of the evolution of each logged scalar in the results. Note that this function uses Tensorboard calls which predate its --load_fast option, so may take tens of minutes to return.

This data dictionary can be passed to one of the more involved plotting routines in figures.py to produce specific plots. The script paper_plots.py generates all the plots we use in our paper, and may be inspected for details of any particular plot.

Generalized Proximal Policy Optimization with Sample Reuse (GePPO)

Generalized Proximal Policy Optimization with Sample Reuse This repository is the official implementation of the reinforcement learning algorithm Gene

Jimmy Queeney 9 Nov 28, 2022
Joint parameterization and fitting of stroke clusters

StrokeStrip: Joint Parameterization and Fitting of Stroke Clusters Dave Pagurek van Mossel1, Chenxi Liu1, Nicholas Vining1,2, Mikhail Bessmeltsev3, Al

Dave Pagurek 44 Dec 01, 2022
A collection of IPython notebooks covering various topics.

ipython-notebooks This repo contains various IPython notebooks I've created to experiment with libraries and work through exercises, and explore subje

John Wittenauer 2.6k Jan 01, 2023
Using multidimensional LSTM neural networks to create a forecast for Bitcoin price

Multidimensional LSTM BitCoin Time Series Using multidimensional LSTM neural networks to create a forecast for Bitcoin price. For notes around this co

Jakob Aungiers 318 Dec 14, 2022
In-place Parallel Super Scalar Samplesort (IPS⁴o)

In-place Parallel Super Scalar Samplesort (IPS⁴o) This is the implementation of the algorithm IPS⁴o presented in the paper Engineering In-place (Share

82 Dec 22, 2022
Recovering Brain Structure Network Using Functional Connectivity

Recovering-Brain-Structure-Network-Using-Functional-Connectivity Framework: Papers: This repository provides a PyTorch implementation of the models ad

5 Nov 30, 2022
A C implementation for creating 2D voronoi diagrams

Branch OSX/Linux Windows master dev jc_voronoi A fast C/C++ header only implementation for creating 2D Voronoi diagrams from a point set Uses Fortune'

Mathias Westerdahl 481 Dec 29, 2022
Unoffical implementation about Image Super-Resolution via Iterative Refinement by Pytorch

Image Super-Resolution via Iterative Refinement Paper | Project Brief This is a unoffical implementation about Image Super-Resolution via Iterative Re

LiangWei Jiang 2.5k Jan 02, 2023
A set of examples around hub for creating and processing datasets

Examples for Hub - Dataset Format for AI A repository showcasing examples of using Hub Uploading Dataset Places365 Colab Tutorials Notebook Link Getti

Activeloop 11 Dec 14, 2022
Working demo of the Multi-class and Anomaly classification model using the CLIP feature space

👁️ Hindsight AI: Crime Classification With Clip About For Educational Purposes Only This is a recursive neural net trained to classify specific crime

Miles Tweed 2 Jun 05, 2022
A model to classify a piece of news as REAL or FAKE

Fake_news_classification A model to classify a piece of news as REAL or FAKE. This python project of detecting fake news deals with fake and real news

Gokul Stark 1 Jan 29, 2022
ColossalAI-Examples - Examples of training models with hybrid parallelism using ColossalAI

ColossalAI-Examples This repository contains examples of training models with Co

HPC-AI Tech 185 Jan 09, 2023
How to Predict Stock Prices Easily Demo

How-to-Predict-Stock-Prices-Easily-Demo How to Predict Stock Prices Easily - Intro to Deep Learning #7 by Siraj Raval on Youtube ##Overview This is th

Siraj Raval 752 Nov 16, 2022
Test-Time Personalization with a Transformer for Human Pose Estimation, NeurIPS 2021

Transforming Self-Supervision in Test Time for Personalizing Human Pose Estimation This is an official implementation of the NeurIPS 2021 paper: Trans

41 Nov 28, 2022
MM1 and MMC Queue Simulation using python - Results and parameters in excel and csv files

implementation of MM1 and MMC Queue on randomly generated data and evaluate simulation results then compare with analytical results and draw a plot curve for them, simulate some integrals and compare

Mohamadreza Rezaei 1 Jan 19, 2022
High-resolution networks and Segmentation Transformer for Semantic Segmentation

High-resolution networks and Segmentation Transformer for Semantic Segmentation Branches This is the implementation for HRNet + OCR. The PyTroch 1.1 v

HRNet 2.8k Jan 07, 2023
Offical implementation of Shunted Self-Attention via Multi-Scale Token Aggregation

Shunted Transformer This is the offical implementation of Shunted Self-Attention via Multi-Scale Token Aggregation by Sucheng Ren, Daquan Zhou, Shengf

156 Dec 27, 2022
UFPR-ADMR-v2 Dataset

UFPR-ADMR-v2 Dataset The UFPR-ADMRv2 dataset contains 5,000 dial meter images obtained on-site by employees of the Energy Company of Paraná (Copel), w

Gabriel Salomon 8 Sep 29, 2022
Official implementation of "SinIR: Efficient General Image Manipulation with Single Image Reconstruction" (ICML 2021)

SinIR (Official Implementation) Requirements To install requirements: pip install -r requirements.txt We used Python 3.7.4 and f-strings which are in

47 Oct 11, 2022
Uncertainty Estimation via Response Scaling for Pseudo-mask Noise Mitigation in Weakly-supervised Semantic Segmentation

Uncertainty Estimation via Response Scaling for Pseudo-mask Noise Mitigation in Weakly-supervised Semantic Segmentation Introduction This is a PyTorch

XMed-Lab 30 Sep 23, 2022