Code base for the paper "Scalable One-Pass Optimisation of High-Dimensional Weight-Update Hyperparameters by Implicit Differentiation"

Overview

This repository contains code for the paper Scalable One-Pass Optimisation of High-Dimensional Weight-Update Hyperparameters by Implicit Differentiation.

Installation

Our dependencies are fully specified in Pipfile, which can be supplied to pipenv to install the environment. One failsafe approach is to install pipenv in a fresh virtual environment, then run pipenv install in this directory. Note the Pipfile specifies our Python 3.9 development environment; most experiments were run in an identical environment under Python 3.7 instead.

Difficulties with CUDA versions meant we had to manually install PyTorch and Torchvision rather than use pipenv --- the corresponding lines in Pipfile may need adjustment for your use case. Alternatively, use the list of dependencies as a guide to what to install yourself with pip, or use the full dump of our development environment in final_requirements.txt.

Datasets may not be bundled with the repository, but are expected to be found at locations specified in datasets.py, preprocessed into single PyTorch tensors of all the input and output data (generally data/<dataset>/data.pt and data/<dataset>/targets.pt).

Configuration

Training code is controlled with YAML configuration files, as per the examples in configs/. Generally one file is required to specify the dataset, and a second to specify the algorithm, using the obvious naming convention. Brief help text is available on the command line, but the meanings of each option should be reasonably self-explanatory.

For Ours (WD+LR), use the file Ours_LR.yaml; for Ours (WD+LR+M), use the file Ours_LR_Momentum.yaml; for Ours (WD+HDLR+M), use the file Ours_HDLR_Momentum.yaml. For Long/Medium/Full Diff-through-Opt, we provide separate configuration files for the UCI cases and the Fashion-MNIST cases.

We provide two additional helper configurations. Random_Validation.yaml copies Random.yaml, but uses the entire validation set to compute the validation loss at each logging step. This allows for stricter analysis of the best-performing run at particular time steps, for instance while constructing Random (3-batched). Random_Validation_BayesOpt.yaml only forces the use of the entire dataset for the very last validation loss computation, so that Bayesian Optimisation runs can access reliable performance metrics without adversely affecting runtime.

The configurations provided match those necessary to replicate the main experiments in our paper (in Section 4: Experiments). Other trials, such as those in the Appendix, will require these configurations to be modified as we describe in the paper. Note especially that our three short-horizon bias studies all require different modifications to the LongDiffThroughOpt_*.yaml configurations.

Running

Individual runs are commenced by executing train.py and passing the desired configuration files with the -c flag. For example, to run the default Fashion-MNIST experiments using Diff-through-Opt, use:

$ python train.py -c ./configs/fashion_mnist.yaml ./configs/DiffThroughOpt.yaml

Bayesian Optimisation runs are started in a similar way, but with a call to bayesopt.py rather than train.py.

For executing multiple runs in parallel, parallel_exec.py may be useful: modify the main function call at the bottom of the file as required, then call this file instead of train.py at the command line. The number of parallel workers may be specified by num_workers. Any configurations passed at the command line are used as a base, to which modifications may be added by override_generator. The latter should either be a function which generates one override dictionary per call (in which case num_repetitions sets the number of overrides to generate), or a function which returns a generator over configurations (in which case set num_repetitions = None). Each configuration override is run once for each of algorithms, whose configurations are read automatically from the corresponding files and should not be explicitly passed at the command line. Finally, main_function may be used to switch between parallel calls to train.py and bayesopt.py as required.

For blank-slate replications, the most useful override generators will be natural_sgd_generator, which generates a full SGD initialisation in the ranges we use, and iteration_id, which should be used with Bayesian Optimisation runs to name each parallel run using a counter. Other generators may be useful if you wish to supplement existing results with additional algorithms etc.

PennTreebank and CIFAR-10 were executed on clusters running SLURM; the corresponding subfolders contain configuration scripts for these experiments, and submit.sh handles the actual job submission.

Analysis

By default, runs are logged in Tensorboard format to the ./runs directory, where Tensorboard may be used to inspect the results. If desired, a descriptive name can be appended to a particular execution using the -n switch on the command line. Runs can optionally be written to a dedicated subfolder specified with the -g switch, and the base folder for logging can be changed with the -l switch.

If more precise analysis is desired, pass the directory containing the desired results to util.get_tags(), which will return a dictionary of the evolution of each logged scalar in the results. Note that this function uses Tensorboard calls which predate its --load_fast option, so may take tens of minutes to return.

This data dictionary can be passed to one of the more involved plotting routines in figures.py to produce specific plots. The script paper_plots.py generates all the plots we use in our paper, and may be inspected for details of any particular plot.

This is an official pytorch implementation of Lite-HRNet: A Lightweight High-Resolution Network.

Lite-HRNet: A Lightweight High-Resolution Network Introduction This is an official pytorch implementation of Lite-HRNet: A Lightweight High-Resolution

HRNet 675 Dec 25, 2022
Learning Features with Parameter-Free Layers (ICLR 2022)

Learning Features with Parameter-Free Layers (ICLR 2022) Dongyoon Han, YoungJoon Yoo, Beomyoung Kim, Byeongho Heo | Paper NAVER AI Lab, NAVER CLOVA Up

NAVER AI 65 Dec 07, 2022
A pytorch-based real-time segmentation model for autonomous driving

CFPNet: Channel-Wise Feature Pyramid for Real-Time Semantic Segmentation This project contains the Pytorch implementation for the proposed CFPNet: pap

342 Dec 22, 2022
Official implementation of YOGO for Point-Cloud Processing

You Only Group Once: Efficient Point-Cloud Processing with Token Representation and Relation Inference Module By Chenfeng Xu, Bohan Zhai, Bichen Wu, T

Chenfeng Xu 67 Dec 20, 2022
Learning Calibrated-Guidance for Object Detection in Aerial Images

Learning Calibrated-Guidance for Object Detection in Aerial Images arxiv We propose a simple yet effective Calibrated-Guidance (CG) scheme to enhance

51 Sep 22, 2022
Official pytorch code for SSC-GAN: Semi-Supervised Single-Stage Controllable GANs for Conditional Fine-Grained Image Generation(ICCV 2021)

SSC-GAN_repo Pytorch implementation for 'Semi-Supervised Single-Stage Controllable GANs for Conditional Fine-Grained Image Generation'.PDF SSC-GAN:Sem

tyty 4 Aug 28, 2022
Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper)

An Image is Worth 16x16 Words, What is a Video Worth? paper Official PyTorch Implementation Gilad Sharir, Asaf Noy, Lihi Zelnik-Manor DAMO Academy, Al

213 Nov 12, 2022
clustering moroccan stocks time series data using k-means with dtw (dynamic time warping)

Moroccan Stocks Clustering Context Hey! we don't always have to forecast time series am I right ? We use k-means to cluster about 70 moroccan stock pr

Ayman Lafaz 7 Oct 18, 2022
This is the official implementation of VaxNeRF (Voxel-Accelearated NeRF).

VaxNeRF Paper | Google Colab This is the official implementation of VaxNeRF (Voxel-Accelearated NeRF). This codebase is implemented using JAX, buildin

naruya 132 Nov 21, 2022
The Body Part Regression (BPR) model translates the anatomy in a radiologic volume into a machine-interpretable form.

Copyright © German Cancer Research Center (DKFZ), Division of Medical Image Computing (MIC). Please make sure that your usage of this code is in compl

MIC-DKFZ 40 Dec 18, 2022
A repository for interferometer controller code.

dses-interferometer-controller A repository for interferometer controller code, hardware, and simulations. See dses.science for more information on th

Eli Reed 1 Jan 17, 2022
A benchmark dataset for emulating atmospheric radiative transfer in weather and climate models with machine learning (NeurIPS 2021 Datasets and Benchmarks Track)

ClimART - A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models Official PyTorch Implementation Using deep le

21 Dec 31, 2022
Parasite: a tool allowing you to compress and decompress files, to reduce their size

🦠 Parasite 🦠 Parasite is a tool written in Python3 allowing you to "compress" any file, reducing its size. ⭐ Features ⭐ + Fast + Good optimization,

Billy 30 Nov 25, 2022
Implementation of Deformable Attention in Pytorch from the paper "Vision Transformer with Deformable Attention"

Deformable Attention Implementation of Deformable Attention from this paper in Pytorch, which appears to be an improvement to what was proposed in DET

Phil Wang 128 Dec 24, 2022
Python PID Tuner - Based on a FOPDT model obtained using a Open Loop Process Reaction Curve

PythonPID_Tuner Step 1: Takes a Process Reaction Curve in csv format - assumes data at 100ms interval (column names CV and PV) Step 2: Makes a rough e

6 Jan 14, 2022
Deep Learning for Natural Language Processing SS 2021 (TU Darmstadt)

Deep Learning for Natural Language Processing SS 2021 (TU Darmstadt) Task Training huge unsupervised deep neural networks yields to strong progress in

2 Aug 05, 2022
Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. CVPR 2018

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning Tensorflow code and models for the paper: Large Scale Fine-Grained Categ

Yin Cui 187 Oct 01, 2022
A repository for generating stylized talking 3D and 3D face

style_avatar A repository for generating stylized talking 3D faces and 2D videos. This is the repository for paper Imitating Arbitrary Talking Style f

Haozhe Wu 191 Dec 22, 2022
This repository contains a toolkit for collecting, labeling and tracking object keypoints

This repository contains a toolkit for collecting, labeling and tracking object keypoints. Object keypoints are semantic points in an object's coordinate frame.

ETHZ ASL 13 Dec 12, 2022
wmctrl ported to Python Ctypes

work in progress wmctrl is a command that can be used to interact with an X Window manager that is compatible with the EWMH/NetWM specification. wmctr

Iyad Ahmed 22 Dec 31, 2022