An Implicit Function Theorem (IFT) optimizer for bi-level optimizations

Related tags

Deep Learningiftopt
Overview

iftopt

An Implicit Function Theorem (IFT) optimizer for bi-level optimizations.

Requirements

  • Python 3.7+
  • PyTorch 1.x

Installation

$ pip install git+https://github.com/money-shredder/iftopt.git

Usage

Assuming a bi-level optimization of the form:

y* = argmin_{y} val_loss(x*, y), where x* = argmin_{x} train_loss(x, y).

To solve for the optimal x* and y* in the optimization problem, we can implement the following with iftopt:

from iftopt import HyperOptimizer
train_lr = val_lr = 0.1
# parameter to minimize the training loss
x = torch.nn.Parameter(...)
# hyper-parameter to minimize the validation loss
y = torch.nn.Parameter(...)
# training loss optimizer
opt = torch.optim.SGD([x], lr=train_lr)
# validation loss optimizer
hopt = HyperOptimizer(
    [y], torch.optim.SGD([y], lr=val_lr), vih_lr=0.1, vih_iterations=5)
# outer optimization loop for y
for _ in range(...):
    # inner optimization loop for x
    for _ in range(...):
        z = train_loss(x, y)
        # inner optimization step for x
        opt.zero_grad()
        z.backward()
        opt.step()
    # outer optimization step for y
    hopt.set_train_parameters([x])
    z = train_loss(x, y)
    hopt.train_step(z)
    v = val_loss(x, y)
    hopt.val_step(v)
    hopt.grad()
    hopt.step()

For a concrete simple example, please check out and run demo.py, where

train_loss = lambda x, y: (x + y) ** 2
val_loss = lambda x, y: x ** 2

with x = y = 1.0 initially. It will generate a video demo.mp4 showing the optimization trajectory in the animation below. Note that although the hyper-parameter y does not have a direct gradient w.r.t. the validation loss, iftopt can still minimize the validation loss by computing the hyper-gradient via implicit function theorem.

assets/demo.gif

Owner
The Money Shredder Lab
Accurate, Efficient and Robust DL
The Money Shredder Lab
Official repository for Automated Learning Rate Scheduler for Large-Batch Training (8th ICML Workshop on AutoML)

Automated Learning Rate Scheduler for Large-Batch Training The official repository for Automated Learning Rate Scheduler for Large-Batch Training (8th

Kakao Brain 35 Jan 04, 2023
PyTorch implementation of image classification models for CIFAR-10/CIFAR-100/MNIST/FashionMNIST/Kuzushiji-MNIST/ImageNet

PyTorch Image Classification Following papers are implemented using PyTorch. ResNet (1512.03385) ResNet-preact (1603.05027) WRN (1605.07146) DenseNet

1.2k Jan 04, 2023
Running AlphaFold2 (from ColabFold) in Azure Machine Learning

Running AlphaFold2 (from ColabFold) in Azure Machine Learning Colby T. Ford, Ph.D. Companion repository for Medium Post: How to predict many protein s

Colby T. Ford 3 Feb 18, 2022
Fake-user-agent-traffic-geneator - Python CLI Tool to generate fake traffic against URLs with configurable user-agents

Fake traffic generator for Gartner Demo Generate fake traffic to URLs with custo

New Relic Experimental 3 Oct 31, 2022
CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP

CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP Andreas Fürst* 1, Elisabeth Rumetshofer* 1, Viet Tran1, Hubert Ramsauer1, Fei Tang3, Joh

Institute for Machine Learning, Johannes Kepler University Linz 133 Jan 04, 2023
The repository for the paper "When Do You Need Billions of Words of Pretraining Data?"

pretraining-learning-curves This is the repository for the paper When Do You Need Billions of Words of Pretraining Data? Edge Probing We use jiant1 fo

ML² AT CILVR 19 Nov 25, 2022
My 1st place solution at Kaggle Hotel-ID 2021

1st place solution at Kaggle Hotel-ID My 1st place solution at Kaggle Hotel-ID to Combat Human Trafficking 2021. https://www.kaggle.com/c/hotel-id-202

Kohei Ozaki 18 Aug 19, 2022
GDSC-ML Team Interview Task

GDSC-ML-Team---Interview-Task Task 1 : Clean or Messy room In this task we have to classify the given test images as clean or messy. - Link for datase

Aayush. 1 Jan 19, 2022
Software & Hardware to do multi color printing with Sharpies

3D Print Colorizer is a combination of 3D printed parts and a Cura plugin which allows anyone with an Ender 3 like 3D printer to produce multi colored

343 Jan 06, 2023
PyTorch implementation of Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy

Anomaly Transformer in PyTorch This is an implementation of Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy. This pape

spencerbraun 160 Dec 19, 2022
Data Augmentation with Variational Autoencoders

Documentation Pyraug This library provides a way to perform Data Augmentation using Variational Autoencoders in a reliable way even in challenging con

112 Nov 30, 2022
A PyTorch implementation of unsupervised SimCSE

A PyTorch implementation of unsupervised SimCSE

99 Dec 23, 2022
Official Implementation of Neural Splines

Neural Splines: Fitting 3D Surfaces with Inifinitely-Wide Neural Networks This repository contains the official implementation of the CVPR 2021 (Oral)

Francis Williams 56 Nov 29, 2022
利用Tensorflow实现基于CNN的中文短文本分类

Text Classification with CNN 使用卷积神经网络进行中文文本分类 CNN做句子分类的论文可以参看: Convolutional Neural Networks for Sentence Classification 还可以去读dennybritz大牛的博客:Implemen

Jeremiah 4 Nov 08, 2022
Code for ICCV2021 paper SPEC: Seeing People in the Wild with an Estimated Camera

SPEC: Seeing People in the Wild with an Estimated Camera [ICCV 2021] SPEC: Seeing People in the Wild with an Estimated Camera, Muhammed Kocabas, Chun-

Muhammed Kocabas 187 Dec 26, 2022
BOVText: A Large-Scale, Multidimensional Multilingual Dataset for Video Text Spotting

BOVText: A Large-Scale, Bilingual Open World Dataset for Video Text Spotting Updated on December 10, 2021 (Release all dataset(2021 videos)) Updated o

weijiawu 47 Dec 26, 2022
A best practice for tensorflow project template architecture.

A best practice for tensorflow project template architecture.

Mahmoud Gamal Salem 3.6k Dec 22, 2022
PyTorch Implementation of our paper Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation

PyTorch Implementation of our paper Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation

Zechen Bai 12 Jul 08, 2022
Graph Convolutional Networks for Temporal Action Localization (ICCV2019)

Graph Convolutional Networks for Temporal Action Localization This repo holds the codes and models for the PGCN framework presented on ICCV 2019 Graph

Runhao Zeng 318 Dec 06, 2022
TensorFlow implementation of the paper "Hierarchical Attention Networks for Document Classification"

Hierarchical Attention Networks for Document Classification This is an implementation of the paper Hierarchical Attention Networks for Document Classi

Quoc-Tuan Truong 83 Dec 05, 2022