Supporting code for the Neograd algorithm

Related tags

Deep LearningNeograd
Overview

Neograd

This repo supports the paper Neograd: Gradient Descent with a Near-Ideal Learning Rate, which introduces the algorithm "Neograd". The paper and associated code are by Michael F. Zimmer. It's been submitted to JMLR.

Getting Started

Download the code. Paths within the program are relative.

Prerequisites

Python 3
Jupyter notebook

Installing

Unzip/clone the repo. You should see this directory structure:
neograd/
libs/
notebooks/
figs/
The meaning of these names is self-explanatory. Only the name "notebooks" can be changed without interfering with the paths.

Running Notebooks

After cd-ing into the "notebooks" directory, open a notebook in Jupyter and execute the cells. If you choose to uncomment certain lines (the save fig command) a figure will be saved for you. Some of these are the same figs that appear in the aforementioned paper.

Descriptions of notebooks

These experiment notebooks contain evaluations of algorithms against the named cost fcn
EXPT_2Dshell
EXPT_Beale
EXPT_double
EXPT_quartic
EXPT_sigmoid-well

Additionally, these contain additional tests.
EXPT_hybrid
EXPT_manual
EXPT_momentum

Descriptions of libraries

algos_vec
Functions that are central to the GD family and Neograd family.

common
Functions for rho, alpha, and functions for tracking results of a run.

common_vec
Functions used by algos_vec, which aren't central to the algorithms. Also, these functions have a specific assumption that the "parameter vector" is a numpy array.

costgrad_vec
This is an aggregation of all the functions needed to compute the cost and gradient of the specific cost functions examined in the paper.

params
Contains all global parameters (not to be confused with the parameter vector that is being optimized). Also present is a function to return a "good choice" of alpha for each algorithm-cost function combination, as determined by trial and error.

plotting
The plotting functions are passed the dictionaries of results returned by the optimization runs

A few details

"p" represents the parameter vector in the repo; note this differs from "theta" which is used in the paper.

Statistics during the run are accumulated by a dictionary of lists. The keys in the dictionary contain the name of the statistic, and the "values" are lists. Before entering the main loop, the names/keys must be declared; this is done in the function "init_results". After each iteration, a list will have a value appended to it; this is done in the function "update_results". Both of these functions are in the "common" library.

If you set the total iteration number ("num") too high, you may find you get underflow errors plus their ramifications. This is because the Neograd algorithm will drive the error down to be so small, it bumps up against machine precision. There are a number of sophisticated ways to handle this, but for the purposes here it is enough to simply stop the optimization before it becomes an issue.

In the code on github, this alternative definition of rho may be used. Simply change the parameter "g_rhotype" to "original", instead of "new". This is discussed in an appendix of the paper.

Author

Michael F. Zimmer

License

This project is licensed under the MIT license.

Owner
Michael Zimmer
Michael Zimmer
ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs

(Comet-) ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs Paper Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jeff Da, Keisuke Sa

AI2 152 Dec 27, 2022
A Broader Picture of Random-walk Based Graph Embedding

Random-walk Embedding Framework This repository is a reference implementation of the random-walk embedding framework as described in the paper: A Broa

Zexi Huang 23 Dec 13, 2022
FinGAT: A Financial Graph Attention Networkto Recommend Top-K Profitable Stocks

FinGAT: A Financial Graph Attention Networkto Recommend Top-K Profitable Stocks This is our implementation for the paper: FinGAT: A Financial Graph At

Yu-Che Tsai 64 Dec 13, 2022
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Daft-Exprt - PyTorch Implementation PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis The

Keon Lee 47 Dec 18, 2022
BC3407-Group-5-Project - BC3407 Group Project With Python

BC3407-Group-5-Project As the world struggles to contain the ever-changing varia

1 Jan 26, 2022
Pytorch implementation of Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization https://arxiv.org/abs/2008.11646

[TCSVT] Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization LPN [Paper] NEWs Prerequisites Python 3.6 GPU Memory = 8G Numpy 1.

46 Dec 14, 2022
PRIN/SPRIN: On Extracting Point-wise Rotation Invariant Features

PRIN/SPRIN: On Extracting Point-wise Rotation Invariant Features Overview This repository is the Pytorch implementation of PRIN/SPRIN: On Extracting P

Yang You 17 Mar 02, 2022
Denoising Normalizing Flow

Denoising Normalizing Flow Christian Horvat and Jean-Pascal Pfister 2021 We combine Normalizing Flows (NFs) and Denoising Auto Encoder (DAE) by introd

CHrvt 17 Oct 15, 2022
[CVPR 21] Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021.

Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, CVPR 2021. Ayan Kumar Bhunia, Pinaki nath Chowdhury, Yongxin Yan

Ayan Kumar Bhunia 44 Dec 12, 2022
Simple machine learning library / 簡單易用的機器學習套件

FukuML Simple machine learning library / 簡單易用的機器學習套件 Installation $ pip install FukuML Tutorial Lesson 1: Perceptron Binary Classification Learning Al

Fukuball Lin 279 Sep 15, 2022
Rational Activation Functions - Replacing Padé Activation Units

Rational Activations - Learnable Rational Activation Functions First introduce as PAU in Padé Activation Units: End-to-end Learning of Activation Func

<a href=[email protected]"> 38 Nov 22, 2022
Pytorch Lightning 1.2k Jan 06, 2023
Simple converter for deploying Stable-Baselines3 model to TFLite and/or Coral

Running SB3 developed agents on TFLite or Coral Introduction I've been using Stable-Baselines3 to train agents against some custom Gyms, some of which

Gary Briggs 16 Oct 11, 2022
How will electric vehicles affect traffic congestion and energy consumption: an integrated modelling approach

EV-charging-impact This repository contains the code that has been used for the Queue modelling for the paper "How will electric vehicles affect traff

7 Nov 30, 2022
Pytorch implementation of FlowNet by Dosovitskiy et al.

FlowNetPytorch Pytorch implementation of FlowNet by Dosovitskiy et al. This repository is a torch implementation of FlowNet, by Alexey Dosovitskiy et

Clément Pinard 762 Jan 02, 2023
Language Models for the legal domain in Spanish done @ BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).

Spanish legal domain Language Model ⚖️ This repository contains the page for two main resources for the Spanish legal domain: A RoBERTa model: https:/

Plan de Tecnologías del Lenguaje - Gobierno de España 12 Nov 14, 2022
The full training script for Enformer (Tensorflow Sonnet) on TPU clusters

Enformer TPU training script (wip) The full training script for Enformer (Tensorflow Sonnet) on TPU clusters, in an effort to migrate the model to pyt

Phil Wang 10 Oct 19, 2022
An unofficial styleguide and best practices summary for PyTorch

A PyTorch Tools, best practices & Styleguide This is not an official style guide for PyTorch. This document summarizes best practices from more than a

IgorSusmelj 1.5k Jan 05, 2023
Cluttered MNIST Dataset

Cluttered MNIST Dataset A setup script will download MNIST and produce mnist/*.t7 files: luajit download_mnist.lua Example usage: local mnist_clutter

DeepMind 50 Jul 12, 2022
A python package simulating the quasi-2D pseudospin-1/2 Gross-Pitaevskii equation with NVIDIA GPU acceleration.

A python package simulating the quasi-2D pseudospin-1/2 Gross-Pitaevskii equation with NVIDIA GPU acceleration. Introduction spinor-gpe is high-level,

2 Sep 20, 2022