A tf.keras implementation of Facebook AI's MadGrad optimization algorithm

Last update: Aug 18, 2022

Overview

MADGRAD Optimization Algorithm For TensorFlow

This package implements the MadGrad algorithm proposed in "Adaptivity without Compromise: A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization" (Aaron Defazio and Samy Jelassi, 2021).

Table of Contents

About The Project
Getting Started
- Prerequisites
- Installation
Usage
Contributing
License
Contact
Citations

About The Project

The MadGrad optimization algorithm combines dual averaging of gradients with momentum-based adaptivity to attain results that match or outperform Adam and SGD with momentum. This project offers a TensorFlow implementation of the algorithm along with a few usage examples and tests.
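To make the idea of dual averaging with momentum more concrete, here is a minimal NumPy sketch of the update rule as described in the cited paper. It is an illustration only and is not the code shipped in this package. The function name `madgrad_minimize`, the `grad_fn` callback, and the default values for `lr`, `momentum`, and `eps` are assumptions chosen for this example, and weight decay is left out.

```python
import numpy as np


def madgrad_minimize(grad_fn, x0, lr=0.01, momentum=0.9, eps=1e-6, steps=100):
    """Illustrative MADGRAD loop (hypothetical helper, not this package's API)."""
    x0 = np.asarray(x0, dtype=np.float64)
    x = x0.copy()
    s = np.zeros_like(x)   # weighted sum of gradients (dual average)
    nu = np.zeros_like(x)  # weighted sum of squared gradients
    c = 1.0 - momentum     # interpolation weight used for the momentum step

    for k in range(steps):
        g = grad_fn(x)
        lam = lr * np.sqrt(k + 1)         # step weight grows with sqrt(k + 1)
        s += lam * g
        nu += lam * g * g
        z = x0 - s / (np.cbrt(nu) + eps)  # dual averaging step with cube-root scaling
        x = (1.0 - c) * x + c * z         # move the iterate toward z (momentum)
    return x


# Example: minimize 0.5 * ||x - target||^2, whose gradient is x - target.
target = np.array([3.0, -2.0])
x_final = madgrad_minimize(lambda x: x - target, np.zeros(2), steps=200)
print(x_final)
```

Note that the dual averaging step is always taken from the starting point `x0` using the accumulated gradient sums, rather than from the current iterate as in SGD or Adam.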

Prerequisites

Prerequisites can be installed separately from the requirements.txt file:

pip install -r requirements.txt

Installation

This project is built with Python 3 and can be installed directly with pip:

pip install tf-madgrad

Usage

To use the optimizer in any tf.keras model, import the MadGrad optimizer from the madgrad module (installed by the tf-madgrad package) and instantiate it.

import tensorflow as tf
from madgrad import MadGrad

# Create the architecture
inp = tf.keras.layers.Input(shape=shape)
...
# Apply the output layer to the last tensor produced by the
# layers elided above (called x here for illustration)
op = tf.keras.layers.Dense(classes, activation=activation)(x)

# Instantiate the model
model = tf.keras.models.Model(inp, op)

# Pass the MadGrad optimizer to the compile function
model.compile(optimizer=MadGrad(lr=0.01), loss=loss)

# Fit the keras model as normal
model.fit(...)

This implementation also supports distributed training with tf.distribute strategies.

See an MNIST example here

Contributing

Any and all contributions are welcome. Please raise an issue if the optimizer gives incorrect results or crashes unexpectedly during training.

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Feel free to reach out with any issues or requests related to this implementation:

Darshan Deshpande - Email | LinkedIn

Citations

@misc{defazio2021adaptivity,
      title={Adaptivity without Compromise: A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization}, 
      author={Aaron Defazio and Samy Jelassi},
      year={2021},
      eprint={2101.11075},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
