Pytorch Implementation of paper "Noisy Natural Gradient as Variational Inference"

Overview

Noisy Natural Gradient as Variational Inference

PyTorch implementation of Noisy Natural Gradient as Variational Inference.

Requirements

  • Python 3
  • Pytorch
  • visdom

Comments

  • This paper is about how to optimize bayesian neural network which has matrix variate gaussian distribution.
  • This implementation contains Noisy Adam optimizer which is for Fully Factorized Gaussian(FFG) distribution, and Noisy KFAC optimizer which is for Matrix Variate Gaussian(MVG) distribution.
  • These optimizers only work with bayesian network which has specific structure that I will mention below.
  • Currently only linear layer is available.

Experimental comments

  • I addded a lr scheduler to noisy KFAC because loss is exploded during training. I guess this happens because of slight approximation.
  • For MNIST training noisy KFAC is 15-20x slower than noisy Adam, as mentioned in paper.
  • I guess the noisy KFAC needs more epochs to train simple neural network structure like 2 linear layers.

Usage

Currently only MNIST dataset are currently supported, and only fully connected layer is implemented.

Options

  • model : Fully Factorized Gaussian(FFG) or Matrix Variate Gaussian(MVG)
  • n : total train dataset size. need this value for optimizer.
  • eps : parameter for optimizer. Default to 1e-8.
  • initial_size : initial input tensor size. Default to 784, size of MNIST data.
  • label_size : label size. Default to 10, size of MNIST label.

More details in option_parser.py

Train

$ python train.py --model=FFG --batch_size=100 --lr=1e-3 --dataset=MNIST
$ python train.py --model=MVG --batch_size=100 --lr=1e-2 --dataset=MNIST --n=60000

Visualize

  • To visualize intermediate results and loss plots, run python -m visdom.server and go to the URL http://localhost:8097

Test

$ python test.py --epoch=20

Training Graphs

1. MNIST

  • network is consist of 2 linear layers.
  • FFG optimized by noisy Adam : epoch 20, lr 1e-3

  • MVG optimized by noisy KFAC : epoch 100, lr 1e-2, decay 0.1 for every 30 epochs
  • Need to tune learning rate.

Implementation detail

  • Optimizing parameter procedure is consists of 2 steps, Calculating gradient and Applying to bayeisan parameters.
  • Before forward, network samples parameters with means & variances.
  • Usually calling step function updates parameters, but not this case. After calling step function, you have to update bayesian parameters. Look at the ffg_model.py

TODOs

  • More benchmark cases
  • Supports bayesian convolution
  • Implement Block Tridiagonal Covariance, which is dependent between layers.

Code reference

Visualization code(visualizer.py, utils.py) references to pytorch-CycleGAN-and-pix2pix(https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) by Jun-Yan Zhu

Author

Tony Kim

Owner
Tony JiHyun Kim
CEO/Tech Lead @PostAlpine Co., Ltd.
Tony JiHyun Kim
Credo AI Lens is a comprehensive assessment framework for AI systems. Lens standardizes model and data assessment, and acts as a central gateway to assessments created in the open source community.

Lens by Credo AI - Responsible AI Assessment Framework Lens is a comprehensive assessment framework for AI systems. Lens standardizes model and data a

Credo AI 27 Dec 14, 2022
This repository contains an implementation of ConvMixer for the ICLR 2022 submission "Patches Are All You Need?".

Patches Are All You Need? 🤷 This repository contains an implementation of ConvMixer for the ICLR 2022 submission "Patches Are All You Need?". Code ov

ICLR 2022 Author 934 Dec 30, 2022
This repository implements WGAN_GP.

Image_WGAN_GP This repository implements WGAN_GP. Image_WGAN_GP This repository uses wgan to generate mnist and fashionmnist pictures. Firstly, you ca

Lieon 6 Dec 10, 2021
MMRazor: a model compression toolkit for model slimming and AutoML

Documentation: https://mmrazor.readthedocs.io/ English | 简体中文 Introduction MMRazor is a model compression toolkit for model slimming and AutoML, which

OpenMMLab 899 Jan 02, 2023
This repository contains the exercises and its solution contained in the book "An Introduction to Statistical Learning" in python.

An-Introduction-to-Statistical-Learning This repository contains the exercises and its solution contained in the book An Introduction to Statistical L

2.1k Jan 02, 2023
Improving Compound Activity Classification via Deep Transfer and Representation Learning

Improving Compound Activity Classification via Deep Transfer and Representation Learning This repository is the official implementation of Improving C

NingLab 2 Nov 24, 2021
Making Structure-from-Motion (COLMAP) more robust to symmetries and duplicated structures

SfM disambiguation with COLMAP About Structure-from-Motion generally fails when the scene exhibits symmetries and duplicated structures. In this repos

Computer Vision and Geometry Lab 193 Dec 26, 2022
Breaking the Dilemma of Medical Image-to-image Translation

Breaking the Dilemma of Medical Image-to-image Translation Supervised Pix2Pix and unsupervised Cycle-consistency are two modes that dominate the field

Kid Liet 86 Dec 21, 2022
Proximal Backpropagation - a neural network training algorithm that takes implicit instead of explicit gradient steps

Proximal Backpropagation Proximal Backpropagation (ProxProp) is a neural network training algorithm that takes implicit instead of explicit gradient s

Thomas Frerix 40 Dec 17, 2022
This is an official repository of CLGo: Learning to Predict 3D Lane Shape and Camera Pose from a Single Image via Geometry Constraints

CLGo This is an official repository of CLGo: Learning to Predict 3D Lane Shape and Camera Pose from a Single Image via Geometry Constraints An earlier

刘芮金 32 Dec 20, 2022
Gray Zone Assessment

Gray Zone Assessment Get started Clone github repository git clone https://github.com/andreanne-lemay/gray_zone_assessment.git Build docker image dock

1 Jan 08, 2022
PyTorch implementation of DirectCLR from paper Understanding Dimensional Collapse in Contrastive Self-supervised Learning

DirectCLR DirectCLR is a simple contrastive learning model for visual representation learning. It does not require a trainable projector as SimCLR. It

Meta Research 49 Dec 21, 2022
Some useful blender add-ons for SMPL skeleton's poses and global translation.

Blender add-ons for SMPL skeleton's poses and trans There are two blender add-ons for SMPL skeleton's poses and trans.The first is for making an offli

犹在镜中 154 Jan 04, 2023
This repository consists of Blender python scripts and corresponding assets to generate variants of the CANDLE dataset

candle-simulator This repository consists of Blender python scripts and corresponding assets to generate variants of the IITH-CANDLE dataset. The rend

1 Dec 15, 2021
Multi-angle c(q)uestion answering

Macaw Introduction Macaw (Multi-angle c(q)uestion answering) is a ready-to-use model capable of general question answering, showing robustness outside

AI2 430 Jan 04, 2023
PECOS - Prediction for Enormous and Correlated Spaces

PECOS - Predictions for Enormous and Correlated Output Spaces PECOS is a versatile and modular machine learning (ML) framework for fast learning and i

Amazon 387 Jan 04, 2023
UDP++ (ECCVW 2020 Oral), (Winner of COCO 2020 Keypoint Challenge).

UDP-Pose This is the pytorch implementation for UDP++, which won the Fisrt place in COCO Keypoint Challenge at ECCV 2020 Workshop. Top-Down Results on

20 Jul 29, 2022
Designing a Practical Degradation Model for Deep Blind Image Super-Resolution (ICCV, 2021) (PyTorch) - We released the training code!

Designing a Practical Degradation Model for Deep Blind Image Super-Resolution Kai Zhang, Jingyun Liang, Luc Van Gool, Radu Timofte Computer Vision Lab

Kai Zhang 804 Jan 08, 2023
PyExplainer: A Local Rule-Based Model-Agnostic Technique (Explainable AI)

PyExplainer PyExplainer is a local rule-based model-agnostic technique for generating explanations (i.e., why a commit is predicted as defective) of J

AI Wizards for Software Management (AWSM) Research Group 14 Nov 13, 2022
Codebase for Time-series Generative Adversarial Networks (TimeGAN)

Codebase for Time-series Generative Adversarial Networks (TimeGAN)

Jinsung Yoon 532 Dec 31, 2022