Efficient Sharpness-aware Minimization for Improved Training of Neural Networks

Code for “Efficient Sharpness-aware Minimization for Improved Training of Neural Networks”

Requirements

This code is implemented in PyTorch and has been tested in the following environment:

python = 3.8.8
torch = 1.8.0
torchvision = 0.9.0
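
To compare a local environment against the versions listed above, a check along the following lines may help. This is only a sketch and is not part of the repository: the helper names `installed_version` and `report` are hypothetical, and it relies solely on the standard library's `importlib.metadata`.

```python
from importlib import metadata
import platform

# Versions the README reports as tested.
TESTED = {"torch": "1.8.0", "torchvision": "0.9.0"}
TESTED_PYTHON = "3.8.8"


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(tested=TESTED):
    """Build one human-readable line per component: running/installed vs. tested."""
    lines = [f"python: running {platform.python_version()}, tested {TESTED_PYTHON}"]
    for name, want in tested.items():
        have = installed_version(name) or "not installed"
        lines.append(f"{name}: installed {have}, tested {want}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Other versions may well work; the list above only records what was tested.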

What is in this repository

Code for running ESAM on the CIFAR-10 and CIFAR-100 datasets.

How to use it

import torch
from utils.layer_dp_sam import ESAM

base_optimizer = torch.optim.SGD(model.parameters(), lr=args.learning_rate, momentum=0.9, weight_decay=args.weight_decay)
# `paras` must already be defined at this point (its definition is not shown in this snippet).
optimizer = ESAM(paras, base_optimizer, rho=args.rho, weight_dropout=args.weight_dropout, adaptive=args.isASAM, nograd_cutoff=args.nograd_cutoff, opt_dropout=args.opt_dropout, temperature=args.temperature)

--beta: the SWP (Stochastic Weight Perturbation) hyperparameter

--gamma: the SDS (Sharpness-sensitive Data Selection) hyperparameter
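
For orientation, the `rho` argument passed to ESAM is presumably the same quantity as in SAM [1], where it sets the radius of the weight perturbation. The sketch below shows one plain SAM update on a toy quadratic in pure Python. It is not the ESAM implementation from this repository and uses neither SWP nor SDS; the functions `loss`, `grad`, and `sam_step` and the toy objective are illustrative assumptions.

```python
import math


def loss(w):
    # Toy objective (an assumption for illustration): f(w) = 0.5 * (10 * w0^2 + w1^2)
    return 0.5 * (10.0 * w[0] ** 2 + w[1] ** 2)


def grad(w):
    # Analytic gradient of the toy objective above.
    return [10.0 * w[0], w[1]]


def sam_step(w, lr=0.05, rho=0.05):
    """One plain SAM update: ascend to a nearby perturbed point, then descend."""
    g = grad(w)
    norm = math.sqrt(sum(gi * gi for gi in g)) + 1e-12  # avoid division by zero
    # Ascent step: move a distance rho along the normalized gradient direction.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: apply the gradient taken at the perturbed point to the original weights.
    g_adv = grad(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


w = [1.0, 1.0]
for _ in range(50):
    w = sam_step(w)
print(loss(w))
```

Each SAM update needs two gradient evaluations, which is the extra cost that an efficient variant aims to reduce. With `rho=0.0` the update reduces to an ordinary gradient step.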

During training, loss_fct should be created with reduction="none" so that it returns instance-wise losses. defined_backward is the function that performs the backward pass under DDP and mixed precision.

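loss_fct = torch.nn.CrossEntropyLoss(reduction="none")  # one loss per sample, no averaging

def defined_backward():
    # As written, `loss` is read from the enclosing scope; `amp` is assumed to be NVIDIA Apex's amp module.
    if args.fp16:
        with amp.scale_loss(loss, optimizer0) as scaled_loss:
            scaled_loss.backward()
    else:
        loss.backward()

# Hand the current batch and the helpers to the optimizer before calling step().
paras = [inputs, targets, loss_fct, model, defined_backward]
optimizer.paras = paras
optimizer.step()
predictions_logits, loss = optimizer.returnthings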
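
To make the reduction="none" requirement concrete, the following pure-Python sketch computes cross-entropy one sample at a time, which is conceptually what torch.nn.CrossEntropyLoss(reduction="none") returns for class-index targets. The function name `cross_entropy_per_instance` and the example logits are hypothetical and chosen only for illustration.

```python
import math


def cross_entropy_per_instance(logits, targets):
    """Return one cross-entropy value per sample (no averaging over the batch)."""
    losses = []
    for row, target in zip(logits, targets):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        losses.append(log_sum_exp - row[target])  # equals -log softmax(row)[target]
    return losses


logits = [[2.0, 0.5, -1.0],   # sample 0: confident and correct for target 0
          [0.1, 0.2, 3.0]]    # sample 1: confident in class 2, but target is 1
targets = [0, 1]

per_instance = cross_entropy_per_instance(logits, targets)  # what reduction="none" keeps
mean_loss = sum(per_instance) / len(per_instance)           # what reduction="mean" would return
print(per_instance, mean_loss)
```

With the default mean reduction only the single averaged value would be available, so the per-sample differences seen here (the misclassified sample has the larger loss) would be lost.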

Example

bash run.sh

Reference Code

[1] SAM


Owner

Angusdu
