A PyTorch implementation of Sharpness-Aware Minimization for Efficiently Improving Generalization

Last update: Dec 28, 2022

Overview

sam.pytorch

A PyTorch implementation of Sharpness-Aware Minimization for Efficiently Improving Generalization (Foret+ 2020). See the paper (arXiv:2010.01412) and the official implementation.

Requirements

Python>=3.8
PyTorch>=1.7.1

To run the example, you also need:

homura by pip install -U homura-core==2020.12.0
chika by pip install -U chika

Example

python cifar10.py [--optim.name {sam,sgd}] [--model {resnet20,wrn28_2}] [--optim.rho 0.05]

Results: Test Accuracy (CIFAR-10)

Model	SAM	SGD
ResNet-20	93.5	93.2
WRN28-2	95.8	95.4
ResNeXT29	96.4	95.8

SAM needs two forward-backward passes per update, so training with SAM is slower than training with SGD. For ResNet-20, training took about 80 minutes with SAM versus 50 minutes with SGD in my environment. The additional options --use_amp and --jit_model may slightly accelerate training.
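The extra cost comes from SAM evaluating the gradient twice per update: once at the current weights to find an ascent direction, and once at the perturbed weights to compute the actual update. The following is a minimal pure-Python sketch of that idea on a toy quadratic loss, not the code from sam.py. The names sam_step, loss_fn, and grad_fn, the toy loss, and the step count are all illustrative assumptions.

```python
import math


def sam_step(w, grad_fn, lr=0.05, rho=0.05):
    """One SAM-style update on a list of floats (illustrative sketch)."""
    # First gradient evaluation: direction of steepest ascent at w.
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g)) + 1e-12  # avoid division by zero
    # Move to the perturbed point w + rho * g / ||g||.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Second gradient evaluation: gradient at the perturbed point.
    g_adv = grad_fn(w_adv)
    # Apply the perturbed-point gradient to the ORIGINAL weights.
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


def loss_fn(w):
    # Toy quadratic loss (hypothetical), steeper along the second coordinate.
    return 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)


calls = {"n": 0}  # counts gradient evaluations


def grad_fn(w):
    calls["n"] += 1
    return [w[0], 10.0 * w[1]]


w = [1.0, 1.0]
initial_loss = loss_fn(w)
steps = 50
for _ in range(steps):
    w = sam_step(w, grad_fn)
final_loss = loss_fn(w)

print(f"loss: {initial_loss:.3f} -> {final_loss:.5f}")
print(f"gradient evaluations: {calls['n']} for {steps} updates")
```

Each update triggers two gradient evaluations, which is consistent with SAM taking noticeably longer than SGD in the timings above.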

Usage

SAMSGD can be used as a drop-in replacement for PyTorch optimizers that are called with closures. It is also compatible with lr_scheduler and provides state_dict and load_state_dict.

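from sam import SAMSGD

# model, loss_f, and dataset are assumed to be defined elsewhere
optimizer = SAMSGD(model.parameters(), lr=1e-1, rho=0.05)

for input, target in dataset:
    # The closure recomputes the loss and its gradients each time it is called
    def closure():
        optimizer.zero_grad()
        output = model(input)
        loss = loss_f(output, target)
        loss.backward()
        return loss

    loss = optimizer.step(closure)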
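The closure, lr_scheduler, and state_dict pattern described above can be sketched end to end as follows. Since the sam module may not be installed, this sketch uses torch.optim.SGD as a stand-in; the README states that SAMSGD follows the same interface, so swapping in SAMSGD(model.parameters(), lr=1e-1, rho=0.05) should work the same way, though that is not verified here. The toy model, random data, and StepLR settings are assumptions chosen only for illustration.

```python
import torch
from torch import nn
from torch.optim import SGD
from torch.optim.lr_scheduler import StepLR

torch.manual_seed(0)

model = nn.Linear(4, 2)          # toy model (assumption)
loss_f = nn.CrossEntropyLoss()

# Stand-in for SAMSGD; both are driven through step(closure).
optimizer = SGD(model.parameters(), lr=1e-1)
scheduler = StepLR(optimizer, step_size=1, gamma=0.5)

# Three random mini-batches standing in for a real dataset.
data = [(torch.randn(8, 4), torch.randint(0, 2, (8,))) for _ in range(3)]

for epoch in range(2):
    for inputs, targets in data:
        def closure():
            optimizer.zero_grad()
            loss = loss_f(model(inputs), targets)
            loss.backward()
            return loss

        loss = optimizer.step(closure)
    scheduler.step()             # halves the learning rate after each epoch

# Save and restore optimizer state, e.g. for checkpointing.
checkpoint = {
    "optimizer": optimizer.state_dict(),
    "scheduler": scheduler.state_dict(),
}
restored = SGD(model.parameters(), lr=1e-1)
restored.load_state_dict(checkpoint["optimizer"])

print("restored lr:", restored.param_groups[0]["lr"])
```

The restored optimizer picks up the scheduled learning rate from the checkpoint rather than the value passed to its constructor.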

Citation

@ARTICLE{2020arXiv201001412F,
    author = {{Foret}, Pierre and {Kleiner}, Ariel and {Mobahi}, Hossein and {Neyshabur}, Behnam},
    title = "{Sharpness-Aware Minimization for Efficiently Improving Generalization}",
    year = 2020,
    eid = {arXiv:2010.01412},
    eprint = {2010.01412},
}

@software{sampytorch,
    author = {Ryuichiro Hataya},
    title  = {sam.pytorch},
    url    = {https://github.com/moskomule/sam.pytorch},
    year   = {2020}
}

Owner

Ryuichiro Hataya
