Losslandscapetaxonomy - Taxonomizing local versus global structure in neural network loss landscapes

Last update: Dec 30, 2022

Taxonomizing local versus global structure in neural network loss landscapes

Introduction

This repository contains the code to reproduce the results of the paper Taxonomizing local versus global structure in neural network loss landscapes. The code has been tested on Python 3.8.12 with PyTorch 1.10.1 and CUDA 10.2.

(Caricature of different types of loss landscapes). Globally well-connected versus globally poorly-connected loss landscapes, and locally sharp versus locally flat loss landscapes. Globally well-connected loss landscapes can be interpreted in terms of a global “rugged convexity”. Globally well-connected and locally flat loss landscapes can be further divided into two sub-cases, based on the similarity of trained models.

(2D phase plot). Partitioning the 2D load-like versus temperature-like diagram into different phases of learning, varying batch size to change temperature and varying model width to change load. ResNet18 models are trained on CIFAR-10. All plots use the same set of axes.
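The grid behind such a phase plot can be sketched as follows. This is only an illustration, not the repository's code: the width and batch-size values and the `make_grid` helper are hypothetical, chosen to show how each cell of the diagram corresponds to one training configuration.

```python
from itertools import product

# Hypothetical sweep values; the repository's actual ranges may differ.
WIDTHS = [2, 4, 8, 16, 32, 64]         # load-like axis: model width
BATCH_SIZES = [16, 32, 64, 128, 256]   # temperature-like axis: batch size


def make_grid(widths, batch_sizes):
    """Return one training configuration per (width, batch size) cell."""
    return [
        {"width": w, "batch_size": b}
        for w, b in product(widths, batch_sizes)
    ]


grid = make_grid(WIDTHS, BATCH_SIZES)
print(len(grid), "configurations; first:", grid[0])
```

Each dictionary stands for one training run, so the number of runs grows as the product of the two axis lengths.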

Usage

First, follow the steps below to install the necessary packages.

conda create -n loss_landscape python=3.8
source activate loss_landscape
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt

Training

Then, use the following command to generate the training scripts.

cd workspace/src
python example_experiment.py --metrics train

The training script can be found in the folder bash_scripts/width_lr_decay.

We recommend using a job scheduler to execute the training script. For example, use the following command to generate an example Slurm script for training.

python example_experiment.py --metrics train --generate-slurm-scripts

Evaluating metrics and generating phase plots

Use the following command to generate the scripts for different generalization metrics.

python example_experiment.py --metrics curve CKA hessian dist loss_acc
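To give a sense of what one of these metrics measures, here is a minimal sketch of linear CKA, a common similarity measure between the representations of two trained models. It assumes NumPy is installed and uses random matrices as stand-ins for model outputs. The `linear_cka` function is written for illustration, and the repository's CKA implementation may differ (for example, in the kernel used or in which outputs are compared).

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two representation matrices.

    x: (n_samples, d1), y: (n_samples, d2); rows are the same inputs.
    """
    # Center each feature (column) so the similarity ignores mean offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)


# Random stand-ins for the outputs of two models on the same 100 inputs.
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(100, 16))
feats_b = rng.normal(size=(100, 16))
print("CKA(a, a) =", linear_cka(feats_a, feats_a))
print("CKA(a, b) =", linear_cka(feats_a, feats_b))
```

Identical representations give a value of 1 up to floating-point error, and the value is unchanged when either input is multiplied by a nonzero scalar.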

You can also use our prior results, which are compressed and stored in workspace/checkpoint/results.tar.gz. Decompress them using the command below.

cd workspace/checkpoint/
tar -xzvf results.tar.gz

After the generalization metrics are obtained, use the Jupyter notebook Load_temperature_plots.ipynb in workspace/src/visualization/ to visualize the results.
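As a rough sketch of the kind of data arrangement a phase plot needs, the snippet below pivots per-run records into a 2D grid indexed by batch size and width. The record layout, the `to_phase_grid` helper, and the numbers are all hypothetical placeholders. They are not the notebook's code or results from the paper.

```python
def to_phase_grid(records, metric):
    """Arrange per-run records into rows (batch size) x columns (width)."""
    widths = sorted({r["width"] for r in records})
    batch_sizes = sorted({r["batch_size"] for r in records})
    lookup = {(r["batch_size"], r["width"]): r[metric] for r in records}
    # Missing runs become None so gaps stay visible in the grid.
    grid = [[lookup.get((b, w)) for w in widths] for b in batch_sizes]
    return widths, batch_sizes, grid


# Placeholder records for illustration only; not real measurements.
records = [
    {"width": 4, "batch_size": 32, "value": 0.1},
    {"width": 8, "batch_size": 32, "value": 0.2},
    {"width": 4, "batch_size": 64, "value": 0.3},
]
widths, batch_sizes, grid = to_phase_grid(records, "value")
for b, row in zip(batch_sizes, grid):
    print(b, row)
```

Keeping missing cells as `None` rather than dropping them makes it easy to see which runs have not finished before plotting.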

Citation

If you find this repository useful for your work, please cite the following paper:

@inproceedings{yang2021taxonomizing,
  title={Taxonomizing local versus global structure in neural network loss landscapes},
  author={Yang, Yaoqing and Hodgkinson, Liam and Theisen, Ryan and Zou, Joe and Gonzalez, Joseph E and Ramchandran, Kannan and Mahoney, Michael W},
  booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
  year={2021}
}

License

MIT

Owner

Yaoqing Yang
