Code for visualizing the loss landscape of neural nets

Overview

Visualizing the Loss Landscape of Neural Nets

This repository contains the PyTorch code for the paper

Hao Li, Zheng Xu, Gavin Taylor, Christoph Studer and Tom Goldstein. Visualizing the Loss Landscape of Neural Nets. NIPS, 2018.

An interactive 3D visualizer for loss surfaces has been provided by telesens.

Given a network architecture and its pre-trained parameters, this tool calculates and visualizes the loss surface along random direction(s) near the optimal parameters. The calculation can be done in parallel with multiple GPUs per node, and multiple nodes. The random direction(s) and loss surface values are stored in HDF5 (.h5) files after they are produced.

Setup

Environment: One or more multi-GPU node(s) with the following software/libraries installed:

Pre-trained models: The code accepts pre-trained PyTorch models for the CIFAR-10 dataset. To load the pre-trained model correctly, the model file should contain state_dict, which is saved from the state_dict() method. The default path for pre-trained networks is cifar10/trained_nets. Some of the pre-trained models and plotted figures can be downloaded here:

Data preprocessing: The data pre-processing method used for visualization should be consistent with the one used for model training. No data augmentation (random cropping or horizontal flipping) is used in calculating the loss values.

Visualizing 1D loss curve

Creating 1D linear interpolations

The 1D linear interpolation method [1] evaluates the loss values along the direction between two minimizers of the same network loss function. This method has been used to compare the flatness of minimizers trained with different batch sizes [2]. A 1D linear interpolation plot is produced using the plot_surface.py method.

mpirun -n 4 python plot_surface.py --mpi --cuda --model vgg9 --x=-0.5:1.5:401 --dir_type states \
--model_file cifar10/trained_nets/vgg9_sgd_lr=0.1_bs=128_wd=0.0_save_epoch=1/model_300.t7 \
--model_file2 cifar10/trained_nets/vgg9_sgd_lr=0.1_bs=8192_wd=0.0_save_epoch=1/model_300.t7 --plot
  • --x=-0.5:1.5:401 sets the range and resolution for the plot. The x-coordinates in the plot will run from -0.5 to 1.5 (the minimizers are located at 0 and 1), and the loss value will be evaluated at 401 locations along this line.
  • --dir_type states indicates the direction contains dimensions for all parameters as well as the statistics of the BN layers (running_mean and running_var). Note that ignoring running_mean and running_var cannot produce correct loss values when plotting two solutions togeather in the same figure.
  • The two model files contain network parameters describing the two distinct minimizers of the loss function. The plot will interpolate between these two minima.

VGG-9 SGD, WD=0

Producing plots along random normalized directions

A random direction with the same dimension as the model parameters is created and "filter normalized." Then we can sample loss values along this direction.

mpirun -n 4 python plot_surface.py --mpi --cuda --model vgg9 --x=-1:1:51 \
--model_file cifar10/trained_nets/vgg9_sgd_lr=0.1_bs=128_wd=0.0_save_epoch=1/model_300.t7 \
--dir_type weights --xnorm filter --xignore biasbn --plot
  • --dir_type weights indicates the direction has the same dimensions as the learned parameters, including bias and parameters in the BN layers.
  • --xnorm filter normalizes the random direction at the filter level. Here, a "filter" refers to the parameters that produce a single feature map. For fully connected layers, a "filter" contains the weights that contribute to a single neuron.
  • --xignore biasbn ignores the direction corresponding to bias and BN parameters (fill the corresponding entries in the random vector with zeros).

VGG-9 SGD, WD=0

We can also customize the appearance of the 1D plots by calling plot_1D.py once the surface file is available.

Visualizing 2D loss contours

To plot the loss contours, we choose two random directions and normalize them in the same way as the 1D plotting.

mpirun -n 4 python plot_surface.py --mpi --cuda --model resnet56 --x=-1:1:51 --y=-1:1:51 \
--model_file cifar10/trained_nets/resnet56_sgd_lr=0.1_bs=128_wd=0.0005/model_300.t7 \
--dir_type weights --xnorm filter --xignore biasbn --ynorm filter --yignore biasbn  --plot

ResNet-56

Once a surface is generated and stored in a .h5 file, we can produce and customize a contour plot using the script plot_2D.py.

python plot_2D.py --surf_file path_to_surf_file --surf_name train_loss
  • --surf_name specifies the type of surface. The default choice is train_loss,
  • --vmin and --vmax sets the range of values to be plotted.
  • --vlevel sets the step of the contours.

Visualizing 3D loss surface

plot_2D.py can make a basic 3D loss surface plot with matplotlib. If you want a more detailed rendering that uses lighting to display details, you can render the loss surface with ParaView.

ResNet-56-noshort ResNet-56

To do this, you must

  1. Convert the surface .h5 file to a .vtp file.
python h52vtp.py --surf_file path_to_surf_file --surf_name train_loss --zmax  10 --log

This will generate a VTK file containing the loss surface with max value 10 in the log scale.

  1. Open the .vtp file with ParaView. In ParaView, open the .vtp file with the VTK reader. Click the eye icon in the Pipeline Browser to make the figure show up. You can drag the surface around, and change the colors in the Properties window.

  2. If the surface appears extremely skinny and needle-like, you may need to adjust the "transforming" parameters in the left control panel. Enter numbers larger than 1 in the "scale" fields to widen the plot.

  3. Select Save screenshot in the File menu to save the image.

Reference

[1] Ian J Goodfellow, Oriol Vinyals, and Andrew M Saxe. Qualitatively characterizing neural network optimization problems. ICLR, 2015.

[2] Nitish Shirish Keskar, Dheevatsa Mudigere, Jorge Nocedal, Mikhail Smelyanskiy, and Ping Tak Peter Tang. On large-batch training for deep learning: Generalization gap and sharp minima. ICLR, 2017.

Citation

If you find this code useful in your research, please cite:

@inproceedings{visualloss,
  title={Visualizing the Loss Landscape of Neural Nets},
  author={Li, Hao and Xu, Zheng and Taylor, Gavin and Studer, Christoph and Goldstein, Tom},
  booktitle={Neural Information Processing Systems},
  year={2018}
}
Owner
Tom Goldstein
Tom Goldstein
JittorVis - Visual understanding of deep learning model.

JittorVis - Visual understanding of deep learning model.

182 Jan 06, 2023
L2X - Code for replicating the experiments in the paper Learning to Explain: An Information-Theoretic Perspective on Model Interpretation.

L2X Code for replicating the experiments in the paper Learning to Explain: An Information-Theoretic Perspective on Model Interpretation at ICML 2018,

Jianbo Chen 113 Sep 06, 2022
A python library for decision tree visualization and model interpretation.

dtreeviz : Decision Tree Visualization Description A python library for decision tree visualization and model interpretation. Currently supports sciki

Terence Parr 2.4k Jan 02, 2023
Visualization toolkit for neural networks in PyTorch! Demo -->

FlashTorch A Python visualization toolkit, built with PyTorch, for neural networks in PyTorch. Neural networks are often described as "black box". The

Misa Ogura 692 Dec 29, 2022
Code for visualizing the loss landscape of neural nets

Visualizing the Loss Landscape of Neural Nets This repository contains the PyTorch code for the paper Hao Li, Zheng Xu, Gavin Taylor, Christoph Studer

Tom Goldstein 2.2k Dec 30, 2022
PyTorch implementation of DeepDream algorithm

neural-dream This is a PyTorch implementation of DeepDream. The code is based on neural-style-pt. Here we DeepDream a photograph of the Golden Gate Br

121 Nov 05, 2022
An intuitive library to add plotting functionality to scikit-learn objects.

Welcome to Scikit-plot Single line functions for detailed visualizations The quickest and easiest way to go from analysis... ...to this. Scikit-plot i

Reiichiro Nakano 2.3k Dec 31, 2022
pytorch implementation of "Distilling a Neural Network Into a Soft Decision Tree"

Soft-Decision-Tree Soft-Decision-Tree is the pytorch implementation of Distilling a Neural Network Into a Soft Decision Tree, paper recently published

Kim Heecheol 262 Dec 04, 2022
treeinterpreter - Interpreting scikit-learn's decision tree and random forest predictions.

TreeInterpreter Package for interpreting scikit-learn's decision tree and random forest predictions. Allows decomposing each prediction into bias and

Ando Saabas 720 Dec 22, 2022
👋🦊 Xplique is a Python toolkit dedicated to explainability, currently based on Tensorflow.

👋🦊 Xplique is a Python toolkit dedicated to explainability, currently based on Tensorflow.

DEEL 343 Jan 02, 2023
An Empirical Review of Optimization Techniques for Quantum Variational Circuits

QVC Optimizer Review Code for the paper "An Empirical Review of Optimization Techniques for Quantum Variational Circuits". Each of the python files ca

Owen Lockwood 5 Jun 28, 2022
Code for "High-Precision Model-Agnostic Explanations" paper

Anchor This repository has code for the paper High-Precision Model-Agnostic Explanations. An anchor explanation is a rule that sufficiently “anchors”

Marco Tulio Correia Ribeiro 735 Jan 05, 2023
Visual Computing Group (Ulm University) 99 Nov 30, 2022
Contrastive Explanation (Foil Trees), developed at TNO/Utrecht University

Contrastive Explanation (Foil Trees) Contrastive and counterfactual explanations for machine learning (ML) Marcel Robeer (2018-2020), TNO/Utrecht Univ

M.J. Robeer 41 Aug 29, 2022
Convolutional neural network visualization techniques implemented in PyTorch.

This repository contains a number of convolutional neural network visualization techniques implemented in PyTorch.

1 Nov 06, 2021
Many Class Activation Map methods implemented in Pytorch for CNNs and Vision Transformers. Including Grad-CAM, Grad-CAM++, Score-CAM, Ablation-CAM and XGrad-CAM

Class Activation Map methods implemented in Pytorch pip install grad-cam ⭐ Comprehensive collection of Pixel Attribution methods for Computer Vision.

Jacob Gildenblat 6.5k Jan 01, 2023
Visualize a molecule and its conformations in Jupyter notebooks/lab using py3dmol

Mol Viewer This is a simple package wrapping py3dmol for a single command visualization of a RDKit molecule and its conformations (embed as Conformer

Benoît BAILLIF 1 Feb 11, 2022
🎆 A visualization of the CapsNet layers to better understand how it works

CapsNet-Visualization For more information on capsule networks check out my Medium articles here and here. Setup Use pip to install the required pytho

Nick Bourdakos 387 Dec 06, 2022
FairML - is a python toolbox auditing the machine learning models for bias.

======== FairML: Auditing Black-Box Predictive Models FairML is a python toolbox auditing the machine learning models for bias. Description Predictive

Julius Adebayo 338 Nov 09, 2022
Logging MXNet data for visualization in TensorBoard.

Logging MXNet Data for Visualization in TensorBoard Overview MXBoard provides a set of APIs for logging MXNet data for visualization in TensorBoard. T

Amazon Web Services - Labs 327 Dec 05, 2022