Zero-Cost Proxies for Lightweight NAS

Last update: Dec 20, 2022

Zero-Cost-NAS

Companion code for the ICLR 2021 paper: Zero-Cost Proxies for Lightweight NAS
tl;dr A single minibatch of data is used to score neural networks for NAS instead of performing full training.

In this README, we provide:

Summary of our work
How to run the code

If you have any questions, please open an issue or email us. (last update: 02.02.2021)

Summary

Intro. To perform neural architecture search (NAS), deep neural networks (DNNs) are typically trained until a final validation accuracy is computed and used to compare DNNs to each other and select the best one. However, this is time-consuming because training takes multiple GPU-hours/days/weeks. This is why a proxy for final accuracy is often used to speed up NAS. Typically, this proxy is a reduced form of training (e.g. EcoNAS) where the number of epochs is reduced, a smaller model is used or the training data is subsampled.

Proxies. Instead, we propose a series of "zero-cost" proxies that use a single minibatch of data to score a DNN. These metrics are inspired by recent pruning-at-initialization literature, but are adapted to score an entire DNN and work within a NAS setting. When compared against EcoNAS (see the orange pentagon in Figure 1), our zero-cost metrics take ~1000X less time to run but are better correlated with final validation accuracy (especially synflow and jacob_cov), making them better (and much cheaper!) proxies for use within NAS. Even when EcoNAS is tuned specifically for NAS-Bench-201 (see the purple econas+ circle in Figure 1), our vote zero-cost proxy is still better correlated and is 3 orders of magnitude cheaper to compute.

Figure 1: Correlation of validation accuracy to final accuracy during the first 12 epochs of training (blue line) for CIFAR-10 on the NAS-Bench-201 search space. Zero-cost and EcoNAS proxies are also labeled for comparison.
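To make "better correlated with final validation accuracy" concrete, here is a minimal sketch of how a proxy could be compared against final accuracy over a set of architectures. It assumes a rank correlation (Spearman's rho) is the quantity of interest; this README does not name the exact statistic, and the numbers below are made up for illustration, not results from the paper.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Made-up numbers for illustration only (not results from the paper).
proxy_scores = [0.3, 2.1, 1.4, 5.0, 0.9]         # one proxy value per architecture
final_accuracy = [71.2, 88.0, 90.1, 93.5, 80.3]  # accuracy after full training
rho = spearman(proxy_scores, final_accuracy)
```

A proxy that ranks architectures in nearly the same order as full training gives a rho close to 1, which is what matters when the proxy is only used to choose between candidates.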

Zero-Cost NAS. We use the zero-cost metrics to enhance 4 existing NAS algorithms, and we test them on 3 different NAS benchmarks. In all cases, we achieve a new state-of-the-art (SOTA) result in terms of search speed. We incorporate zero-cost proxies in two ways: (1) warmup: use proxies to initialize NAS algorithms, (2) move proposal: use proxies to improve the selection of the next model for evaluation. As Figure 2 shows, there is a significant speedup for all evaluated NAS algorithms.

Figure 2: Zero-Cost warmup and move proposal consistently improve the speed and accuracy of 4 different NAS algorithms.
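The two integration modes can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the NAS tool used in the paper: the architecture encoding, `toy_proxy`, `toy_mutate`, and the pool sizes are all hypothetical stand-ins chosen so that the example runs without a benchmark or a GPU.

```python
import random
from typing import Callable, List, Sequence, Tuple

Arch = Tuple[int, ...]  # hypothetical encoding: one operation index per edge


def warmup_pool(candidates: Sequence[Arch],
                proxy: Callable[[Arch], float],
                pool_size: int) -> List[Arch]:
    """Warmup: rank many candidates by the cheap proxy and keep the best few
    as the initial population, before any training is done."""
    return sorted(candidates, key=proxy, reverse=True)[:pool_size]


def propose_move(parent: Arch,
                 mutate: Callable[[Arch], Arch],
                 proxy: Callable[[Arch], float],
                 num_mutations: int) -> Arch:
    """Move proposal: generate several mutations of the parent and return the
    one the proxy scores highest; only that child would then be trained."""
    children = [mutate(parent) for _ in range(num_mutations)]
    return max(children, key=proxy)


# Toy stand-ins (assumptions, not part of the repository).
NUM_EDGES, NUM_OPS = 6, 5
rng = random.Random(0)


def toy_proxy(arch: Arch) -> float:
    # Placeholder score; a real proxy would be computed from one minibatch.
    return float(sum(arch))


def toy_mutate(arch: Arch) -> Arch:
    # Re-sample the operation on one randomly chosen edge.
    edge = rng.randrange(len(arch))
    new_op = rng.randrange(NUM_OPS)
    return arch[:edge] + (new_op,) + arch[edge + 1:]


sampled = [tuple(rng.randrange(NUM_OPS) for _ in range(NUM_EDGES)) for _ in range(200)]
population = warmup_pool(sampled, toy_proxy, pool_size=10)
child = propose_move(population[0], toy_mutate, toy_proxy, num_mutations=8)
```

In both cases the expensive step (training a model) is spent only on candidates the proxy already favours. The notebook nas_examples.ipynb shows how this is done for aging evolution and random search.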

For more details, please take a look at our paper!

Running the Code

Install PyTorch for your system (v1.5.0 or later).
Install the package: pip install . (add -e for editable mode). Note that all dependencies other than PyTorch will be installed automatically.

API

The main function is find_measures, shown below. Given a neural net and some information about the input data (dataloader) and loss function (loss_fn), it returns an array of zero-cost proxy metrics.

def find_measures(net_orig,                  # neural network
                  dataloader,                # a data loader (typically for training data)
                  dataload_info,             # a tuple with (dataload_type = {random, grasp}, number_of_batches_for_random_or_images_per_class_for_grasp, number of classes)
                  device,                    # GPU/CPU device used
                  loss_fn=F.cross_entropy,   # loss function to use within the zero-cost metrics
                  measure_names=None,        # an array of measure names to compute, if left blank, all measures are computed by default
                  measures_arr=None):        # [not used] if the measures are already computed but need to be summarized, pass them here

The available zero-cost metrics are in the measures directory. You can add new metrics by following one of the examples and then registering the metric in the load_all function. More examples of how to use this function can be found in the code to reproduce results (below). You can also modify the data loading functions in p_utils.py.
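As a rough picture of what "registering a metric" means, the sketch below shows a generic name-to-function registry. Every name in it (`register_measure`, `compute_measures`, `toy_grad_norm`) is hypothetical and chosen for illustration; the package's actual mechanism lives in the measures directory and its load_all function, which should be treated as the reference.

```python
from typing import Callable, Dict, Optional, Sequence

# Hypothetical registry; the real package keeps its own in the measures directory.
_MEASURES: Dict[str, Callable[[Sequence[float]], float]] = {}


def register_measure(name: str):
    """Decorator that stores a scoring function under a unique name."""
    def decorator(fn):
        if name in _MEASURES:
            raise KeyError(f"measure {name!r} is already registered")
        _MEASURES[name] = fn
        return fn
    return decorator


@register_measure("toy_grad_norm")
def toy_grad_norm(gradients: Sequence[float]) -> float:
    # Euclidean norm of a flat list of gradient values (illustrative only).
    return sum(g * g for g in gradients) ** 0.5


def compute_measures(gradients: Sequence[float],
                     measure_names: Optional[Sequence[str]] = None) -> Dict[str, float]:
    """Compute the requested measures, or all registered ones if none are named."""
    names = list(_MEASURES) if measure_names is None else measure_names
    return {n: _MEASURES[n](gradients) for n in names}


scores = compute_measures([3.0, 4.0])
```

The default of computing every registered measure when no names are given mirrors the documented behaviour of the measure_names argument of find_measures.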

Reproducing Results

NAS-Bench-201

Download the NAS-Bench-201 dataset and put it in the data directory in the root folder of this project.
Run python nasbench2_pred.py with the appropriate command-line options. A pickle file is produced with the zero-cost metrics (see the notebooks folder for how to use the pickle file).
Note that you need to manually download ImageNet16 and put it in the _datasets/ImageNet16 directory in the root folder. CIFAR-10/100 will be downloaded automatically.
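If you want to inspect the resulting pickle file outside the notebooks, a reader along the following lines may help. It assumes the file holds a sequence of objects written by repeated pickle.dump calls; that layout is an assumption, since this README does not describe it, so check the notebooks folder for the authoritative way to load the results. The helper name `iter_pickled_records` is hypothetical.

```python
import pickle
from typing import Any, Iterator


def iter_pickled_records(path: str) -> Iterator[Any]:
    """Yield every object stored in a file written by repeated pickle.dump calls."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                # End of file reached: no more records to read.
                return
```

A file containing a single pickled object is handled as well: the generator yields that one object and then stops. Only unpickle files you produced yourself or otherwise trust, since loading a pickle can execute arbitrary code.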

NAS-Bench-101

Download the data directory and save it to the root folder of this repo. This contains pre-cached info from the NAS-Bench-101 repo.
[Optional] Download the NAS-Bench-101 dataset and put it in the data directory in the root folder of this project, then clone the NAS-Bench-101 repo and install the package.
Run python nasbench1_pred.py. Note that this takes a long time to go through ~400k architectures, but precomputed results are in the notebooks folder (with a link to the results).

PyTorchCV

Run python ptcv_pred.py

NAS-Bench-ASR

Coming soon...

NAS with Zero-Cost Proxies

For the full list of NAS algorithms in our paper, we used a different NAS tool which is not publicly released. However, we include a notebook, nas_examples.ipynb, that shows how to use zero-cost proxies to speed up aging evolution and random search using both warmup and move proposal.

Citation

@inproceedings{
  abdelfattah2021zerocost,
  title={{Zero-Cost Proxies for Lightweight NAS}},
  author={Mohamed S. Abdelfattah and Abhinav Mehrotra and {\L}ukasz Dudziak and Nicholas D. Lane},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2021}
}

Owner

SamsungLabs
