Zero-Cost Proxies for Lightweight NAS

Last update: Dec 20, 2022

Related tags

Overview

Zero-Cost-NAS

Companion code for the ICLR2021 paper: Zero-Cost Proxies for Lightweight NAS
tl;dr A single minibatch of data is used to score neural networks for NAS instead of performing full training.

In this README, we provide:

Summary of our work
How to run the code

If you have any questions, please open an issue or email us. (last update: 02.02.2021)

Summary

Intro. To perform neural architecture search (NAS), deep neural networks (DNNs) are typically trained until a final validation accuracy is computed and used to compare DNNs to each other and select the best one. However, this is time-consuming because training takes multiple GPU-hours/days/weeks. This is why a proxy for final accuracy is often used to speed up NAS. Typically, this proxy is a reduced form of training (e.g. EcoNAS) where the number of epochs is reduced, a smaller model is used or the training data is subsampled.

Proxies. Instead, we propose a series of "zero-cost" proxies that use a single-minibatch of data to score a DNN. These metrics are inspired by recent pruning-at-initialization literature, but are adapted to score an entire DNN and work within a NAS setting. When compared against econas (see orange pentagon in plot below), our zero-cost metrics take ~1000X less time to run but are better-correlated with final validation accuracy (especially synflow and jacob_cov), making them better (and much cheaper!) proxies for use within NAS. Even when EcoNAS is tuned specifically for NAS-Bench-201 (see econas+ purple circle in the plot), our vote zero-cost proxy is still better-correlated and is 3 orders of magnitude cheaper to compute.

Figure 1: Correlation of validation accuracy to final accuracy during the first 12 epochs of training (blue line) for three CIFAR-10 on the NAS-Bench-201 search space. Zero-cost and EcoNAS proxies are also labeled for comparison.

Zero-Cost NAS We use the zero-cost metrics to enhance 4 existing NAS algorithms, and we test it out on 3 different NAS benchmarks. For all cases, we achieve a new SOTA (state of the art result) in terms of search speed. We incorporate zero-cost proxies in two ways: (1) warmup: Use proxies to initialize NAS algorithms, (2) move proposal: Use proxies to improve the selection of the next model for evaluation. As Figure 2 shows, there is a significant speedup to all evaluated NAS algorithms.

Figure 2: Zero-Cost warmup and move proposal consistently improves speed and accuracy of 4 different NAS algorithms.

For more details, please take a look at our paper!

Running the Code

Install PyTorch for your system (v1.5.0 or later).
Install the package: pip install . (add -e for editable mode) -- note that all dependencies other than pytorch will be automatically installed.

API

The main function is find_measures below. Given a neural net and some information about the input data (dataloader) and loss function (loss_fn) it returns an array of zero-cost proxy metrics.

def find_measures(net_orig,                  # neural network
                  dataloader,                # a data loader (typically for training data)
                  dataload_info,             # a tuple with (dataload_type = {random, grasp}, number_of_batches_for_random_or_images_per_class_for_grasp, number of classes)
                  device,                    # GPU/CPU device used
                  loss_fn=F.cross_entropy,   # loss function to use within the zero-cost metrics
                  measure_names=None,        # an array of measure names to compute, if left blank, all measures are computed by default
                  measures_arr=None):        # [not used] if the measures are already computed but need to be summarized, pass them here

The available zero-cost metrics are in the measures directory. You can add new metrics by simply following one of the examples then registering the metric in the load_all function. More examples of how to use this function can be found in the code to reproduce results (below). You can also modify data loading functions in p_utils.py

Reproducing Results

NAS-Bench-201

Download the NAS-Bench-201 dataset and put in the data directory in the root folder of this project.
Run python nasbench2_pred.py with the appropriate cmd-line options -- a pickle file is produced with zero-cost metrics (see notebooks folder on how to use the pickle file.
Note that you need to manually download ImageNet16 and put in _datasets/ImageNet16 directory in the root folder. CIFAR-10/100 will be automatically downloaded.

NAS-Bench-101

Download the data directory and save it to the root folder of this repo. This contains pre-cached info from the NAS-Bench-101 repo.
[Optional] Download the NAS-Bench-101 dataset and put in the data directory in the root folder of this project and also clone the NAS-Bench-101 repo and install the package.
Run python nasbench1_pred.py. Note that this takes a long time to go through ~400k architectures, but precomputed results are in the notebooks folder (with a link to the results).

PyTorchCV

Run python ptcv_pred.py

NAS-Bench-ASR

Coming soon...

NAS with Zero-Cost Proxies

For the full list of NAS algorithms in our paper, we used a different NAS tool which is not publicly released. However, we included a notebook nas_examples.ipynb to show how to use zero-cost proxies to speed up aging evolution and random search methods using both warmup and move proposal.

Citation

@inproceedings{
  abdelfattah2021zerocost,
  title={{Zero-Cost Proxies for Lightweight NAS}},
  author={Mohamed S. Abdelfattah and Abhinav Mehrotra and {\L}ukasz Dudziak and Nicholas D. Lane},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2021}
}

Zero-Cost Proxies for Lightweight NAS

Related tags

Overview

Zero-Cost-NAS

Summary

Running the Code

API

Reproducing Results

NAS-Bench-201

NAS-Bench-101

PyTorchCV

NAS-Bench-ASR

NAS with Zero-Cost Proxies

Citation

Owner

SamsungLabs

State-to-Distribution (STD) Model

For medical image segmentation

Unofficial JAX implementations of Deep Learning models

Full Stack Deep Learning Labs

small collection of functions for neural networks

This is a Keras implementation of a CNN for estimating age, gender and mask from a camera.

✅ How Robust are Fact Checking Systems on Colloquial Claims?. In NAACL-HLT, 2021.

More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

The description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts.

1st ranked 'driver careless behavior detection' for AI Online Competition 2021, hosted by MSIT Korea.

DetCo: Unsupervised Contrastive Learning for Object Detection

Summary Explorer is a tool to visually explore the state-of-the-art in text summarization.

Automatic tool focused on deriving metallicities of open clusters

A PaddlePaddle implementation of Time Interval Aware Self-Attentive Sequential Recommendation.

A 3D sparse LBM solver implemented using Taichi

source code of Adversarial Feedback Loop Paper

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data

This repository contains a PyTorch implementation of the paper Learning to Assimilate in Chaotic Dynamical Systems.

[NeurIPS 2021] Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training

The 1st place solution of track2 (Vehicle Re-Identification) in the NVIDIA AI City Challenge at CVPR 2021 Workshop.