AlphaNet Improved Training of Supernet with Alpha-Divergence

Related tags

Deep LearningAlphaNet
Overview

AlphaNet: Improved Training of Supernet with Alpha-Divergence

This repository contains our PyTorch training code, evaluation code and pretrained models for AlphaNet.

PWC

Our implementation is largely based on AttentiveNAS. To reproduce our results, please first download the AttentiveNAS repo, and use our train_alphanet.py for training and test_alphanet.py for testing.

For more details, please see AlphaNet: Improved Training of Supernet with Alpha-Divergence by Dilin Wang, Chengyue Gong, Meng Li, Qiang Liu, Vikas Chandra.

If you find this repo useful in your research, please consider citing our work and AttentiveNAS:

@article{wang2021alphanet,
  title={AlphaNet: Improved Training of Supernet with Alpha-Divergence},
  author={Wang, Dilin and Gong, Chengyue and Li, Meng and Liu, Qiang and Chandra, Vikas},
  journal={arXiv preprint arXiv:2102.07954},
  year={2021}
}

@article{wang2020attentivenas,
  title={AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling},
  author={Wang, Dilin and Li, Meng and Gong, Chengyue and Chandra, Vikas},
  journal={arXiv preprint arXiv:2011.09011},
  year={2020}
}

Evaluation

To reproduce our results:

  • Please first download our pretrained AlphaNet models from a Google Drive path and put the pretrained models under your local folder ./alphanet_data

  • To evaluate our pre-trained AlphaNet models, from AlphaNet-A0 to A6, on ImageNet with a single GPU, please run:

    python test_alphanet.py --config-file ./configs/eval_alphanet_models.yml --model a[0-6]

    Expected results:

    Name MFLOPs Top-1 (%)
    AlphaNet-A0 203 77.87
    AlphaNet-A1 279 78.94
    AlphaNet-A2 317 79.20
    AlphaNet-A3 357 79.41
    AlphaNet-A4 444 80.01
    AlphaNet-A5 (small) 491 80.29
    AlphaNet-A5 (base) 596 80.62
    AlphaNet-A6 709 80.78
  • Additionally, here is our pretrained supernet with KL based inplace-KD and here is our pretrained supernet without inplace-KD.

Training

To train our AlphaNet models from scratch, please run:

python train_alphanet.py --config-file configs/train_alphanet_models.yml --machine-rank ${machine_rank} --num-machines ${num_machines} --dist-url ${dist_url}

We adopt SGD training on 64 GPUs. The mini-batch size is 32 per GPU; all training hyper-parameters are specified in train_alphanet_models.yml.

Evolutionary search

In case you want to search the set of models of your own interest - we provide an example to show how to search the Pareto models for the best FLOPs vs. accuracy tradeoffs in parallel_supernet_evo_search.py; to run this example:

python parallel_supernet_evo_search.py --config-file configs/parallel_supernet_evo_search.yml 

License

AlphaNet is licensed under CC-BY-NC.

Contributing

We actively welcome your pull requests! Please see CONTRIBUTING and CODE_OF_CONDUCT for more info.

Owner
Facebook Research
Facebook Research
Python implementation of "Multi-Instance Pose Networks: Rethinking Top-Down Pose Estimation"

MIPNet: Multi-Instance Pose Networks This repository is the official pytorch python implementation of "Multi-Instance Pose Networks: Rethinking Top-Do

Rawal Khirodkar 57 Dec 12, 2022
TransferNet: Learning Transferrable Knowledge for Semantic Segmentation with Deep Convolutional Neural Network

TransferNet: Learning Transferrable Knowledge for Semantic Segmentation with Deep Convolutional Neural Network Created by Seunghoon Hong, Junhyuk Oh,

42 Jun 29, 2022
DAN: Unfolding the Alternating Optimization for Blind Super Resolution

DAN-Basd-on-Openmmlab DAN: Unfolding the Alternating Optimization for Blind Super Resolution We reproduce DAN via mmediting based on open-sourced code

AlexZou 72 Dec 13, 2022
Txt2Xml tool will help you convert from txt COCO format to VOC xml format in Object Detection Problem.

TXT 2 XML All codes assume running from root directory. Please update the sys path at the beginning of the codes before running. Over View Txt2Xml too

Nguyễn Trường Lâu 4 Nov 24, 2022
Implementation of the GBST block from the Charformer paper, in Pytorch

Charformer - Pytorch Implementation of the GBST (gradient-based subword tokenization) module from the Charformer paper, in Pytorch. The paper proposes

Phil Wang 105 Dec 26, 2022
A sequence of Jupyter notebooks featuring the 12 Steps to Navier-Stokes

CFD Python Please cite as: Barba, Lorena A., and Forsyth, Gilbert F. (2018). CFD Python: the 12 steps to Navier-Stokes equations. Journal of Open Sour

Barba group 2.6k Dec 30, 2022
PyTorch Implement of Context Encoders: Feature Learning by Inpainting

Context Encoders: Feature Learning by Inpainting This is the Pytorch implement of CVPR 2016 paper on Context Encoders 1) Semantic Inpainting Demo Inst

321 Dec 25, 2022
EdiBERT is a generative model based on a bi-directional transformer, suited for image manipulation

EdiBERT, a generative model for image editing EdiBERT is a generative model based on a bi-directional transformer, suited for image manipulation. The

16 Dec 07, 2022
An intelligent, flexible grammar of machine learning.

An english representation of machine learning. Modify what you want, let us handle the rest. Overview Nylon is a python library that lets you customiz

Palash Shah 79 Dec 02, 2022
NAS-HPO-Bench-II is the first benchmark dataset for joint optimization of CNN and training HPs.

NAS-HPO-Bench-II API Overview NAS-HPO-Bench-II is the first benchmark dataset for joint optimization of CNN and training HPs. It helps a fair and low-

yoichi hirose 8 Nov 21, 2022
Continual World is a benchmark for continual reinforcement learning

Continual World Continual World is a benchmark for continual reinforcement learning. It contains realistic robotic tasks which come from MetaWorld. Th

41 Dec 24, 2022
Count the MACs / FLOPs of your PyTorch model.

THOP: PyTorch-OpCounter How to install pip install thop (now continously intergrated on Github actions) OR pip install --upgrade git+https://github.co

Ligeng Zhu 3.9k Dec 29, 2022
Minimal fastai code needed for working with pytorch

fastai_minima A mimal version of fastai with the barebones needed to work with Pytorch #all_slow Install pip install fastai_minima How to use This lib

Zachary Mueller 14 Oct 21, 2022
GeDML is an easy-to-use generalized deep metric learning library

GeDML is an easy-to-use generalized deep metric learning library

Borui Zhang 32 Dec 05, 2022
Generic U-Net Tensorflow implementation for image segmentation

Tensorflow Unet Warning This project is discontinued in favour of a Tensorflow 2 compatible reimplementation of this project found under https://githu

Joel Akeret 1.8k Dec 10, 2022
Efficient Lottery Ticket Finding: Less Data is More

The lottery ticket hypothesis (LTH) reveals the existence of winning tickets (sparse but critical subnetworks) for dense networks, that can be trained in isolation from random initialization to match

VITA 20 Sep 04, 2022
This repo provides the source code & data of our paper "GreaseLM: Graph REASoning Enhanced Language Models"

GreaseLM: Graph REASoning Enhanced Language Models This repo provides the source code & data of our paper "GreaseLM: Graph REASoning Enhanced Language

137 Jan 02, 2023
Norm-based Analysis of Transformer

Norm-based Analysis of Transformer Implementations for 2 papers introducing to analyze Transformers using vector norms: Kobayashi+'20 Attention is Not

Goro Kobayashi 52 Dec 05, 2022
Optimus: the first large-scale pre-trained VAE language model

Optimus: the first pre-trained Big VAE language model This repository contains source code necessary to reproduce the results presented in the EMNLP 2

314 Dec 19, 2022
Highway networks implemented in PyTorch.

PyTorch Highway Networks Highway networks implemented in PyTorch. Just the MNIST example from PyTorch hacked to work with Highway layers. Todo Make th

Conner Vercellino 56 Dec 14, 2022