Block-wisely Supervised Neural Architecture Search with Knowledge Distillation (CVPR 2020)

Overview

DNA

This repository provides the code of our paper: Blockwisely Supervised Neural Architecture Search with Knowledge Distillation.

Illustration of DNA. Each cell of the supernet is trained independently to mimic the behavior of the corresponding teacher block.

Comparison of model ranking for DNA vs. DARTS, SPOS and MnasNet under two different hyper-parameters.

Our Trained Models

Usage

1. Requirements

2. Searching

The code for supernet training, evaluation and searching is in the searching directory.

  • cd searching

i) Train & evaluate the block-wise supernet with knowledge distillation

  • Modify datadir in initialize/data.yaml to your ImageNet path.
  • Modify nproc_per_node in dist_train.sh to match your number of GPUs. The default batch size is 64 for 8 GPUs; you can change the batch size and learning rate in initialize/train_pipeline.yaml.
  • By default, the supernet is trained sequentially from stage 1 to stage 6 and evaluated after each stage. This takes about 2 days on 8 GPUs with EfficientNet-B7 as the teacher. Resuming from checkpoints is supported. You can also change start_stage in initialize/train_pipeline.yaml to force training to start from an intermediate stage without loading a checkpoint.
  • sh dist_train.sh
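
The stage-by-stage schedule above can be sketched in a few lines of plain Python. This is a toy illustration, not the repository's training code: each block is a hypothetical scalar weight, and train_stage and train_blockwise are made-up names. The ideas taken from the text are that stages run in order, that each student block is fitted to its teacher block, and that start_stage skips earlier stages. Feeding each student block the teacher's previous-stage features is an assumption about how the blocks are kept independent.

```python
import random

# Hypothetical stand-ins: each "block" is a scalar linear map y = w * x.
TEACHER_WEIGHTS = [0.5, 2.0, -1.5, 0.8, 1.2, -0.7]  # six teacher stages (toy values)


def train_stage(teacher_w, inputs, steps=500, lr=0.5):
    """Fit one student block to one teacher block with an MSE loss."""
    w = 0.0  # student weight, trained from scratch
    for _ in range(steps):
        # Gradient of mean((w*x - teacher_w*x)^2) with respect to w.
        grad = sum(2 * (w * x - teacher_w * x) * x for x in inputs) / len(inputs)
        w -= lr * grad
    loss = sum((w * x - teacher_w * x) ** 2 for x in inputs) / len(inputs)
    return w, loss


def train_blockwise(start_stage=1, seed=0):
    """Train stages in order; return {stage: (student_weight, final_loss)}."""
    rng = random.Random(seed)
    feats = [rng.uniform(-1.0, 1.0) for _ in range(32)]  # toy input batch
    results = {}
    for stage, teacher_w in enumerate(TEACHER_WEIGHTS, start=1):
        if stage >= start_stage:
            # The student block only sees teacher features from the previous
            # stage, so it never depends on any other student block.
            results[stage] = train_stage(teacher_w, feats)
        # Advance the teacher features so they can feed the next stage.
        feats = [teacher_w * x for x in feats]
    return results


if __name__ == "__main__":
    for stage, (w, loss) in train_blockwise().items():
        print(f"stage {stage}: student weight {w:.4f}, distillation loss {loss:.2e}")
```

Because no stage reads another student block's output, calling train_blockwise(start_stage=4) trains stages 4 to 6 without any result from stages 1 to 3, which mirrors starting from an intermediate stage without a checkpoint.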

ii) Search for the best architecture under a constraint

Our traversal search can handle a search space with 6 ops in each layer, 6 layers in each stage, and 6 stages in total. A search of this size should finish in about half an hour on a single CPU. To search over a larger space, you can manually divide the search space or apply another search algorithm, such as an evolutionary algorithm, to the evaluated architecture potential files.
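
To get a feel for the sizes involved, the candidates can be counted directly. The sketch assumes every layer independently picks one of the 6 ops and every stage has exactly 6 layers, which may be a simplification of the real search space. search_space_sizes is a hypothetical helper, not part of the repository.

```python
def search_space_sizes(ops_per_layer=6, layers_per_stage=6, num_stages=6):
    """Count candidates under the simplifying assumption of fixed depth."""
    per_stage = ops_per_layer ** layers_per_stage  # distinct blocks in one stage
    blockwise = per_stage * num_stages             # blocks summed over all stages
    full = per_stage ** num_stages                 # distinct complete networks
    return per_stage, blockwise, full


if __name__ == "__main__":
    per_stage, blockwise, full = search_space_sizes()
    print(per_stage, blockwise, f"{full:.2e}")
```

With the defaults this gives 46,656 candidate blocks per stage and 279,936 blocks across all six stages, against roughly 1.03e28 complete networks.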

  • Copy the path of the architecture potential files generated in step i) to potential_yaml in process_potential.py, then modify the constraint in the same file.
  • python process_potential.py
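
A minimal sketch of a constrained search over per-stage results is shown here. It is not the logic of process_potential.py: the function name search_best, the (name, loss, cost) tuple layout, and all numbers are hypothetical. It assumes each stage contributes a distillation loss and a cost (for example parameters or FLOPs), that both add up across stages, and that the goal is the lowest total loss within a cost budget.

```python
from itertools import product


def search_best(stage_candidates, max_cost):
    """Pick one candidate per stage, minimizing summed loss under a cost cap.

    stage_candidates: one list per stage of (name, loss, cost) tuples.
    Returns (total_loss, total_cost, [names]) or None if nothing fits.
    """
    best = None
    for combo in product(*stage_candidates):
        cost = sum(c for _, _, c in combo)
        if cost > max_cost:
            continue  # violates the constraint
        loss = sum(l for _, l, _ in combo)
        if best is None or loss < best[0]:
            best = (loss, cost, [n for n, _, _ in combo])
    return best


# Toy values only: three stages with two candidates each.
TOY_CANDIDATES = [
    [("s1_small", 0.30, 10), ("s1_large", 0.20, 20)],
    [("s2_small", 0.50, 15), ("s2_large", 0.32, 30)],
    [("s3_small", 0.40, 10), ("s3_large", 0.26, 25)],
]

if __name__ == "__main__":
    print(search_best(TOY_CANDIDATES, max_cost=50))
```

Enumerating the full product only works for small candidate lists. For larger spaces, pruning each stage's list first or switching to another search algorithm over the same per-stage values are possible alternatives.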

3. Retraining

The retraining code is simplified from the pytorch-image-models repo and is in the retraining directory.

  • cd retraining

  • Retrain our models or your searched models

    • Modify run_example.sh: change the data path and hyperparameters according to your requirements.
    • Add your searched model architecture to model.py. You can also use our searched and predefined DNA models.
    • sh run_example.sh
  • You can evaluate our models with the following command:
    python validate.py PATH/TO/ImageNet/validation --model DNA_a --checkpoint PATH/TO/model.pth.tar

    • PATH/TO/ImageNet/validation should be replaced by your validation data path.
    • --model : DNA_a can be replaced by DNA_b, DNA_c, DNA_d for our different models.
    • --checkpoint : Set this to the path of your downloaded checkpoint.
Owner
Changlin Li