[NeurIPS-2021] Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data

Related tags

Deep LearningMosaicKD
Overview

MosaicKD

Code for NeurIPS-21 paper "Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data"

1. Motivation

Natural images share common local patterns. In MosaicKD, these local patterns are first dissembled from OOD data and then assembled to synthesize in-domain data, making OOD-KD feasible.

2. Method

MosaicKD establishes a four-player minimax game between a generator G, a patch discriminator D, a teacher model T and a student model S. The generator, as those in prior GANs, takes as input a random noise vector and learns to mosaic synthetic in-domain samples with locally-authentic and globally-legitimate distributions, under the supervisions back-propagated from the other three players.

3. Reproducing our results

3.1 Prepare teachers

Please download our pre-trained models from Dropbox (266 M) and extract them as "checkpoints/pretrained/*.pth". You can also train your own models as follows:

python train_scratch.py --lr 0.1 --batch-size 256 --model wrn40_2 --dataset cifar100

3.2 OOD-KD: CIFAR-100 (ID) + CIFAR10 (OOD)

  • Vanilla KD (Blind KD)

    python kd_vanilla.py --lr 0.1 --batch-size 128 --teacher wrn40_2 --student wrn16_1 --dataset cifar100 --unlabeled cifar10 --epoch 200 --gpu 0 
  • Data-Free KD (DFQAD)

    python kd_datafree.py --lr 0.1 --batch-size 256 --teacher wrn40_2 --student wrn16_1 --dataset cifar100 --unlabeled cifar10 --epoch 200 --lr 0.1 --local 1 --align 1 --adv 1 --balance 10 --gpu 0
  • MosaicKD (This work)

    python kd_mosaic.py --lr 0.1 --batch-size 256 --teacher wrn40_2 --student wrn16_1 --dataset cifar100 --unlabeled cifar10 --epoch 200 --lr 0.1 --local 1 --align 1 --adv 1 --balance 10 --gpu 0

3.3 OOD-KD: CIFAR-100 (ID) + ImageNet/Places365 OOD Subset (OOD)

  • Prepare 32x32 datasets
    Please prepare the 32x32 ImageNet following the instructions from https://patrykchrabaszcz.github.io/Imagenet32/ and extract them as "data/ImageNet_32x32/train" and "data/ImageNet_32x32/val". You can prepare Places365 in the same way.

  • MosaicKD on OOD subset
    As ImageNet & Places365 contain a large number of in-domain samples, we construct OOD subset for training. Please run the scripts with ''--ood_subset'' to enable subset selection.

    python kd_mosaic.py --lr 0.1 --batch-size 256 --teacher wrn40_2 --student wrn16_1 --dataset cifar100 --unlabeled cifar10 --epoch 200 --lr 0.1 --local 1 --align 1 --adv 1 --balance 10 --ood_subset --gpu 0

4. Visualization of synthetic data

5. Citation

If you found this work useful for your research, please cite our paper:

@article{fang2021mosaicking,
  title={Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data},
  author={Gongfan Fang and Yifan Bao and Jie Song and Xinchao Wang and Donglin Xie and Chengchao Shen and Mingli Song},
  journal={arXiv preprint arXiv:2110.15094},
  year={2021}
}
Owner
ZJU-VIPA
Laboratory of Visual Intelligence and Pattern Analysis
ZJU-VIPA
Code for "Reconstructing 3D Human Pose by Watching Humans in the Mirror", CVPR 2021 oral

Reconstructing 3D Human Pose by Watching Humans in the Mirror Qi Fang*, Qing Shuai*, Junting Dong, Hujun Bao, Xiaowei Zhou CVPR 2021 Oral The videos a

ZJU3DV 178 Dec 13, 2022
StackRec: Efficient Training of Very Deep Sequential Recommender Models by Iterative Stacking

StackRec: Efficient Training of Very Deep Sequential Recommender Models by Iterative Stacking Datasets You can download datasets that have been pre-pr

25 May 29, 2022
Milano is a tool for automating hyper-parameters search for your models on a backend of your choice.

Milano (This is a research project, not an official NVIDIA product.) Documentation https://nvidia.github.io/Milano Milano (Machine learning autotuner

NVIDIA Corporation 147 Dec 17, 2022
ROS-UGV-Control-Interface - Control interface which can be used in any UGV

ROS-UGV-Control-Interface Cam Closed: Cam Opened:

Ahmet Fatih Akcan 1 Nov 04, 2022
Rule Based Classification Project For Python

Rule-Based-Classification-Project (ENG) Business Problem: A game company wants to create new level-based customer definitions (personas) by using some

Deniz Can OĞUZ 4 Oct 29, 2022
Training Very Deep Neural Networks Without Skip-Connections

DiracNets v2 update (January 2018): The code was updated for DiracNets-v2 in which we removed NCReLU by adding per-channel a and b multipliers without

Sergey Zagoruyko 585 Oct 12, 2022
Classification models 1D Zoo - Keras and TF.Keras

Classification models 1D Zoo - Keras and TF.Keras This repository contains 1D variants of popular CNN models for classification like ResNets, DenseNet

Roman Solovyev 12 Jan 06, 2023
A trashy useless Latin programming language written in python.

Codigum! The first programming langage in latin! (please keep your eyes closed when if you read the source code) It is pretty useless though. Document

Bic 2 Oct 25, 2021
Methods to get the probability of a changepoint in a time series.

Bayesian Changepoint Detection Methods to get the probability of a changepoint in a time series. Both online and offline methods are available. Read t

Johannes Kulick 554 Dec 30, 2022
Randomizes the warps in a stock pokeemerald repo.

pokeemerald warp randomizer Randomizes the warps in a stock pokeemerald repo. Usage Instructions Install networkx and matplotlib via pip3 or similar.

Max Thomas 6 Mar 17, 2022
Outlier Exposure with Confidence Control for Out-of-Distribution Detection

OOD-detection-using-OECC This repository contains the essential code for the paper Outlier Exposure with Confidence Control for Out-of-Distribution De

Nazim Shaikh 64 Nov 02, 2022
Code for IntraQ, PyTorch implementation of our paper under review

IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization paper Requirements Python = 3.7.10 Pytorch == 1.7

1 Nov 19, 2021
Breast Cancer Detection 🔬 ITI "AI_Pro" Graduation Project

BreastCancerDetection - This program is designed to predict two severity of abnormalities associated with breast cancer cells: benign and malignant. Mammograms from MIAS is preprocessed and features

6 Nov 29, 2022
ZSL-KG is a general-purpose zero-shot learning framework with a novel transformer graph convolutional network (TrGCN) to learn class representation from common sense knowledge graphs.

ZSL-KG is a general-purpose zero-shot learning framework with a novel transformer graph convolutional network (TrGCN) to learn class representa

Bats Research 94 Nov 21, 2022
CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

[ICCV2021] TransReID: Transformer-based Object Re-Identification [pdf] The official repository for TransReID: Transformer-based Object Re-Identificati

DamoCV 569 Dec 30, 2022
Automatic Attendance marker for LMS Practice School Division, BITS Pilani

LMS Attendance Marker Automatic script for lazy people to mark attendance on LMS for Practice School 1. Setup Add your LMS credentials and time slot t

Nihar Bansal 3 Jun 12, 2021
Implementation of OpenAI paper with Simple Noise Scale on Fastai V2

README Implementation of OpenAI paper "An Empirical Model of Large-Batch Training" for Fastai V2. The code is based on the batch size finder implement

13 Dec 10, 2021
Analysing poker data from home games with friends

Poker Game Analysis Analysing poker data from home games with friends. Not a lot of data is collected, so this project is primarily focussed on descri

Stavros Karmaniolos 1 Oct 15, 2022
scAR (single-cell Ambient Remover) is a package for data denoising in single-cell omics.

scAR scAR (single cell Ambient Remover) is a package for denoising multiple single cell omics data. It can be used for multiple tasks, such as, sgRNA

19 Nov 28, 2022
Reviving Iterative Training with Mask Guidance for Interactive Segmentation

This repository provides the source code for training and testing state-of-the-art click-based interactive segmentation models with the official PyTorch implementation

Visual Understanding Lab @ Samsung AI Center Moscow 406 Jan 01, 2023