[ICLR 2021 Spotlight Oral] "Undistillable: Making A Nasty Teacher That CANNOT teach students", Haoyu Ma, Tianlong Chen, Ting-Kuei Hu, Chenyu You, Xiaohui Xie, Zhangyang Wang

Overview

Undistillable: Making A Nasty Teacher That CANNOT teach students

License: MIT

"Undistillable: Making A Nasty Teacher That CANNOT teach students"

Haoyu Ma, Tianlong Chen, Ting-Kuei Hu, Chenyu You, Xiaohui Xie, Zhangyang Wang
In ICLR 2021 Spotlight Oral

Overview

  • We propose the concept of Nasty Teacher, a defensive approach to prevent knowledge leaking and unauthorized model cloning through KD without sacrificing performance.
  • We propose a simple yet efficient algorithm, called self-undermining knowledge distillation, to directly build a nasty teacher through self-training, requiring no additional dataset nor auxiliary network.

Prerequisite

We use Pytorch 1.4.0, and CUDA 10.1. You can install them with

conda install pytorch=1.4.0 torchvision=0.5.0 cudatoolkit=10.1 -c pytorch

It should also be applicable to other Pytorch and CUDA versions.

Then install other packages by

pip install -r requirements.txt

Usage

Teacher networks

Step 1: Train a normal teacher network
python train_scratch.py --save_path [XXX]

Here, [XXX] specifies the directory of params.json, which contains all hyperparameters to train a network. We already include all hyperparameters in experiments to reproduce the results in our paper.

For example, normally train a ResNet18 on CIFAR-10

python train_scratch.py --save_path experiments/CIFAR10/baseline/resnet18

After finishing training, you will get training.log, best_model.tar in that directory.

The normal teacher network will serve as the adversarial network for the training of the nasty teacher.

Step 2: Train a nasty teacher network
python train_nasty.py --save_path [XXX]

Again, [XXX] specifies the directory of params.json, which contains the information of adversarial networks and hyperparameters for training.
You need to specify the architecture of adversarial network and its checkpoint in this file.

For example, train a nasty ResNet18

python train_nasty.py --save_path experiments/CIFAR10/kd_nasty_resnet18/nasty_resnet18

Knowledge Distillation for Student networks

You can train a student distilling from normal or nasty teachers by

python train_kd.py --save_path [XXX]

Again, [XXX] specifies the directory of params.json, which contains the information of student networks and teacher networks

For example,

  • train a plain CNN distilling from a nasty ResNet18
python train_kd.py --save_path experiments/CIFAR10/kd_nasty_resnet18/cnn
  • Train a plain CNN distilling from a normal ResNet18
python train_kd.py --save_path experiments/CIFAR10/kd_normal_resnet18/cnn

Citation

@inproceedings{
ma2021undistillable,
title={Undistillable: Making A Nasty Teacher That {\{}CANNOT{\}} teach students},
author={Haoyu Ma and Tianlong Chen and Ting-Kuei Hu and Chenyu You and Xiaohui Xie and Zhangyang Wang},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=0zvfm-nZqQs}
}

Acknowledgement

Owner
VITA
Visual Informatics Group @ University of Texas at Austin
VITA
Coarse implement of the paper "A Simultaneous Denoising and Dereverberation Framework with Target Decoupling", On DNS-2020 dataset, the DNSMOS of first stage is 3.42 and second stage is 3.47.

SDDNet Coarse implement of the paper "A Simultaneous Denoising and Dereverberation Framework with Target Decoupling", On DNS-2020 dataset, the DNSMOS

Cyril Lv 43 Nov 21, 2022
A Pytorch implementation of the multi agent deep deterministic policy gradients (MADDPG) algorithm

Multi-Agent-Deep-Deterministic-Policy-Gradients A Pytorch implementation of the multi agent deep deterministic policy gradients(MADDPG) algorithm This

Phil Tabor 159 Dec 28, 2022
Source code and notebooks to reproduce experiments and benchmarks on Bias Faces in the Wild (BFW).

Face Recognition: Too Bias, or Not Too Bias? Robinson, Joseph P., Gennady Livitz, Yann Henon, Can Qin, Yun Fu, and Samson Timoner. "Face recognition:

Joseph P. Robinson 41 Dec 12, 2022
Image Super-Resolution by Neural Texture Transfer

SRNTT: Image Super-Resolution by Neural Texture Transfer Tensorflow implementation of the paper Image Super-Resolution by Neural Texture Transfer acce

Zhifei Zhang 413 Nov 30, 2022
Implementation of [Time in a Box: Advancing Knowledge Graph Completion with Temporal Scopes].

Time2box Implementation of [Time in a Box: Advancing Knowledge Graph Completion with Temporal Scopes].

LingCai 4 Aug 23, 2022
PyTorch implementation of Glow

glow-pytorch PyTorch implementation of Glow, Generative Flow with Invertible 1x1 Convolutions (https://arxiv.org/abs/1807.03039) Usage: python train.p

Kim Seonghyeon 433 Dec 27, 2022
A quick recipe to learn all about Transformers

Transformers have accelerated the development of new techniques and models for natural language processing (NLP) tasks.

DAIR.AI 772 Dec 31, 2022
PyTorch implementation of the paper Dynamic Data Augmentation with Gating Networks

Dynamic Data Augmentation with Gating Networks This is an official PyTorch implementation of the paper Dynamic Data Augmentation with Gating Networks

九州大学 ヒューマンインタフェース研究室 3 Oct 26, 2022
[ICCV 2021 Oral] NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo

NerfingMVS Project Page | Paper | Video | Data NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo Yi Wei, Shaohui

Yi Wei 369 Dec 24, 2022
Volumetric parameterization of the placenta to a flattened template

placenta-flattening A MATLAB algorithm for volumetric mesh parameterization. Developed for mapping a placenta segmentation derived from an MRI image t

Mazdak Abulnaga 12 Mar 14, 2022
Link prediction using Multiple Order Local Information (MOLI)

Understanding the network formation pattern for better link prediction Authors: [e

Wu Lab 0 Oct 18, 2021
PyTorch implementation of UPFlow (unsupervised optical flow learning)

UPFlow: Upsampling Pyramid for Unsupervised Optical Flow Learning By Kunming Luo, Chuan Wang, Shuaicheng Liu, Haoqiang Fan, Jue Wang, Jian Sun Megvii

kunming luo 87 Dec 20, 2022
Bulk2Space is a spatial deconvolution method based on deep learning frameworks

Bulk2Space Spatially resolved single-cell deconvolution of bulk transcriptomes using Bulk2Space Bulk2Space is a spatial deconvolution method based on

Dr. FAN, Xiaohui 60 Dec 27, 2022
Evaluation framework for testing segmentation networks in PyTorch

Evaluation framework for testing segmentation networks in PyTorch. What segmentation network to choose for next Kaggle competition? This benchmark knows the answer!

Eugene Khvedchenya 37 Apr 27, 2022
A PyTorch Implementation of the Luna: Linear Unified Nested Attention

Unofficial PyTorch implementation of Luna: Linear Unified Nested Attention The quadratic computational and memory complexities of the Transformer’s at

Soohwan Kim 32 Nov 07, 2022
Event sourced bank - A wide-and-shallow example using the Python event sourcing library

Event Sourced Bank A "wide but shallow" example of using the Python event sourci

3 Mar 09, 2022
Gradient Inversion with Generative Image Prior

Gradient Inversion with Generative Image Prior This repository is an implementation of "Gradient Inversion with Generative Image Prior", accepted to N

MLLab @ Postech 25 Jan 09, 2023
Self-Regulated Learning for Egocentric Video Activity Anticipation

Self-Regulated Learning for Egocentric Video Activity Anticipation Introduction This is a Pytorch implementation of the model described in our paper:

qzhb 13 Sep 23, 2022
This is the official implement of paper "ActionCLIP: A New Paradigm for Action Recognition"

This is an official pytorch implementation of ActionCLIP: A New Paradigm for Video Action Recognition [arXiv] Overview Content Prerequisites Data Prep

268 Jan 09, 2023
The official github repository for Towards Continual Knowledge Learning of Language Models

Towards Continual Knowledge Learning of Language Models This is the official github repository for Towards Continual Knowledge Learning of Language Mo

Joel Jang | 장요엘 65 Jan 07, 2023