Self-training for Few-shot Transfer Across Extreme Task Differences

Last update: Oct 31, 2022

Self-training for Few-shot Transfer Across Extreme Task Differences (STARTUP)

Introduction

This repo contains the official implementation of the following ICLR 2021 paper:

Title: Self-training for Few-shot Transfer Across Extreme Task Differences
Authors: Cheng Perng Phoo, Bharath Hariharan
Institution: Cornell University
Arxiv: https://arxiv.org/abs/2010.07734
Abstract:
Most few-shot learning techniques are pre-trained on a large, labeled "base dataset". In problem domains where such large labeled datasets are not available for pre-training (e.g., X-ray, satellite images), one must resort to pre-training in a different "source" problem domain (e.g., ImageNet), which can be very different from the desired target task. Traditional few-shot and transfer learning techniques fail in the presence of such extreme differences between the source and target tasks. In this paper, we present a simple and effective solution to tackle this extreme domain gap: self-training a source domain representation on unlabeled data from the target domain. We show that this improves one-shot performance on the target domain by 2.9 points on average on the challenging BSCD-FSL benchmark consisting of datasets from multiple domains.

Requirements

This codebase is tested with:

PyTorch 1.7.1
Torchvision 0.8.2
NumPy
Pandas
wandb (used for logging; see https://wandb.ai/)

Running Experiments

Step 0: Dataset Preparation

miniImageNet and CD-FSL: Download the datasets for the CD-FSL benchmark by following step 1 and step 2 here: https://github.com/IBM/cdfsl-benchmark
tieredImageNet: Prepare the tieredImageNet dataset following https://github.com/mileyan/simple_shot. Note that after running the preparation script, you will need to split the saved images into 3 different folders: train, val, test.
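
The train/val/test split mentioned above could be scripted along these lines. This is only a sketch under stated assumptions: it assumes the preparation script leaves one sub-folder per class, and that you already have a mapping from class name to split (tieredImageNet splits are disjoint by class). The function name split_into_folders and the split_of_class mapping are hypothetical and are not part of this repo or of simple_shot.

```python
import shutil
from pathlib import Path

VALID_SPLITS = {"train", "val", "test"}


def split_into_folders(image_root, output_root, split_of_class):
    """Copy each class folder into output_root/<split>/<class_name>.

    Assumption (not from the repo): image_root contains one sub-folder per
    class, and split_of_class maps a class folder name to "train", "val",
    or "test".
    """
    image_root, output_root = Path(image_root), Path(output_root)
    counts = {split: 0 for split in sorted(VALID_SPLITS)}
    for class_name, split in split_of_class.items():
        if split not in VALID_SPLITS:
            raise ValueError(f"unknown split {split!r} for class {class_name!r}")
        src = image_root / class_name
        dst = output_root / split / class_name
        dst.parent.mkdir(parents=True, exist_ok=True)
        # copytree creates dst itself and fails if it already exists,
        # which guards against silently merging two runs.
        shutil.copytree(src, dst)
        counts[split] += 1
    return counts
```

Copying rather than moving keeps the original output of the preparation script intact, at the cost of extra disk space.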

Step 1: Teacher Training on the Base Dataset

We provide scripts to produce teachers for different base datasets. The steps are the same for every base dataset:

Go into the directory teacher_miniImageNet/ (teacher_ImageNet/ for ImageNet)
Resolve the TODO: items in run.sh and configs.py (if applicable).
Run bash run.sh to produce the teachers.

Note that for miniImageNet and tieredImageNet, the training script is adapted from the official script provided by the CD-FSL benchmark. For ImageNet, we simply download the pre-trained models from PyTorch and convert them to the required format.

Step 2: Student Training

To train the STARTUP representation, follow these steps:

Go into the directory student_STARTUP/ (student_STARTUP_no_self_supervision/ for the version without SimCLR)
Resolve the TODO: items in run.sh and configs.py
Run bash run.sh to produce the student/STARTUP representation.
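
For orientation, the student objective described in the paper can be sketched as follows: cross-entropy on labeled base data, plus a KL term that pulls the student's predictions on unlabeled target images toward the teacher's soft pseudo-labels, plus an optional self-supervised (SimCLR) term. This is a minimal illustration, not the code in student_STARTUP/; the function name startup_style_loss, its arguments, and the unweighted sum are assumptions, and the SimCLR loss is taken as a precomputed scalar.

```python
import torch
import torch.nn.functional as F


def startup_style_loss(student_base_logits, base_labels,
                       student_target_logits, teacher_target_logits,
                       simclr_loss=None):
    """Illustrative self-training objective (hypothetical helper)."""
    # Supervised term on the labeled base dataset.
    ce = F.cross_entropy(student_base_logits, base_labels)

    # Teacher soft pseudo-labels on unlabeled target images; the teacher
    # is frozen, so no gradient flows through its logits.
    teacher_probs = F.softmax(teacher_target_logits.detach(), dim=1)
    student_log_probs = F.log_softmax(student_target_logits, dim=1)
    # F.kl_div expects log-probabilities as input and probabilities as
    # target; "batchmean" gives KL(teacher || student) averaged per sample.
    kl = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")

    total = ce + kl
    # Optional self-supervised term (omitted in the no-SimCLR variant).
    if simclr_loss is not None:
        total = total + simclr_loss
    return total
```

When the student and teacher agree exactly on the target batch, the KL term vanishes and the loss reduces to the base cross-entropy (plus the SimCLR term, if given).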

Step 3: Evaluation

To evaluate different representations, go into evaluation/, resolve the TODO: items in run.sh and configs.py, and run bash run.sh.

Notes

When producing the results for the submitted paper, we did not set torch.backends.cudnn.deterministic and torch.backends.cudnn.benchmark properly, which caused non-deterministic behavior. We have rerun our experiments and the updated numbers can be found here: https://docs.google.com/spreadsheets/d/1O1e9xdI1SxVvRWK9VVxcO8yefZhePAHGikypWfhRv8c/edit?usp=sharing. Although some of the numbers have changed, the conclusion of the paper remains unchanged: STARTUP outperforms all the baselines, yielding substantial improvements in cross-domain few-shot learning.
All training was done on an Nvidia Titan RTX GPU. Evaluation of the different representations was performed on an Nvidia RTX 2080 Ti. CUDA 11 was used on both GPUs.
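
As a reference for the cuDNN flags mentioned in the note above, a commonly used reproducibility setup looks like the sketch below. The helper name set_deterministic and the seed value are hypothetical, and the exact values used in this repo's scripts are not stated here; deterministic=True with benchmark=False is simply the usual combination when reproducibility matters more than speed.

```python
import random

import numpy as np
import torch


def set_deterministic(seed: int = 0) -> None:
    """Seed the common RNGs and request deterministic cuDNN kernels."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    # Silently ignored when CUDA is not available.
    torch.cuda.manual_seed_all(seed)
    # Restrict cuDNN to deterministic algorithms.
    torch.backends.cudnn.deterministic = True
    # Disable the auto-tuner, which may pick different kernels per run.
    torch.backends.cudnn.benchmark = False
```

Calling set_deterministic(seed) once at the start of a training or evaluation script, before any model or data loader is created, is the usual pattern. Some GPU operations can still be non-deterministic even with these flags set.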
This repo is built upon the official CD-FSL benchmark repo: https://github.com/IBM/cdfsl-benchmark/tree/9c6a42f4bb3d2638bb85d3e9df3d46e78107bc53. We thank the creators of the CD-FSL benchmark for releasing code to the public.
If you find this codebase or STARTUP useful, please consider citing our paper:

@inproceedings{phoo2021STARTUP,
    title={Self-training for Few-shot Transfer Across Extreme Task Differences},
    author={Phoo, Cheng Perng and Hariharan, Bharath},
    booktitle={Proceedings of the International Conference on Learning Representations},
    year={2021}
}
