Robust fine-tuning of zero-shot models

Related tags

Deep Learningwise-ft
Overview

Robust fine-tuning of zero-shot models

This repository contains code for the paper Robust fine-tuning of zero-shot models by Mitchell Wortsman*, Gabriel Ilharco*, Jong Wook Kim, Mike Li, Simon Kornblith, Rebecca Roelofs, Raphael Gontijo-Lopes, Hannaneh Hajishirzi, Ali Farhadi, Hongseok Namkoong, Ludwig Schmidt.

Abstract

Large pre-trained models such as CLIP offer consistent accuracy across a range of data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific dataset). Although existing fine-tuning approaches substantially improve accuracy in-distribution, they also reduce out-of-distribution robustness. We address this tension by introducing a simple and effective method for improving robustness: ensembling the weights of the zero-shot and fine-tuned models. Compared to standard fine-tuning, the resulting weight-space ensembles provide large accuracy improvements out-of-distribution, while matching or improving in-distribution accuracy. On ImageNet and five derived distribution shifts, weight-space ensembles improve out-of-distribution accuracy by 2 to 10 percentage points while increasing in-distribution accuracy by nearly 1 percentage point relative to standard fine-tuning. These improvements come at no additional computational cost during fine-tuning or inference.

Summary figure

figure1

Compared to standard fine-tuning, weight-space ensembles for fine-tuning (WiSE-FT) improve out-of-distribution (OOD) accuracy without decreasing in-distribution (ID) performance. Top left: Zero-shot CLIP models exhibit high effective robustness and moderate in-distribution accuracy, while standard fine-tuning (end-to-end or with a linear classifier) attains higher ID accuracy and less effective robustness. Top right: Our method linearly interpolates between the zero-shot and fine-tuned models with a mixing coefficient alpha in [0,1]. Bottom: On five distribution shifts derived from ImageNet (ImageNetV2, ImageNet-R, ImageNet Sketch, ObjectNet, and ImageNet-A), WiSE-FT improves average OOD accuracy by 8.7 percentage points (pp) when fine-tuning end-to-end (+2.1 pp when fine-tuning a linear classifier) while maintaining ID accuracy.

Code

Overview

WiSE-FT can be implemented in a few lines of code in addition to standard fine-tuning, as shown below. See src/wise_ft.py for more details.

# Load models
zeroshot = ImageClassifier.load(zeroshot_checkpoint)
finetuned = ImageClassifier.load(finetuned_checkpoint)
theta_0 = zeroshot.state_dict()
theta_1 = finetuned.state_dict()

# make sure checkpoints are compatible
assert set(theta_0.keys()) == set(theta_1.keys())

# interpolate between checkpoints with mixing coefficient alpha
theta = {
    key: (1-alpha) * theta_0[key] + alpha * theta_1[key]
    for key in theta_0.keys()
}

# update the model acccording to the new weights
finetuned.load_state_dict(theta)

# evaluate
evaluate(finetuned, args)

Install dependencies

conda env create
conda activate wiseft

Add directory to PYTHONPATH:

cd wise-ft
export PYTHONPATH="$PYTHONPATH:$PWD"

Download data

When necessary, please refer to datasets.md for instructions on how to download datasets.

Run WiSE-FT

Sample command when zeroshot and fine-tuned models are available:

python src/wise_ft.py   \
    --eval-datasets=ImageNet,ImageNetV2,ImageNetR,ImageNetA,ImageNetSketch  \
    --load=models/zeroshot.pt,models/finetuned.pt  \
    --results-db=results.jsonl  \
    --save=models/wiseft  \
    --data-location=~/data \
    --alpha 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

Sample command for running WiSE-FT from scratch using ViT-B/32:

python src/wise_ft.py   \
    --train-dataset=ImageNet  \
    --epochs=10  \
    --lr=0.00003  \
    --batch-size=512  \
    --cache-dir=cache  \
    --model=ViT-B/32  \
    --eval-datasets=ImageNet,ImageNetV2,ImageNetR,ImageNetA,ImageNetSketch  \
    --template=openai_imagenet_template  \
    --results-db=results.jsonl  \
    --save=models/wiseft/ViTB32  \
    --data-location=~/data \
    --alpha 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

Note: the flag --freeze-encoder controls whether only a linear classifier is fine-tuned, or if all weights are fine-tuned (end-to-end).

Plotting results

Sample command for generating a scatter plot:

python src/scatter_plot.py  \
    --eval-datasets=ImageNetV2,ImageNetR,ImageNetA,ImageNetSketch  \
    --results-db=results.jsonl  \
    --save plots

We show samples of expected behavior below when running the commands above using ViT-B/16 (models can be downloaded here):

ImageNet-Sketch         ImageNet-A

ImageNet-R         ImageNetV2

ObjectNet

Citing

If you found this repository useful, please consider citing:

@article{wortsman2021robust,
  title={Robust fine-tuning of zero-shot models},
  author={Wortsman, Mitchell and Ilharco, Gabriel and Kim, Jong Wook and Li, Mike and Kornblith, Simon and Roelofs, Rebecca and Gontijo-Lopes, Raphael and Hajishirzi, Hannaneh and Farhadi, Ali and Namkoong, Hongseok and Schmidt, Ludwig},
  journal={arXiv preprint arXiv:2109.01903},
  note={\url{https://arxiv.org/abs/2109.01903}},
  year={2021}
}
Official PyTorch implementation of "Edge Rewiring Goes Neural: Boosting Network Resilience via Policy Gradient".

Edge Rewiring Goes Neural: Boosting Network Resilience via Policy Gradient This repository is the official PyTorch implementation of "Edge Rewiring Go

Shanchao Yang 4 Dec 12, 2022
Adabelief-Optimizer - Repository for NeurIPS 2020 Spotlight "AdaBelief Optimizer: Adapting stepsizes by the belief in observed gradients"

AdaBelief Optimizer NeurIPS 2020 Spotlight, trains fast as Adam, generalizes well as SGD, and is stable to train GANs. Release of package We have rele

Juntang Zhuang 998 Dec 29, 2022
Official code repository for ICCV 2021 paper: Gravity-Aware Monocular 3D Human Object Reconstruction

GraviCap Official code repository for ICCV 2021 paper: Gravity-Aware Monocular 3D Human Object Reconstruction. Gravity-Aware Monocular 3D Human-Object

Rishabh Dabral 15 Dec 09, 2022
Rax is a Learning-to-Rank library written in JAX

๐Ÿฆ– Rax: Composable Learning to Rank using JAX Rax is a Learning-to-Rank library written in JAX. Rax provides off-the-shelf implementations of ranking

Google 247 Dec 27, 2022
Generative Adversarial Text to Image Synthesis

Text To Image Synthesis This is a tensorflow implementation of synthesizing images. The images are synthesized using the GAN-CLS Algorithm from the pa

Hao 575 Jan 08, 2023
PyTorch implementation for View-Guided Point Cloud Completion

PyTorch implementation for View-Guided Point Cloud Completion

22 Jan 04, 2023
YOLO-v5 ๊ธฐ๋ฐ˜ ๋‹จ์•ˆ ์นด๋ฉ”๋ผ์˜ ์˜์ƒ์„ ํ™œ์šฉํ•ด ์ฐจ๊ฐ„ ๊ฑฐ๋ฆฌ๋ฅผ ์ผ์ •ํ•˜๊ฒŒ ์œ ์ง€ํ•˜๋ฉฐ ์ฃผํ–‰ํ•˜๋Š” Adaptive Cruise Control ๊ธฐ๋Šฅ ๊ตฌํ˜„

์ž์œจ ์ฃผํ–‰์ฐจ์˜ ์˜์ƒ ๊ธฐ๋ฐ˜ ์ฐจ๊ฐ„๊ฑฐ๋ฆฌ ์œ ์ง€ ๊ฐœ๋ฐœ Table of Contents ํ”„๋กœ์ ํŠธ ์†Œ๊ฐœ ์ฃผ์š” ๊ธฐ๋Šฅ ์‹œ์Šคํ…œ ๊ตฌ์กฐ ๋””๋ ‰ํ† ๋ฆฌ ๊ตฌ์กฐ ๊ฒฐ๊ณผ ์‹คํ–‰ ๋ฐฉ๋ฒ• ์ฐธ์กฐ ํŒ€์› ํ”„๋กœ์ ํŠธ ์†Œ๊ฐœ YOLO-v5 ๊ธฐ๋ฐ˜์œผ๋กœ ๋‹จ์•ˆ ์นด๋ฉ”๋ผ์˜ ์˜์ƒ์„ ํ™œ์šฉํ•ด ์ฐจ๊ฐ„ ๊ฑฐ๋ฆฌ๋ฅผ ์ผ์ •ํ•˜๊ฒŒ ์œ ์ง€ํ•˜๋ฉฐ ์ฃผํ–‰ํ•˜๋Š” Adap

14 Jun 29, 2022
HandFoldingNet โœŒ๏ธ : A 3D Hand Pose Estimation Network Using Multiscale-Feature Guided Folding of a 2D Hand Skeleton

HandFoldingNet โœŒ๏ธ : A 3D Hand Pose Estimation Network Using Multiscale-Feature Guided Folding of a 2D Hand Skeleton Wencan Cheng, Jae Hyun Park, Jong

cwc1260 23 Oct 21, 2022
Elastic weight consolidation technique for incremental learning.

Overcoming-Catastrophic-forgetting-in-Neural-Networks Elastic weight consolidation technique for incremental learning. About Use this API if you dont

Shivam Saboo 89 Dec 22, 2022
Code for our SIGCOMM'21 paper "Network Planning with Deep Reinforcement Learning".

0. Introduction This repository contains the source code for our SIGCOMM'21 paper "Network Planning with Deep Reinforcement Learning". Notes The netwo

NetX Group 68 Nov 24, 2022
[SIGGRAPH 2021 Asia] DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning

DeepVecFont This is the official Pytorch implementation of the paper: Yizhi Wang and Zhouhui Lian. DeepVecFont: Synthesizing High-quality Vector Fonts

Yizhi Wang 146 Dec 18, 2022
DeepSpamReview: Detection of Fake Reviews on Online Review Platforms using Deep Learning Architectures. Summer Internship project at CoreView Systems.

Detection of Fake Reviews on Online Review Platforms using Deep Learning Architectures Dataset: https://s3.amazonaws.com/fast-ai-nlp/yelp_review_polar

Ashish Salunkhe 37 Dec 17, 2022
A Dataset for Direct Quotation Extraction and Attribution in News Articles.

DirectQuote - A Dataset for Direct Quotation Extraction and Attribution in News Articles DirectQuote is a corpus containing 19,760 paragraphs and 10,3

THUNLP-MT 9 Sep 23, 2022
Keepsake is a Python library that uploads files and metadata (like hyperparameters) to Amazon S3 or Google Cloud Storage

Keepsake Version control for machine learning. Keepsake is a Python library that uploads files and metadata (like hyperparameters) to Amazon S3 or Goo

Replicate 1.6k Dec 29, 2022
Code of the paper "Shaping Visual Representations with Attributes for Few-Shot Learning (ASL)".

Shaping Visual Representations with Attributes for Few-Shot Learning This code implements the Shaping Visual Representations with Attributes for Few-S

chx_nju 9 Sep 01, 2022
Matthew Colbrook 1 Apr 08, 2022
A Comparative Review of Recent Kinect-Based Action Recognition Algorithms (TIP2020, Matlab codes)

A Comparative Review of Recent Kinect-Based Action Recognition Algorithms This repo contains: the HDG implementation (Matlab codes) for 'Analysis and

Lei Wang 5 Oct 22, 2022
The code of paper "Block Modeling-Guided Graph Convolutional Neural Networks".

Block Modeling-Guided Graph Convolutional Neural Networks This repository contains the demo code of the paper: Block Modeling-Guided Graph Convolution

22 Dec 08, 2022
DeepLab resnet v2 model in pytorch

pytorch-deeplab-resnet DeepLab resnet v2 model implementation in pytorch. The architecture of deepLab-ResNet has been replicated exactly as it is from

Isht Dwivedi 601 Dec 22, 2022
[ICLR 2022] Contact Points Discovery for Soft-Body Manipulations with Differentiable Physics

CPDeform Code and data for paper Contact Points Discovery for Soft-Body Manipulations with Differentiable Physics at ICLR 2022 (Spotlight). @InProceed

(Lester) Sizhe Li 29 Nov 29, 2022