Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation

Last update: Sep 28, 2022

Prerequisites

This repo is built upon a local copy of transformers==2.1.1. This repo has been tested on torch==1.4.0 with python 3.7 and CUDA 10.1.

To start, create a new environment and install:

conda create -n grad2task python=3.7
conda activate grad2task
cd Grad2Task
pip install -e .

We use wandb for logging. Set it up following the wandb documentation, then specify your wandb project name in run_meta_training.sh:

export WANDB=[YOUR PROJECT NAME]
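A forgotten or placeholder project name is easy to miss until logging fails, so a small pre-flight check can help. The following is a minimal sketch only: the function name `check_wandb_project` and the example project name are assumptions made for illustration and are not part of this repo.

```shell
# Hypothetical pre-flight check, not part of the repository.
# Succeeds only when WANDB holds a real project name.
check_wandb_project() {
    if [ -z "${WANDB:-}" ] || [ "${WANDB}" = "[YOUR PROJECT NAME]" ]; then
        echo "WANDB is unset or still set to the placeholder" >&2
        return 1
    fi
    echo "wandb project: ${WANDB}"
}

WANDB="grad2task-demo"   # example value only
check_wandb_project
```

Calling the function before launching a run stops early with a non-zero status instead of failing partway through training.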

Download the dataset and unzip it under the main folder: https://drive.google.com/file/d/1uAdgZFYv9epk6tQVQ3SwboxFpSlkC_ZW/view?usp=sharing

If you need to place it somewhere else, specify its path in path.sh.

Train & Evaluation

To train/evaluate models:

bash meta_learn.sh [MODEL_NAME] [MODE] [EXP_ID]

where [MODEL_NAME] is the model name, [MODE] is the experiment mode, and [EXP_ID] is an optional experiment ID used to distinguish different runs of the same model. Options for [MODE] and [MODEL_NAME] are listed below:

`[MODE]`	Description
train	Training models.
test_best	Test the model with the best validation performance.
test_latest	Test the latest checkpoint.
test	Test model without meta-training. Only applicable to the `fine-tune-baseline` model.

`[MODEL_NAME]`	Description
fine-tune-baseline	Fine-tuning BERT for each task separately.
bert-protonet-euc	ProtoNet with BERT as encoder, using Euclidean distance as distance metric.
bert-protonet-euc-bn	ProtoNet with BERT+Bottleneck Adapters as encoder, using Euclidean distance as distance metric.
bert-protonet	ProtoNet with BERT as encoder, using cosine distance as distance metric.
bert-protonet-bn	ProtoNet with BERT+Bottleneck Adapters as encoder, using cosine distance as distance metric.
bert-leopard	Leopard with pretrained BERT [1].
bert-leopard-fixlr	Leopard but with fixed learning rates.
bert-cnap-bn-euc-context-cls-shift-scale-ar	Our proposed approach using gradients as task representation.
bert-cnap-bn-euc-context-cls-shift-scale-ar-X	Our proposed approach using average input encoding as task representation.
bert-cnap-bn-euc-context-cls-shift-scale-ar-XGrad	Our proposed approach using both gradients and input encoding as task representation.
bert-cnap-bn-euc-context-cls-shift-scale-ar-XY	Our proposed approach using input and textual label encoding as task representation.
bert-cnap-bn-euc-context-shift-scale-ar	Same as our proposed approach, except that it adapts all tokens instead of only the [CLS] token.
bert-cnap-bn-pretrained-taskemb	Our proposed approach with pretrained task embedding model.
bert-cnap-bn-hyper	A hypernetwork based approach.
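The tables above imply two constraints: [MODE] must be one of four values, and `test` applies only to `fine-tune-baseline`. The sketch below checks both before a job is launched. The helper names `is_valid_mode` and `build_command` are hypothetical and not part of this repo; the helper only prints the command rather than running it.

```shell
# Hypothetical helpers, not part of the repository.

# Return success only for the modes listed in the [MODE] table.
is_valid_mode() {
    case "$1" in
        train|test_best|test_latest|test) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the command that would be run, or fail on an invalid combination.
build_command() {
    model_name="$1"; mode="$2"; exp_id="${3:-}"
    if ! is_valid_mode "$mode"; then
        echo "unknown mode: $mode" >&2
        return 1
    fi
    # `test` is documented as applicable only to fine-tune-baseline.
    if [ "$mode" = "test" ] && [ "$model_name" != "fine-tune-baseline" ]; then
        echo "mode 'test' only applies to fine-tune-baseline" >&2
        return 1
    fi
    # ${exp_id:+ $exp_id} appends the ID only when it is non-empty.
    echo "bash meta_learn.sh ${model_name} ${mode}${exp_id:+ $exp_id}"
}

build_command bert-protonet-euc train
build_command fine-tune-baseline test
```

The sketch does not validate [MODEL_NAME], since new model names may be added to the script.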

To run a model with different hyperparameters, first choose an [EXP_ID] to name the run, then specify the new hyperparameters in run/meta_learn.sh. For example, to run bert-protonet-bn with a smaller learning rate, modify run/meta_learn.sh as follows:

...
elif [ "$1" == "bert-protonet-bn" ]; then # ProtoNet with cosine distance
    export LEARNING_RATE=2e-5
    export CHECKPOINT_FREQ=1000
    if [[ ${EXP_ID} == *"lr1e-5" ]]; then
        export LEARNING_RATE=1e-5
        export CHECKPOINT_FREQ=2000
        # modify other hyperparameters here
    fi
...

and then run:

bash meta_learn.sh bert-protonet-bn train lr1e-5
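The override mechanism above selects hyperparameters by matching the end of [EXP_ID]. The sketch below isolates that logic using a portable `case` statement. The function name `select_hparams` is hypothetical; the values are the ones from the excerpt.

```shell
# Sketch of suffix-based overrides; the function name is hypothetical.
select_hparams() {
    exp_id="${1:-}"
    # Defaults for the model.
    LEARNING_RATE=2e-5
    CHECKPOINT_FREQ=1000
    # Override when the experiment ID ends in "lr1e-5".
    case "$exp_id" in
        *lr1e-5)
            LEARNING_RATE=1e-5
            CHECKPOINT_FREQ=2000
            ;;
    esac
    echo "${LEARNING_RATE} ${CHECKPOINT_FREQ}"
}

select_hparams               # prints: 2e-5 1000
select_hparams lr1e-5        # prints: 1e-5 2000
select_hparams run2-lr1e-5   # prints: 1e-5 2000
```

Because the match is on the suffix, any [EXP_ID] ending in `lr1e-5` picks up the override, so a prefix can still be used to tell runs apart.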

Reference

[1] T. Bansal, R. Jha, and A. McCallum. Learning to few-shot learn across diverse natural language classification tasks. In Proceedings of the 28th International Conference on Computational Linguistics, pages 5108–5123, 2020.

Owner

Jixuan Wang
