An original implementation of "Noisy Channel Language Model Prompting for Few-Shot Text Classification"

Last update: Jan 07, 2023

Channel LM Prompting (and beyond)

This repository includes an original implementation of Sewon Min, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer. "Noisy Channel Language Model Prompting for Few-Shot Text Classification" 2021.

For any questions about the paper or the code, or to request pretrained checkpoints, please contact the first author by email or open an issue.

If you find our code or paper useful, please cite the paper:

@article{min2021noisy,
  title={Noisy Channel Language Model Prompting for Few-Shot Text Classification},
  author={Min, Sewon and Lewis, Mike and Hajishirzi, Hannaneh and Zettlemoyer, Luke},
  journal={arXiv preprint},
  year={2021}
}

This also includes implementations of many recent papers studying prompt-based learning. Please make sure to cite the corresponding papers when you use the implementations of their methods in this repo.

Brown et al. NeurIPS 2020. "Language Models are Few-Shot Learners": for zero-shot and concat-based demonstration methods.
Zhao et al. ICML 2021. "Calibrate Before Use: Improving Few-Shot Performance of Language Models": for direct++ formulations.
Holtzman et al. EMNLP 2021. "Surface Form Competition: Why the Highest Probability Answer Isn't Always Right": for direct++ formulations.
Lester et al. 2021. "The Power of Scale for Parameter-Efficient Prompt Tuning": for prompt tuning methods.

You can run the channel model and the direct model for each of these methods. Please see Section 3 of the paper for more details about these formulations.
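As a rough illustration of the two formulations, the sketch below scores labels with a toy log-probability table instead of a real language model. `toy_log_prob`, the table contents, and the verbalizer strings are invented for this example and are not part of this repo. The direct model picks the label whose verbalizer is most likely given the input, while the channel model picks the label under which the input is most likely.

```python
import math

# Toy stand-in for a language model: log P(continuation | context).
# All numbers are invented for illustration only.
TOY_TABLE = {
    ("A three-hour cinema master class.", "It was great."): math.log(0.30),
    ("A three-hour cinema master class.", "It was terrible."): math.log(0.35),
    ("It was great.", "A three-hour cinema master class."): math.log(0.020),
    ("It was terrible.", "A three-hour cinema master class."): math.log(0.005),
}

def toy_log_prob(context, continuation):
    """Return log P(continuation | context) from the toy table."""
    return TOY_TABLE[(context, continuation)]

def predict_direct(x, verbalizers):
    # Direct: argmax over labels of log P(v(y) | x).
    return max(verbalizers, key=lambda y: toy_log_prob(x, verbalizers[y]))

def predict_channel(x, verbalizers):
    # Channel: argmax over labels of log P(x | v(y)), assuming a uniform prior P(y).
    return max(verbalizers, key=lambda y: toy_log_prob(verbalizers[y], x))

verbalizers = {"positive": "It was great.", "negative": "It was terrible."}
x = "A three-hour cinema master class."
print(predict_direct(x, verbalizers))   # negative
print(predict_channel(x, verbalizers))  # positive
```

With these made-up numbers the two formulations disagree, which is only meant to show that they rank labels by different quantities.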

Installation

$ conda create -n lm-prompt python=3.8
$ conda activate lm-prompt
$ conda install pytorch=1.7.1 -c pytorch
$ pip install transformers==4.3.0

Download and Preprocess Data

We use (and modify) the data and the preprocessing script from Gao et al. ACL 2021 (paper, code) and Zhang et al. NeurIPS 2015 (paper, data).

To download the k-shot data (already preprocessed): Download the data (776MB) from this link. Please place data.zip under the same directory as the code and unzip it.

To download the original data and preprocess yourself:

pip install pandas==1.1.5 # for preprocessing script
mkdir data
cd data
wget https://nlp.cs.princeton.edu/projects/lm-bff/datasets.tar
tar xvf datasets.tar
cd ..

Also, download the data from here and place it in data/original.

Then, run python3 generative_k_shot_data.py, and you are done!

Optionally, you can specify arguments such as

--k: number of training examples (default is 16).
--balance: whether or not to guarantee the balance between labels in the training data; more precisely, whether k is the number of training examples in total or per label (default is False).
--data_dir: directory for the original data (default is data/original).
--output_dir: directory for the preprocessed data (default is data).
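The effect of --k and --balance can be pictured with the following sketch. It is not the code of the preprocessing script: `sample_k_shot` and its arguments are hypothetical, and it assumes that with balance enabled k counts examples per label, and otherwise k counts examples in total.

```python
import random
from collections import defaultdict

def sample_k_shot(examples, k=16, balance=False, seed=13):
    """Hypothetical helper: pick a k-shot training set from (text, label) pairs."""
    rng = random.Random(seed)
    if not balance:
        # k is the total number of training examples; labels may be skewed.
        return rng.sample(examples, k)
    # k is the number of training examples per label.
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))
    sampled = []
    for label in sorted(by_label):
        sampled.extend(rng.sample(by_label[label], k))
    rng.shuffle(sampled)
    return sampled

# Made-up data: 100 examples, 80 with label 0 and 20 with label 1.
examples = [(f"text {i}", 0 if i < 80 else 1) for i in range(100)]
total = sample_k_shot(examples, k=16, balance=False)
per_label = sample_k_shot(examples, k=16, balance=True)
print(len(total), len(per_label))
```

In this sketch the unbalanced call returns 16 examples overall, while the balanced call returns 16 examples for each of the two labels.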

To check the data: You can see the list of eleven datasets used in the paper by ls data/k-shot. Each dataset consists of five different splits based on five different seeds (test sets are the same).

Demonstration-based methods

This section is for methods that do not update any of the model parameters. For details about the methods, please see Section 4.1 of the paper.

Zero-shot

python main.py \
    --task {task_name} \
    --split {dev|test} \
    --data_dir data \
    --out_dir out \
    --gpt2 gpt2-large \
    --do_zeroshot \
    --method {direct|channel}

This command will run zero-shot inference with GPT2-large using four different templates (verbalizers), as reported in the paper.

For "channel", please specify --method channel.
For "direct", please specify --method direct.
For "direct++", please run the command line without --split first (this will run inference using the N/A input, following Zhao et al. ICML 2021), and then run the command line with --method direct --use_calibration.
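The direct++ scoring can be sketched as follows, assuming the calibration divides each label probability by the probability of the same label given the N/A input. The probabilities are invented, and `direct_plus_plus` is a hypothetical name, not a function in this repo.

```python
import math

def direct_plus_plus(log_p_given_x, log_p_given_na):
    """Pick the label maximizing log P(v(y) | x) - log P(v(y) | N/A)."""
    scores = {y: log_p_given_x[y] - log_p_given_na[y] for y in log_p_given_x}
    return max(scores, key=scores.get), scores

# Invented numbers: the model favors "negative" even for the N/A input.
log_p_given_x = {"positive": math.log(0.40), "negative": math.log(0.60)}
log_p_given_na = {"positive": math.log(0.20), "negative": math.log(0.80)}

uncalibrated = max(log_p_given_x, key=log_p_given_x.get)
calibrated, scores = direct_plus_plus(log_p_given_x, log_p_given_na)
print(uncalibrated, calibrated)
```

This is why the N/A run has to come first: the calibrated score needs the label probabilities for the N/A input before it can be computed.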

Useful notes:

Note that, once you run inference, it will save a cache in the out directory, and will re-load the cache file when you run the exact same command line.
You can adjust --batch_size if you run into an OOM issue (default is 32).
Please note that GPU parallelization is not implemented for inference.
To save a log file, please specify --log_file.
To use GPT2 with different sizes, please use --gpt2 {gpt2|gpt2-medium|gpt2-xl}.

Concat-based demonstration

python main.py \
    --task {task_name} \
    --split {dev|test} \
    --data_dir data \
    --out_dir out \
    --gpt2 gpt2-large \
    --do_zeroshot \
    --method {direct|channel} \
    --use_demonstrations \
    --k 16 \
    --seed {13|21|42|87|100}

You can modify k and seed to try different numbers of training examples and different seeds for the k-shot data.
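The sketch below shows one way the concatenated prompt could be assembled for each method. It assumes that the direct model conditions on input-label pairs followed by the test input, and that the channel model conditions on label-input pairs followed by the candidate label and then scores the test input. `build_concat_prompt`, the separator, and the example texts are hypothetical and do not reflect the repo's actual formatting.

```python
def build_concat_prompt(demos, x, candidate, method):
    """Hypothetical helper: return (context, continuation) for one candidate label.

    demos is a list of (text, verbalized_label) pairs; candidate is the
    verbalized label being scored for the test input x.
    """
    if method == "direct":
        # Context: x1 v(y1) ... xK v(yK) x ; the model scores v(y).
        parts = [piece for text, label in demos for piece in (text, label)]
        return " ".join(parts + [x]), candidate
    if method == "channel":
        # Context: v(y1) x1 ... v(yK) xK v(y) ; the model scores x.
        parts = [piece for text, label in demos for piece in (label, text)]
        return " ".join(parts + [candidate]), x
    raise ValueError(f"unknown method: {method}")

demos = [("Loved it.", "It was great."), ("Dull plot.", "It was terrible.")]
context, continuation = build_concat_prompt(
    demos, "A fun ride.", "It was great.", "channel"
)
print(context)
print(continuation)
```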

Ensemble-based demonstration

Add --ensemble to the command line for the Concat-based demonstration method.
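A minimal sketch of the ensemble idea, assuming each training example is used as a separate one-demonstration prompt and the resulting label probabilities are multiplied together (equivalently, their log-probabilities are summed). `ensemble_predict` and the numbers are invented for illustration.

```python
import math

def ensemble_predict(per_demo_log_probs):
    """per_demo_log_probs: one dict {label: log-probability} per demonstration."""
    labels = per_demo_log_probs[0].keys()
    # Multiplying probabilities across demonstrations equals summing log-probabilities.
    totals = {y: sum(lp[y] for lp in per_demo_log_probs) for y in labels}
    return max(totals, key=totals.get), totals

# Invented scores from three one-demonstration prompts.
per_demo_log_probs = [
    {"positive": math.log(0.6), "negative": math.log(0.4)},
    {"positive": math.log(0.3), "negative": math.log(0.7)},
    {"positive": math.log(0.8), "negative": math.log(0.2)},
]
label, totals = ensemble_predict(per_demo_log_probs)
print(label)
```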

Tuning methods

This section is for methods that fully finetune the model parameters (standard finetuning), or update a very limited number of parameters (prompt tuning, head tuning and transformation tuning). For details about the methods, please see Section 4.2 of the paper.

Prompt tuning

python main.py \
    --task {task_name} \
    --split {dev|test} \
    --data_dir data \
    --out_dir out \
    --gpt2 gpt2-large \
    --method {direct|channel} \
    --prompt_tune \
    --do_train \
    --batch_size 32 \
    --lr {0.1|0.01|0.001}

Please see Appendix B of the paper to see which learning rate we used for each dataset.
Once you train the model, you can specify --do_check to load the existing checkpoint without retraining the model.
Please note that GPU parallelization is implemented for training, but is not implemented for inference.
Note that, by default, we use the checkpoint that is trained for 100 steps.
To explore different numbers of prompts, please specify --n_prefix. The default value is 20, following the original prompt tuning paper (Lester et al. 2021).
If you want to explore zero-shot task transfer (Section 6.4 in the paper), you can (1) first train the model on the training data, and (2) run inference by specifying --task {task_name_for_test} --train_task {task_name_for_train} --do_check.
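To try all three learning rates, the training commands can be generated with a small script like the one below. This is only a convenience sketch: `prompt_tuning_commands` is a hypothetical helper, it uses only the flags shown in this README, and "SST-2" is a placeholder task name.

```python
import shlex

def prompt_tuning_commands(task, method, lrs=("0.1", "0.01", "0.001"), split="dev"):
    """Build one training command per learning rate, using flags from this README."""
    commands = []
    for lr in lrs:
        args = [
            "python", "main.py",
            "--task", task,
            "--split", split,
            "--data_dir", "data",
            "--out_dir", "out",
            "--gpt2", "gpt2-large",
            "--method", method,
            "--prompt_tune",
            "--do_train",
            "--batch_size", "32",
            "--lr", lr,
        ]
        # shlex.join quotes arguments safely for a POSIX shell (Python 3.8+).
        commands.append(shlex.join(args))
    return commands

# "SST-2" is a placeholder; replace it with the task you want to run.
commands = prompt_tuning_commands("SST-2", "channel")
for command in commands:
    print(command)
```

Each printed line can be pasted into a shell, or passed to a job scheduler, to train one configuration.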

Head tuning

Use --head_tune instead of --prompt_tune in the command line for the Prompt tuning method. Note that head tuning is only for the direct baseline.

Transformation tuning

Use --transform_tune instead of --prompt_tune in the command line for the Prompt tuning method. Note that transformation tuning is only for the direct baseline.

Standard finetuning

To finetune all of the model parameters, as in typical finetuning, please do not specify any of --prompt_tune, --head_tune or --transform_tune.

Results

For all results, please check out Table 3 and Table 4 of the paper.


