Revisiting Self-Training for Few-Shot Learning of Language Model.

Last update: Nov 19, 2022

Overview

SFLM

This repository contains the implementation of the paper "Revisiting Self-Training for Few-Shot Learning of Language Model". SFLM stands for self-training for few-shot learning of language model.

Requirements

To run the code, install the dependencies with the following command:

pip install -r requirements.txt

Preprocess

The original data can be obtained from LM-BFF. To generate the data for the few-shot experiments, run the following command:

python tools/generate_data.py

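The original data should be placed in ./data/original, and the sampled data will be written to ./data/few-shot/$K-$MU-$SEED. Note that the training example below reads from data/few-shot/SST-2/16-4-100, that is, with a per-task subdirectory. See ./tools/generate_data.py for more options.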
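As a small illustration of the naming scheme, the sketch below builds and parses directory names of the form $K-$MU-$SEED. The helper names `few_shot_dir` and `parse_split_name` are hypothetical and not part of this repository, and the per-task subdirectory is an assumption taken from the path data/few-shot/SST-2/16-4-100 used in the training example.

```python
from pathlib import Path


def few_shot_dir(task: str, k: int, mu: int, seed: int,
                 root: str = "data/few-shot") -> Path:
    """Build a path such as data/few-shot/SST-2/16-4-100.

    The <task>/<K>-<MU>-<SEED> layout is assumed from the training example,
    not read from tools/generate_data.py.
    """
    return Path(root) / task / f"{k}-{mu}-{seed}"


def parse_split_name(name: str) -> dict:
    """Split a directory name such as '16-4-100' into its three integers."""
    k, mu, seed = (int(part) for part in name.split("-"))
    return {"k": k, "mu": mu, "seed": seed}


if __name__ == "__main__":
    print(few_shot_dir("SST-2", 16, 4, 100).as_posix())
    print(parse_split_name("16-4-100"))
```

A helper like this can be handy when looping over several seeds, since the same three values also appear in the --output_dir and --seed arguments of the training command.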

Train

The code can be run as in the following example:

python3 run.py \
  --task_name SST-2 \
  --data_dir data/few-shot/SST-2/16-4-100 \
  --do_train \
  --do_eval \
  --do_predict \
  --evaluate_during_training \
  --model_name_or_path roberta-base \
  --few_shot_type prompt-demo \
  --num_k 16 \
  --max_seq_length 256 \
  --per_device_train_batch_size 2 \
  --per_device_eval_batch_size 16 \
  --gradient_accumulation_steps 4 \
  --learning_rate 1e-5 \
  --max_steps 1000 \
  --logging_steps 100 \
  --eval_steps 100 \
  --num_train_epochs 0 \
  --output_dir result/SST-2-16-4-100 \
  --save_logit_dir result/SST-2-16-4-100 \
  --seed 100 \
  --template "*cls**sent_0*_It_was*mask*.*sep+*" \
  --mapping "{'0':'terrible','1':'great'}" \
  --num_sample 16 \
  --threshold 0.95 \
  --lam1 0.5 \
  --lam2 0.1
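The --mapping value in the command above is written as a Python dict literal that maps each label to a label word. The sketch below shows one safe way to read such a string with `ast.literal_eval`; it is only an illustration, and how run.py actually parses the argument is not shown here. The function name `parse_mapping` is hypothetical. The last lines also work out the per-device effective batch size implied by the example's batch size and gradient accumulation settings.

```python
import ast


def parse_mapping(mapping: str) -> dict:
    """Parse a label-to-word mapping given as a Python dict literal."""
    parsed = ast.literal_eval(mapping)  # evaluates literals only, never code
    if not isinstance(parsed, dict):
        raise ValueError("mapping must be a dict literal")
    return {str(label): str(word) for label, word in parsed.items()}


# Value copied from the example command.
label_words = parse_mapping("{'0':'terrible','1':'great'}")

# per_device_train_batch_size (2) * gradient_accumulation_steps (4)
effective_batch = 2 * 4

if __name__ == "__main__":
    print(label_words)
    print(effective_batch)
```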

Most arguments are the same as in LM-BFF, and our experiments use the same manual prompts. The additional arguments used by SFLM are:

threshold: The confidence threshold used to filter low-confidence samples out of the self-training loss.
lam1: The weight of the self-training loss.
lam2: The weight of the self-supervised loss.
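To make the role of these three arguments concrete, here is a minimal sketch of how a confidence threshold and two loss weights can interact. It is not the code from run.py. It assumes that pseudo-labels come from one view of an unlabeled example and are applied to a second view, and that the three loss terms are combined by a weighted sum. All function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def self_training_loss(weak_logits, strong_logits, threshold):
    """Cross-entropy on the second view against pseudo-labels from the first.

    Samples whose top probability in the first view is below `threshold`
    contribute zero; the sum is averaged over the whole batch.
    """
    if not weak_logits:
        return 0.0
    total = 0.0
    for weak, strong in zip(weak_logits, strong_logits):
        weak_probs = softmax(weak)
        confidence = max(weak_probs)
        if confidence < threshold:
            continue  # low-confidence sample is filtered out
        pseudo_label = weak_probs.index(confidence)
        total += -math.log(softmax(strong)[pseudo_label])
    return total / len(weak_logits)


def total_loss(supervised, self_training, self_supervised, lam1, lam2):
    """Weighted sum (assumed form): supervised + lam1 * st + lam2 * ssl."""
    return supervised + lam1 * self_training + lam2 * self_supervised


if __name__ == "__main__":
    weak = [[5.0, 0.0], [0.1, 0.0]]    # second sample is below 0.95 confidence
    strong = [[2.0, 0.0], [0.0, 3.0]]
    st = self_training_loss(weak, strong, threshold=0.95)
    print(total_loss(1.0, st, 0.5, lam1=0.5, lam2=0.1))
```

With threshold 0.95, as in the example command, only predictions that are already very confident produce a training signal, so lam1 scales a term that may be zero for many batches early in training.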

Citation

Please cite our paper if you use SFLM in your work:

@inproceedings{chen2021revisit,
    title={Revisiting Self-Training for Few-Shot Learning of Language Model},
    author={Chen, Yiming and Zhang, Yan and Zhang, Chen and Lee, Grandee and Cheng, Ran and Li, Haizhou},
    booktitle={EMNLP},
    year={2021},
}

Acknowledgements

The code is based on LM-BFF. We thank the authors of LM-BFF for making their code public.
