EMNLP 2021 Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections

Overview

Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections

Ruiqi Zhong, Kristy Lee*, Zheng Zhang*, Dan Klein

EMNLP 2021 Findings, https://arxiv.org/abs/2104.04670

Data

Please download the dataset from here: https://drive.google.com/file/d/1hrLlpk6Pla95Bnv_e1MAhCx7uJSDgA-w/view?usp=sharing

If you are using this dataset, please cite all the papers in the custom_citations.txt, anthology_citations.txt, urls.txt file in the citations folder. Thanks!

Each datapoint is represented as a dictionary.

{"q": [label description], "c": [text input], "a": [0 or 1]},

where "q" stands for question, which contains label information, "c" stands for context, which contains the input text, "a" stands for answer, which is either 1 (Yes) or 0 (No).

training_dicts/ contains all the datasets for training, and each of the .pkl file is a list of datapoints. testing_dicts/ contains all the datasets for evaluation, and each of the .pkl file is a map from (label, label descriptions) to a list of datapoints.

Datasets that have the same group number in front of their filenames are considered similar. Notice that, for each dataset, there might be overlapping datapoints between the training and testing split, but it is okay since we never train and test on the same dataset.

Additionally, to speedup evaluation, we performed subsampling for many of the test datasets, so the numbers will not be directly comparable to those in the other paper.

Specialized Models are Better

Meta-tune a model that is initialized with T5-large and test it on unseen (non-similar) datasets

python3 default_train.py large

Test UnifiedQA on all datatsets used for evaluation

python3 baseline.py large

Evaluate and compare the meta-tuned model and the UnifiedQA baseline with AUC-ROC for each label description.

python3 evaluate_and_plot.py large

We should expect to see that meta-tuned model is better than the UnifiedQA model on the majority of label descriptions.

Larger Models are Better

We can train another smaller-sized model using the command

python3 default_train.py base

and then we can compare the large vs. base model modifying evaluate_and_plot.py.

Owner
Ruiqi Zhong
Berkeley NLP Group
Ruiqi Zhong
A programming language written with python

Kaoft A programming language written with python How to use A simple Hello World: c="Hello World" c Output: "Hello World" Operators: a=12

1 Jan 24, 2022
AdvStyle - Official PyTorch Implementation

AdvStyle - Official PyTorch Implementation Paper | Supp Discovering Interpretable Latent Space Directions of GANs Beyond Binary Attributes. Huiting Ya

Beryl 37 Oct 21, 2022
This program will stylize your photos with fast neural style transfer.

Neural Style Transfer (NST) Using TensorFlow Demo TensorFlow TensorFlow is an end-to-end open source platform for machine learning. It has a comprehen

Ismail Boularbah 1 Aug 08, 2022
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis

Bilateral Denoising Diffusion Models (BDDMs) This is the official PyTorch implementation of the following paper: BDDM: BILATERAL DENOISING DIFFUSION M

172 Dec 23, 2022
WatermarkRemoval-WDNet-WACV2021

WatermarkRemoval-WDNet-WACV2021 Thank you for your attention. Citation Please cite the related works in your publications if it helps your research: @

LUYI 63 Dec 05, 2022
Self-Guided Contrastive Learning for BERT Sentence Representations

Self-Guided Contrastive Learning for BERT Sentence Representations This repository is dedicated for releasing the implementation of the models utilize

Taeuk Kim 16 Dec 04, 2022
Perform Linear Classification with Multi-way Data

MultiwayClassification This is an R package to perform linear classification for data with multi-way structure. The distance-weighted discrimination (

Eric F. Lock 2 Dec 15, 2020
这是一个mobilenet-yolov4-lite的库,把yolov4主干网络修改成了mobilenet,修改了Panet的卷积组成,使参数量大幅度缩小。

YOLOV4:You Only Look Once目标检测模型-修改mobilenet系列主干网络-在Keras当中的实现 2021年2月8日更新: 加入letterbox_image的选项,关闭letterbox_image后网络的map一般可以得到提升。

Bubbliiiing 65 Dec 01, 2022
A Number Recognition algorithm

Paddle-VisualAttention Results_Compared SVHN Dataset Methods Steps GPU Batch Size Learning Rate Patience Decay Step Decay Rate Training Speed (FPS) Ac

1 Nov 12, 2021
[NeurIPS 2021] A weak-shot object detection approach by transferring semantic similarity and mask prior.

[NeurIPS 2021] A weak-shot object detection approach by transferring semantic similarity and mask prior.

BCMI 49 Jul 27, 2022
Translation-equivariant Image Quantizer for Bi-directional Image-Text Generation

Translation-equivariant Image Quantizer for Bi-directional Image-Text Generation Woncheol Shin1, Gyubok Lee1, Jiyoung Lee1, Joonseok Lee2,3, Edward Ch

Woncheol Shin 7 Sep 26, 2022
Code accompanying "Adaptive Methods for Aggregated Domain Generalization"

Adaptive Methods for Aggregated Domain Generalization (AdaClust) Official Pytorch Implementation of Adaptive Methods for Aggregated Domain Generalizat

Xavier Thomas 15 Sep 20, 2022
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

ONNX Runtime is a cross-platform inference and training machine-learning accelerator. ONNX Runtime inference can enable faster customer experiences an

Microsoft 8k Jan 04, 2023
Akshat Surolia 2 May 11, 2022
This repository contains the entire code for our work "Two-Timescale End-to-End Learning for Channel Acquisition and Hybrid Precoding"

Two-Timescale-DNN Two-Timescale End-to-End Learning for Channel Acquisition and Hybrid Precoding This repository contains the entire code for our work

QiyuHu 3 Mar 07, 2022
Code base for the paper "Scalable One-Pass Optimisation of High-Dimensional Weight-Update Hyperparameters by Implicit Differentiation"

This repository contains code for the paper Scalable One-Pass Optimisation of High-Dimensional Weight-Update Hyperparameters by Implicit Differentiati

8 Aug 28, 2022
Jittor 64*64 implementation of StyleGAN

StyleGanJittor (Tsinghua university computer graphics course) Overview Jittor 64

Song Shengyu 3 Jan 20, 2022
Code release of paper "Deep Multi-View Stereo gone wild"

Deep MVS gone wild Pytorch implementation of "Deep MVS gone wild" (Paper | website) This repository provides the code to reproduce the experiments of

François Darmon 53 Dec 24, 2022
Source code for paper "Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling", AAAI 2021

ATLOP Code for AAAI 2021 paper Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling. If you make use of this co

Wenxuan Zhou 146 Nov 29, 2022
Code accompanying the NeurIPS 2021 paper "Generating High-Quality Explanations for Navigation in Partially-Revealed Environments"

Generating High-Quality Explanations for Navigation in Partially-Revealed Environments This work presents an approach to explainable navigation under

RAIL Group @ George Mason University 1 Oct 28, 2022