The official implementation of "BERT is to NLP what AlexNet is to CV: Can Pre-Trained Language Models Identify Analogies?, ACL 2021 main conference"

Last update: Nov 03, 2022

Overview

BERT is to NLP what AlexNet is to CV

This is the official implementation of "BERT is to NLP what AlexNet is to CV: Can Pre-Trained Language Models Identify Analogies?", which was accepted to the ACL 2021 main conference (the paper URL is given in the Citation section). We evaluate pretrained language models (LMs) on five analogy tests that follow the SAT-style format shown below.

QUERY word:language
OPTION
  (1) paint:portrait
  (2) poetry:rhythm 
  (3) note:music <-- the answer!
  (4) tale:story
  (5) week:year
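As a rough illustration of the task format, a question like the one above could be represented and solved as follows. This is a minimal sketch, not the repository's API: `AnalogyQuestion`, `solve`, and `accuracy` are hypothetical names, and the scoring function is left as a pluggable callable where higher means "more analogous".

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Pair = Tuple[str, str]


@dataclass
class AnalogyQuestion:
    """One SAT-style item: a query pair, candidate pairs, and the gold index."""
    query: Pair
    options: List[Pair]
    answer: int  # 0-based index into `options`


def solve(question: AnalogyQuestion, score: Callable[[Pair, Pair], float]) -> int:
    """Return the index of the highest-scoring option (first one wins on ties)."""
    scores = [score(question.query, option) for option in question.options]
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(questions: List[AnalogyQuestion],
             score: Callable[[Pair, Pair], float]) -> float:
    """Fraction of questions where the top-scoring option is the gold answer."""
    correct = sum(solve(q, score) == q.answer for q in questions)
    return correct / len(questions)


# The example question from this README; option (3) is the answer.
example = AnalogyQuestion(
    query=("word", "language"),
    options=[("paint", "portrait"), ("poetry", "rhythm"), ("note", "music"),
             ("tale", "story"), ("week", "year")],
    answer=2,
)
```

Any scorer with the signature `score(query, option) -> float` can be passed to `solve`. For a lower-is-better quantity such as perplexity, the scorer would return its negation.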

We devise a new class of scoring functions, referred to as analogical proportion (AP) scores, to solve word analogies in an unsupervised fashion and to investigate the relational knowledge that LMs learn through pretraining.
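The name of the main experiment script (`experiment_ppl_variants.py`) suggests that the AP score builds on LM perplexity. Assuming so, the sketch below shows only that ingredient: the standard perplexity of a sentence computed from per-token log-probabilities. It is not the AP score itself, and the log-probabilities are made-up values rather than model outputs.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity = exp of the negative mean token log-probability (natural log)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Hypothetical per-token log-probabilities for two candidate sentences.
plausible = [math.log(0.5), math.log(0.4), math.log(0.5)]
implausible = [math.log(0.5), math.log(0.05), math.log(0.1)]

ppl_plausible = perplexity(plausible)
ppl_implausible = perplexity(implausible)
```

A lower perplexity means the LM finds the sentence more natural, so under this assumption a candidate pair whose prompt has lower perplexity would be preferred.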

Please see our paper for more information and discussion.

Get started

git clone https://github.com/asahi417/analogy-language-model
cd analogy-language-model
pip install -e .

Run Experiments

The following scripts reproduce our results in the paper.

# get result for our main AP score
python experiments/experiment_ppl_variants.py 
# get result for word embedding baseline
python experiments/experiment_word_embedding.py 
# get results for other scoring functions, such as vector difference
python experiments/experiment_scoring_comparison.py

Here is a summary of the results that can be obtained by running those scripts.

[Figure: experimental results]
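For readers unfamiliar with the vector difference scoring mentioned among the experiment scripts, here is a minimal sketch of the general idea: score a candidate pair by the cosine similarity between its offset vector and the query's offset vector. The function names and the two-dimensional toy embeddings are assumptions for illustration; the repository's baseline may be implemented differently.

```python
import math
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[str, str]


def diff(u: Sequence[float], v: Sequence[float]) -> List[float]:
    """Offset vector v - u."""
    return [b - a for a, b in zip(u, v)]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def vector_difference_score(emb: Dict[str, Sequence[float]],
                            query: Pair, option: Pair) -> float:
    """Cosine between the query offset (b - a) and the option offset (d - c)."""
    a, b = query
    c, d = option
    return cosine(diff(emb[a], emb[b]), diff(emb[c], emb[d]))


# Made-up 2-d embeddings, chosen only so the example is easy to follow.
toy_embeddings = {
    "word": [1.0, 0.0], "language": [1.0, 1.0],
    "note": [2.0, 0.0], "music": [2.0, 1.1],
    "week": [0.0, 1.0], "year": [1.0, 1.0],
}

query = ("word", "language")
options = [("note", "music"), ("week", "year")]
best = max(options, key=lambda o: vector_difference_score(toy_embeddings, query, o))
```

With these toy vectors the `note:music` offset is parallel to the `word:language` offset, so it receives the higher score.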

Dataset

The datasets used in our experiments can be downloaded from the following link:

Analogy Datasets

Please see the Analogy Tool for more information about the dataset and baselines.

Citation

Please cite our reference paper if you use our data or code:

@inproceedings{ushio-etal-2021-bert,
    title = "{BERT} is to {NLP} what {A}lex{N}et is to {CV}: Can Pre-Trained Language Models Identify Analogies?",
    author = "Ushio, Asahi  and
      Espinosa Anke, Luis  and
      Schockaert, Steven  and
      Camacho-Collados, Jose",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.acl-long.280",
    doi = "10.18653/v1/2021.acl-long.280",
    pages = "3609--3624",
    abstract = "Analogies play a central role in human commonsense reasoning. The ability to recognize analogies such as {``}eye is to seeing what ear is to hearing{''}, sometimes referred to as analogical proportions, shape how we structure knowledge and understand language. Surprisingly, however, the task of identifying such analogies has not yet received much attention in the language model era. In this paper, we analyze the capabilities of transformer-based language models on this unsupervised task, using benchmarks obtained from educational settings, as well as more commonly used datasets. We find that off-the-shelf language models can identify analogies to a certain extent, but struggle with abstract and complex relations, and results are highly sensitive to model architecture and hyperparameters. Overall the best results were obtained with GPT-2 and RoBERTa, while configurations using BERT were not able to outperform word embedding models. Our results raise important questions for future work about how, and to what extent, pre-trained language models capture knowledge about abstract semantic relations.",
}

Please also cite the relevant reference papers if using any of the analogy datasets.


Releases(0.0.0)

0.0.0(Apr 29, 2021)

Owner

Asahi Ushio
