Source code and dataset for ACL2021 paper: "ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning".

Related tags

Deep LearningERICA
Overview

ERICA

Source code and dataset for ACL2021 paper: "ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning".

The code is based on huggingface's transformers, the trained models and pre-training data can be downloaded from Google Drive.

Quick Start

You can quickly run our code by following steps:

  • Install dependencies as described in following section.
  • cd to pretrain or finetune directory then download and pre-process data for pre-training or finetuning.

1. Dependencies

Run the following script to install dependencies.

pip install -r requirement.txt

You need to install transformers and apex manually.

transformers We use huggingface transformers to implement Bert and RoBERTa, and the version is 2.5.0. For convenience, we have downloaded transformers into code/pretrain/ so you can easily import it, and we have also modified some lines in the class BertForMaskedLM in src/transformers/modeling_bert.py while keeping the other codes unchanged.

You just need run

pip install .

to install transformers manually.

apex Install apex under the offical guidance.

process pretraining data

In folder prepare_pretrain_data, we provide the codes for processing pre-training data.

2. Pretraining

To pretrain ERICA_bert:

cd code/pretrain

python -m torch.distributed.launch --nproc_per_node 8  main.py  \
    --model DOC  --lr 3e-5 --batch_size_per_gpu 16 --max_epoch 105  \
    --gradient_accumulation_steps 16    --save_step 500  --temperature 0.05  \
    --train_sample  --save_dir ckpt_doc_dw_f_alpha_1_uncased --n_gpu 8  --debug 1  --add_none 1 \
    --alpha 1 --flow 0 --dataset_name none.json  --wiki_loss 1 --doc_loss 1 \
    --change_dataset 1  --start_end_token 0 --bert_model bert \
    --pretraining_size -1 --ablation 0 --cased 0

some explanations for hyper-parameters: temperature (\tau used in loss function of contrastive learning); debug (whether to debug (we provide an example_debug file for pre-training); add_none (whether to add no_relation pair in RD loss); alpha (the proportion of masking (1 means no masking, in experiments, we find masking is not helpful as is described in the main paper, so for all models, we do not mask in the pre-training phase. However, we leave this function here for further research explorations.)); flow (if masking, whether to use a linear decay); wiki_loss (whether to add ED loss); doc_loss (whether to add RD loss); start_end_token (use another entity encoding method); cased (whether to use cased version of BERT).

3. Fine-tuning

Enter each folder for downstream task (document-level / sentence-level relation extraction, entity typing and question answering) fine-tuning. Before fine-tuning, we assume you have already pre-trained an ERICA model. Excecute the bash in each folder for reimplementation.

Owner
THUNLP
Natural Language Processing Lab at Tsinghua University
THUNLP
Official PyTorch code of Holistic 3D Scene Understanding from a Single Image with Implicit Representation (CVPR 2021)

Implicit3DUnderstanding (Im3D) [Project Page] Holistic 3D Scene Understanding from a Single Image with Implicit Representation Cheng Zhang, Zhaopeng C

Cheng Zhang 149 Jan 08, 2023
BookMyShowPC - Movie Ticket Reservation App made with Tkinter

Book My Show PC What is this? Movie Ticket Reservation App made with Tkinter. Tk

The Nithin Balaji 3 Dec 09, 2022
An Open-Source Package for Information Retrieval.

OpenMatch An Open-Source Package for Information Retrieval. 😃 What's New Top Spot on TREC-COVID Challenge (May 2020, Round2) The twin goals of the ch

THUNLP 439 Dec 27, 2022
Picasso: a methods for embedding points in 2D in a way that respects distances while fitting a user-specified shape.

Picasso Code to generate Picasso embeddings of any input matrix. Picasso maps the points of an input matrix to user-defined, n-dimensional shape coord

Pachter Lab 45 Dec 23, 2022
[ICML 2021] "Graph Contrastive Learning Automated" by Yuning You, Tianlong Chen, Yang Shen, Zhangyang Wang

Graph Contrastive Learning Automated PyTorch implementation for Graph Contrastive Learning Automated [talk] [poster] [appendix] Yuning You, Tianlong C

Shen Lab at Texas A&M University 80 Nov 23, 2022
Few-NERD: Not Only a Few-shot NER Dataset

Few-NERD: Not Only a Few-shot NER Dataset This is the source code of the ACL-IJCNLP 2021 paper: Few-NERD: A Few-shot Named Entity Recognition Dataset.

THUNLP 319 Dec 30, 2022
Python version of the amazing Reaction Mechanism Generator (RMG).

Reaction Mechanism Generator (RMG) Description This repository contains the Python version of Reaction Mechanism Generator (RMG), a tool for automatic

Reaction Mechanism Generator 284 Dec 27, 2022
A module for solving and visualizing Schrödinger equation.

qmsolve This is an attempt at making a solid, easy to use solver, capable of solving and visualize the Schrödinger equation for multiple particles, an

506 Dec 28, 2022
Official codebase for "B-Pref: Benchmarking Preference-BasedReinforcement Learning" contains scripts to reproduce experiments.

B-Pref Official codebase for B-Pref: Benchmarking Preference-BasedReinforcement Learning contains scripts to reproduce experiments. Install conda env

48 Dec 20, 2022
Large scale embeddings on a single machine.

Marius Marius is a system under active development for training embeddings for large-scale graphs on a single machine. Training on large scale graphs

Marius 107 Jan 03, 2023
Time Dependent DFT in Tamm-Dancoff Approximation

Density Function Theory Program - kspy-tddft(tda) This is an implementation of Time-Dependent Density Functional Theory(TDDFT) using the Tamm-Dancoff

Peter Borthwick 2 Nov 17, 2022
Autonomous Robots Kalman Filters

Autonomous Robots Kalman Filters The Kalman Filter is an easy topic. However, ma

20 Jul 18, 2022
Collection of TensorFlow2 implementations of Generative Adversarial Network varieties presented in research papers.

TensorFlow2-GAN Collection of tf2.0 implementations of Generative Adversarial Network varieties presented in research papers. Model architectures will

41 Apr 28, 2022
magiCARP: Contrastive Authoring+Reviewing Pretraining

magiCARP: Contrastive Authoring+Reviewing Pretraining Welcome to the magiCARP API, the test bed used by EleutherAI for performing text/text bi-encoder

EleutherAI 43 Dec 29, 2022
A dead simple python wrapper for darknet that works with OpenCV 4.1, CUDA 10.1

What Dead simple python wrapper for Yolo V3 using AlexyAB's darknet fork. Works with CUDA 10.1 and OpenCV 4.1 or later (I use OpenCV master as of Jun

Pliable Pixels 6 Jan 12, 2022
PyTorch implementation of the Flow Gaussian Mixture Model (FlowGMM) model from our paper

Flow Gaussian Mixture Model (FlowGMM) This repository contains a PyTorch implementation of the Flow Gaussian Mixture Model (FlowGMM) model from our pa

Pavel Izmailov 124 Nov 06, 2022
Edison AT is software Depression Assistant personal.

Edison AT Edison AT is software / program Depression Assistant personal. Feature: Analyze emotional real-time from face. Audio Edison(Comingsoon relea

Ananda Rauf 2 Apr 24, 2022
Simple Linear 2nd ODE Solver GUI - A 2nd constant coefficient linear ODE solver with simple GUI using euler's method

Simple_Linear_2nd_ODE_Solver_GUI Description It is a 2nd constant coefficient li

:) 4 Feb 05, 2022
HeatNet is a python package that provides tools to build, train and evaluate neural networks designed to predict extreme heat wave events globally on daily to subseasonal timescales.

HeatNet HeatNet is a python package that provides tools to build, train and evaluate neural networks designed to predict extreme heat wave events glob

Google Research 6 Jul 07, 2022
Open-AI's DALL-E for large scale training in mesh-tensorflow.

DALL-E in Mesh-Tensorflow [WIP] Open-AI's DALL-E in Mesh-Tensorflow. If this is similarly efficient to GPT-Neo, this repo should be able to train mode

EleutherAI 432 Dec 16, 2022