Code for Generating Disentangled Arguments with Prompts: A Simple Event Extraction Framework that Works

Last update: Oct 29, 2022

Overview

GDAP

Code for Generating Disentangled Arguments with Prompts: A Simple Event Extraction Framework that Works

Environment

Python (verified: v3.8)
CUDA (verified: v11.1)
Packages (see requirements.txt)

Usage

Preprocessing

We follow dygiepp for data preprocessing.

text2et: Event Type Detection
ettext2tri: Trigger Extraction
etrttext2role: Argument Extraction
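The task names read as source-to-target mappings: the fields before the `2` form the model input and the part after it is what the model generates. The sketch below only illustrates that chaining. The `build_source` helper, the ` | ` separator, the example sentence, and the reading of `rt` as "role type" are all assumptions made for illustration; the actual prompt format is defined in `data_convert` and may differ.

```python
# Hypothetical illustration of how the three task inputs could be assembled.
# The separator, field order, and the "rt = role type" reading are assumptions,
# not the prompt format implemented in this repo.

def build_source(text, *prefix_fields, sep=" | "):
    """Prepend the given prompt fields to the sentence."""
    return sep.join([*prefix_fields, text])


sentence = "The company fired its CEO on Monday."

# text2et: raw text in, event type out.
src_et = build_source(sentence)

# ettext2tri: event type + text in, trigger out.
src_tri = build_source(sentence, "End-Position")

# etrttext2role: event type + (assumed) role type + text in, argument out.
src_role = build_source(sentence, "End-Position", "Person")

for src in (src_et, src_tri, src_role):
    print(src)
```

Keeping each stage as a separate text-to-text task means one sequence-to-sequence model type (for example t5-base) can be trained for all three, with only the data and the metric changing.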

# data processed by dygiepp
data/text2target/dyiepp_ace2005_ettext2tri_subtype
├── event.schema 
├── test.json
├── train.json
└── val.json

# data processed by data_convert.convert_text_to_target
data/text2target/dyiepp_ace2005_ettext2tri_subtype
├── event.schema
├── test.json
├── train.json
└── val.json
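Before training, it can help to confirm that a converted dataset folder has the four files shown in the listing above. The following is a minimal sketch; `missing_files` is a hypothetical helper that is not part of this repo, and it only checks that the files exist, not that their contents are valid.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_FILES = ("event.schema", "train.json", "val.json", "test.json")


def missing_files(data_dir):
    """Return the expected file names that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Call it with the path of one dataset folder under data/text2target; an empty list means all four files are present.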

Useful commands:

python -m data_convert.convert_text_to_target # data/raw_data -> data/text2target
python convert_dyiepp_to_sentence.py data/raw_data/dyiepp_ace2005 # doc -> sentence, used in evaluation

Training

Relevant scripts:

run_seq2seq.py: Python entry point, adapted from transformers/examples/seq2seq/run_seq2seq.py.
run_seq2seq_span.bash: Training script; writes its output to a log file.

Example (see the above two files for more details):

# ace05 event type detection with t5-base; metric_format is eval_trigger-F1
bash run_seq2seq_span.bash --data=dyiepp_ace2005_text2et_subtype --model=t5-base --format=et --metric_format=eval_trigger-F1

# ace05 trigger extraction t5-base
bash run_seq2seq_span.bash --data=dyiepp_ace2005_ettext2tri_subtype --model=t5-base --format=tri --metric_format=eval_trigger-F1

# ace05 argument extraction t5-base
bash run_seq2seq_span.bash --data=dyiepp_ace2005_etrttext2role_subtype --model=t5-base --format=role --metric_format=eval_role-F1

Trained models are saved in the models/ folder.
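The three example commands differ only in the dataset suffix, the `--format` value, and the `--metric_format` value. The sketch below collects those values in one table and rebuilds the commands from it. The `TASKS` table and `training_command` helper are hypothetical conveniences, not part of this repo; only the flags and values shown in the examples are used.

```python
import shlex

# (dataset suffix, --format, --metric_format), copied from the example commands.
TASKS = {
    "event_type": ("text2et", "et", "eval_trigger-F1"),
    "trigger": ("ettext2tri", "tri", "eval_trigger-F1"),
    "argument": ("etrttext2role", "role", "eval_role-F1"),
}


def training_command(task, model="t5-base", corpus="dyiepp_ace2005"):
    """Build the run_seq2seq_span.bash argument list for one task."""
    suffix, fmt, metric = TASKS[task]
    return [
        "bash",
        "run_seq2seq_span.bash",
        f"--data={corpus}_{suffix}_subtype",
        f"--model={model}",
        f"--format={fmt}",
        f"--metric_format={metric}",
    ]


# Print the commands instead of running them.
for task in TASKS:
    print(shlex.join(training_command(task)))
```

To launch a run from the repository root, the returned list can be passed to `subprocess.run(cmd, check=True)`.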

Evaluation

run_tri_predict.bash: trigger extraction evaluation and inference script.
run_arg_predict.bash: argument extraction evaluation and inference script.
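The training metrics are named eval_trigger-F1 and eval_role-F1. As a reminder of what an F1 score over extracted items measures, here is a generic sketch that compares a set of gold items with a set of predicted items. It is not the scoring code used by these scripts; the `f1_score` helper and the example triples are assumptions for illustration.

```python
def f1_score(gold, pred):
    """Precision, recall, and F1 over two collections of hashable items."""
    gold, pred = set(gold), set(pred)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


# Hypothetical (sentence id, event type, trigger) triples.
gold = [(0, "Attack", "fired"), (1, "Transport", "arrived")]
pred = [(0, "Attack", "fired"), (1, "Meet", "arrived")]

print(f1_score(gold, pred))  # (0.5, 0.5, 0.5)
```

A prediction counts as correct here only when every field of the tuple matches, so a right trigger with a wrong event type scores as an error.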

Todo

We aim to expand the codebase to cover a wider range of tasks, including:

Named Entity Recognition
Keyword Generation
Event Relation Identification

If you find this repo helpful...

Please give us a ⭐ and cite our paper as:

@misc{si2021-GDAP,
      title={Generating Disentangled Arguments with Prompts: A Simple Event Extraction Framework that Works}, 
      author={Jinghui Si and Xutan Peng and Chen Li and Haotian Xu and Jianxin Li},
      year={2021},
      eprint={2110.04525},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

This project borrows code from Text2Event.
