Code for "Semantic Role Labeling as Dependency Parsing: Exploring Latent Tree Structures Inside Arguments".

Overview

crfsrl

image

Yu Zhang, Qingrong Xia, Shilin Zhou, Yong Jiang, Zhenghua Li, Guohong Fu, Min Zhang. Semantic Role Labeling as Dependency Parsing: Exploring Latent Tree Structures Inside Arguments. 2021. [arxiv]

Setup

The following packages should be installed:

Run the following scripts to obtain the training data. Please make sure PTB and OntoNotes are available:

bash scripts/conll05.sh PTB=<path-to-ptb>             SRL=data
bash scripts/conll12.sh ONTONOTES=<path-to-ontonotes> SRL=data

Run

Try the following commands to train first-order CRF and second-order CRF2o models:

# LSTM
# CRF
python -u crf.py   train -b -c configs/conll05.crf.srl.lstm.char-lemma.ini   -d 0 -f char lemma -p exp/conll05.crf.srl.lstm.char-lemma/model
# CRF2o
python -u crf2o.py train -b -c configs/conll05.crf2o.srl.lstm.char-lemma.ini -d 0 -f char lemma -p exp/conll05.crf2o.srl.lstm.char-lemma/model
# BERT finetuning
# CRF
python -u crf.py   train -b -c configs/conll05.crf.srl.bert.ini   -d 0 -p exp/conll05.crf.srl.bert/model   --batch-size=1000 --encoder bert --bert bert-large-cased 
# CRF2o
python -u crf2o.py train -b -c configs/conll05.crf2o.srl.bert.ini -d 0 -p exp/conll05.crf2o.srl.bert/model --batch-size=1000 --encoder bert --bert bert-large-cased

To do evaluation:

# end-to-end
python -u crf.py   evaluate -c configs/conll05.crf.srl.bert.ini  -d 0 -p exp/conll05.crf.srl.bert/model    --data data/conll05/test.conllu
# w/ gold predicates
python -u crf.py   evaluate -c configs/conll05.crf.srl.bert.ini  -d 0 -p exp/conll05.crf.srl.bert/model    --data data/conll05/test.conllu --prd

To make predictions:

python -u crf.py   predict  -c configs/conll05.crf.srl.bert.ini   -d 0 -p exp/conll05.crf.srl.bert/model   --data data/conll05/test.conllu --pred pred.conllu
bash scripts/eval.sh pred=pred.conllu gold=data/conll05/test.conllu

Contact

If you have any questions, feel free to contact me via emails.

Owner
Yu Zhang
PhD student @SUDA-LA; NLP/CL/ML.
Yu Zhang
A complete NLP guideline for enthusiasts

NLP-NINJA A complete guide for Natural Language Processing in Python Table of Contents S.No. Topic Level Meaning 1 Tokenization 🤍 Beginner 2 Stemming

MAINAK CHAUDHURI 22 Dec 27, 2022
A Fast Sequence Transducer Implementation with PyTorch Bindings

transducer A Fast Sequence Transducer Implementation with PyTorch Bindings. The corresponding publication is Sequence Transduction with Recurrent Neur

Awni Hannun 184 Dec 18, 2022
Mysticbbs-rjam - rJAM splitscreen message reader for MysticBBS A46+

rJAM splitscreen message reader for MysticBBS A46+

Robbert Langezaal 4 Nov 22, 2022
Pytorch NLP library based on FastAI

Quick NLP Quick NLP is a deep learning nlp library inspired by the fast.ai library It follows the same api as fastai and extends it allowing for quick

Agis pof 283 Nov 21, 2022
HAN2HAN : Hangul Font Generation

HAN2HAN : Hangul Font Generation

Changwoo Lee 36 Dec 28, 2022
Phomber is infomation grathering tool that reverse search phone numbers and get their details, written in python3.

A Infomation Grathering tool that reverse search phone numbers and get their details ! What is phomber? Phomber is one of the best tools available fo

S41R4J 121 Dec 27, 2022
Code for the Python code smells video on the ArjanCodes channel.

7 Python code smells This repository contains the code for the Python code smells video on the ArjanCodes channel (watch the video here). The example

55 Dec 29, 2022
Fast topic modeling platform

The state-of-the-art platform for topic modeling. Full Documentation User Mailing List Download Releases User survey What is BigARTM? BigARTM is a pow

BigARTM 633 Dec 21, 2022
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

LightSpeech UnOfficial PyTorch implementation of LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search.

Rishikesh (ऋषिकेश) 54 Dec 03, 2022
Script to generate VAD dataset used in Asteroid recipe

About the dataset LibriVAD is an open source dataset for voice activity detection in noisy environments. It is derived from LibriSpeech signals (clean

11 Sep 15, 2022
An implementation of the Pay Attention when Required transformer

Pay Attention when Required (PAR) Transformer-XL An implementation of the Pay Attention when Required transformer from the paper: https://arxiv.org/pd

7 Aug 11, 2022
A full spaCy pipeline and models for scientific/biomedical documents.

This repository contains custom pipes and models related to using spaCy for scientific documents. In particular, there is a custom tokenizer that adds

AI2 1.3k Jan 03, 2023
Code for the paper "A Simple but Tough-to-Beat Baseline for Sentence Embeddings".

Code for the paper "A Simple but Tough-to-Beat Baseline for Sentence Embeddings".

1.1k Dec 27, 2022
This is my reading list for my PhD in AI, NLP, Deep Learning and more.

This is my reading list for my PhD in AI, NLP, Deep Learning and more.

Zhong Peixiang 156 Dec 21, 2022
Simple GUI where you can enter an article and get a crisp summarized version.

Text-Summarization-using-TextRank-BART Simple GUI where you can enter an article and get a crisp summarized version. How to run: Clone the repo Instal

Rohit P 4 Sep 28, 2022
MASS: Masked Sequence to Sequence Pre-training for Language Generation

MASS: Masked Sequence to Sequence Pre-training for Language Generation

Microsoft 1.1k Dec 17, 2022
SpikeX - SpaCy Pipes for Knowledge Extraction

SpikeX is a collection of pipes ready to be plugged in a spaCy pipeline. It aims to help in building knowledge extraction tools with almost-zero effort.

Erre Quadro Srl 384 Dec 12, 2022
The RWKV Language Model

RWKV-LM We propose the RWKV language model, with alternating time-mix and channel-mix layers: The R, K, V are generated by linear transforms of input,

PENG Bo 877 Jan 05, 2023
LSTM model - IMDB review sentiment analysis

NLP - Movie review sentiment analysis The colab notebook contains the code for building a LSTM Recurrent Neural Network that gives 87-88% accuracy on

Sundeep Bhimireddy 1 Jan 29, 2022
2021搜狐校园文本匹配算法大赛baseline

sohu2021-baseline 2021搜狐校园文本匹配算法大赛baseline 简介 分享了一个搜狐文本匹配的baseline,主要是通过条件LayerNorm来增加模型的多样性,以实现同一模型处理不同类型的数据、形成不同输出的目的。 线下验证集F1约0.74,线上测试集F1约0.73。

苏剑林(Jianlin Su) 45 Sep 06, 2022