Source code for paper "Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling", AAAI 2021

Related tags

Deep LearningATLOP
Overview

ATLOP

Code for AAAI 2021 paper Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling.

If you make use of this code in your work, please kindly cite the following paper:

@inproceedings{zhou2021atlop,
	title={Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling},
	author={Zhou, Wenxuan and Huang, Kevin and Ma, Tengyu and Huang, Jing},
	booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
	year={2021}
}

Requirements

  • Python (tested on 3.7.4)
  • CUDA (tested on 10.2)
  • PyTorch (tested on 1.7.0)
  • Transformers (tested on 3.4.0)
  • numpy (tested on 1.19.4)
  • apex (tested on 0.1)
  • opt-einsum (tested on 3.3.0)
  • wandb
  • ujson
  • tqdm

Dataset

The DocRED dataset can be downloaded following the instructions at link. The CDR and GDA datasets can be obtained following the instructions in edge-oriented graph. The expected structure of files is:

ATLOP
 |-- dataset
 |    |-- docred
 |    |    |-- train_annotated.json        
 |    |    |-- train_distant.json
 |    |    |-- dev.json
 |    |    |-- test.json
 |    |-- cdr
 |    |    |-- train_filter.data
 |    |    |-- dev_filter.data
 |    |    |-- test_filter.data
 |    |-- gda
 |    |    |-- train.data
 |    |    |-- dev.data
 |    |    |-- test.data
 |-- meta
 |    |-- rel2id.json

Training and Evaluation

DocRED

Train the BERT model on DocRED with the following command:

>> sh scripts/run_bert.sh  # for BERT
>> sh scripts/run_roberta.sh  # for RoBERTa

The training loss and evaluation results on the dev set are synced to the wandb dashboard.

The program will generate a test file result.json in the official evaluation format. You can compress and submit it to Colab for the official test score.

CDR and GDA

Train CDA and GDA model with the following command:

>> sh scripts/run_cdr.sh  # for CDR
>> sh scripts/run_gda.sh  # for GDA

The training loss and evaluation results on the dev and test set are synced to the wandb dashboard.

Saving and Evaluating Models

You can save the model by setting the --save_path argument before training. The model correponds to the best dev results will be saved. After that, You can evaluate the saved model by setting the --load_path argument, then the code will skip training and evaluate the saved model on benchmarks. I've also released the trained atlop-bert-base and atlop-roberta models.

Comments
  • The results of ATLOP based on the bert-base-cased model on the DocRED dataset

    The results of ATLOP based on the bert-base-cased model on the DocRED dataset

    Hello, I retrained ATLOP based on the bert-base-cased model on the DocRED dataset. However, the max F1 and F1_ign score on the dev dataset is 58.81 and 57.09, respectively. However, these scores are much lower than the reported score in your paper (61.09, 59.22). Is the default model config correct? My environment is as follows: Best regards

    Python 3.7.8
    PyTorch 1.4.0
    Transformers 3.3.1
    apex 0.1
    opt-einsum 3.3.0
    
    opened by donghaozhang95 11
  • The main purpose of the function: get_label

    The main purpose of the function: get_label

    Hi @wzhouad ,

    Thanks so much for releasing your source code. I only wonder about the main purpose of the function get_label() in the file losses.py in calculating the final loss. Could you please explain it? Thanks for your help!

    opened by angelotran05 5
  • model.py

    model.py

    When I run train.py, there is an err in model.py:

    line 45, in get_hrt e_att.append(attention[i, :, start + offset])
    IndexError: too many indices for tensor of dimension 1

    Thanks.

    opened by qiunlp 5
  • Mention embedding

    Mention embedding

    Hi there, thanks for your nice work. I'm a bit confused that in the function get_hrt(), do you use the embedding of the first subword token as the mention embedding instead of summing up all the wordpieces? So the offset used here is due to the insertion of especial token "*" ? Please correct me if I'm wrong, thanks!

    opened by mk2x15 4
  • about the labels

    about the labels

    I see there a line of code before output the loss that is if labels is not None: labels = [torch.tensor(label) for label in labels] labels = torch.cat(labels, dim=0).to(logits) loss = self.loss_fnt(logits.float(), labels.float()) output = (loss.to(sequence_output),) + output

    and i also tried why sometimes the label could be none??? am I got something wrong?

    opened by ChristopherAmadeusMiao 4
  • The best results of same random seed are different at each time  when I trained the ATLOP

    The best results of same random seed are different at each time when I trained the ATLOP

    Hello I trained the ATLOP with same random seed=66 every time, but the final best result are different. Have you met the same situation before? thank you for your replying.

    opened by Lanyu123 4
  • Any plans to release the codes for CDR?

    Any plans to release the codes for CDR?

    Hello Zhou

    Thank you for releasing the codes of your work. In your paper, it has the experiment results on CDR. I want to reproduce the performance using the CDR dataset on your approach. Do you have any plans to release the codes for CDR?

    opened by mjeensung 4
  • About the process_long_input.py

    About the process_long_input.py

    I got the error, could you help me ? thank you!

    Traceback (most recent call last): File "train.py", line 228, in main() File "train.py", line 216, in main train(args, model, train_features, dev_features, test_features) File "train.py", line 74, in train finetune(train_features, optimizer, args.num_train_epochs, num_steps) File "train.py", line 38, in finetune outputs = model(**inputs) File "D:\Anaconda\envs\pytorch-GPU\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "D:\code\ATLOP\model.py", line 95, in forward sequence_output, attention = self.encode(input_ids, attention_mask) File "D:\code\ATLOP\model.py", line 32, in encode sequence_output, attention = process_long_input(self.model, input_ids, attention_mask, start_tokens, end_tokens) File "D:\code\ATLOP\long_seq.py", line 17, in process_long_input output_attentions=True, File "D:\Anaconda\envs\pytorch-GPU\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) TypeError: forward() got an unexpected keyword argument 'output_attentions'

    opened by MingYang1127 3
  • Can you please release trained model?

    Can you please release trained model?

    Hi. Thank you for releasing the codes of your model, it is really helpful.

    However I tried to retrain ATLOP based on the bert-base-cased model on the DocRED dataset but I can't get high result as your result on the paper. And I can't retrain roberta-large model because I don't have strong enough GPU (strongest GPU on Google Colab is V100). So can you please release your trained model. I would be very very happy if you can release your model, and I believe that it can help many other people, too.

    Thank you so much.

    opened by nguyenhuuthuat09 3
  • Where did the

    Where did the "/meta/rel2id.json" come from?

    I only want to use DocRED dataset,and there is only "rel_info.json" in it. Could you please tell me how can I get rel2id.json?I try to rename rel_info.json to rel2id.json but ValueError: invalid literal for int() with base 10: 'headquarters location' occured in File "train.py", line 197, in main train_features = read(train_file, tokenizer, max_seq_length=args.max_seq_length) File "/home/kw/ATLOP/prepro.py", line 56, in read_docred r = int(docred_rel2id[label['r']]) Thanks for your attention,I'm waiting for your reply.

    opened by AQA6666 2
  • How should I be running the Enhanced BERT Baseline model?

    How should I be running the Enhanced BERT Baseline model?

    Hi. I recently tried to run the Enhanced BERT Baseline model (i.e., without adaptive threshold loss and local contextualized pooling) and just wanted to confirm if I'm doing it right.

    Basically, in model.py lines 86-111 (i.e., the forward method) I modified the code so that I don't use rs and changed self.head_extractor and self.tail_extractor to have in_features and out_features accordingly. I did this because I'm assuming that within the get_hrt method, rs is what LOP is since we're using attention there. Modifying the extractors also implies that I'm not concatenating hs and ts with rs.

    After that I changed loss_fnt to be a simple nn.BCEWithLogitsLoss rather than ATLoss. That means I also changed the get_label method within ATLoss to be a function so that I'm not depending on the class.

    Am I doing this right? Or is there another way that I should be implementing it?

    The reason why I'm suspicious as to whether I implemented this correctly or not is because I'm currently running the code on the TACRED dataset rather than the DocRED dataset, and while ATLOP itself shows satisfactory performance the performance of the Enhanced BERT Baseline is much lower.

    Thanks.

    opened by seanswyi 2
  • The usage of the ATLoss

    The usage of the ATLoss

    Thanks for your amazing work! I am very interested in the ATLoss, but there is a little question I want to ask. When using the ATLoss, should we add a no-relation label? For example, there are 26 relation types, the gold labels may contain multiple relation types, but at least one relation type. How to represent the no-relation? Show I create a tensor of size 27 and set the first label 1 or a tensor of size 26 and set all the labels zero? Look forward to your reply. Many Thanks,

    opened by Onion12138 0
  • --save_path issue

    --save_path issue

    I edit the script file and add --save_path followed by the directory. I can't see any saved models after running the script. Could you please explain how to save a model in detail?

    opened by rijukandathil 0
Owner
Wenxuan Zhou
Ph.D. student at University of Southern California
Wenxuan Zhou
rliable is an open-source Python library for reliable evaluation, even with a handful of runs, on reinforcement learning and machine learnings benchmarks.

Open-source library for reliable evaluation on reinforcement learning and machine learning benchmarks. See NeurIPS 2021 oral for details.

Google Research 529 Jan 01, 2023
CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images

Code and result about CCAFNet(IEEE TMM) 'CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images' IEE

zyrant丶 14 Dec 29, 2021
Python Tensorflow 2 scripts for detecting objects of any class in an image without knowing their label.

Tensorflow-Mobile-Generic-Object-Localizer Python Tensorflow 2 scripts for detecting objects of any class in an image without knowing their label. Ori

Ibai Gorordo 11 Nov 15, 2022
一个多模态内容理解算法框架,其中包含数据处理、预训练模型、常见模型以及模型加速等模块。

Overview 架构设计 插件介绍 安装使用 框架简介 方便使用,支持多模态,多任务的统一训练框架 能力列表: bert + 分类任务 自定义任务训练(插件注册) 框架设计 框架采用分层的思想组织模型训练流程。 DATA 层负责读取用户数据,根据 field 管理数据。 Parser 层负责转换原

Tencent 265 Dec 22, 2022
PyTorch code for the paper "Complementarity is the King: Multi-modal and Multi-grained Hierarchical Semantic Enhancement Network for Cross-modal Retrieval".

Complementarity is the King: Multi-modal and Multi-grained Hierarchical Semantic Enhancement Network for Cross-modal Retrieval (M2HSE) PyTorch code fo

Xinlei-Pei 6 Dec 23, 2022
I3-master-layout - Simple master and stack layout script

Simple master and stack layout script | ------ | ----- | | | | | Ma

Tobias S 18 Dec 05, 2022
Pytorch implementation of "Get To The Point: Summarization with Pointer-Generator Networks"

About this repository This repo contains an Pytorch implementation for the ACL 2017 paper Get To The Point: Summarization with Pointer-Generator Netwo

wxDai 7 Oct 14, 2022
Official project website for the CVPR 2021 paper "Exploring intermediate representation for monocular vehicle pose estimation"

EgoNet Official project website for the CVPR 2021 paper "Exploring intermediate representation for monocular vehicle pose estimation". This repo inclu

Shichao Li 138 Dec 09, 2022
Yolov5-opencv-cpp-python - Example of using ultralytics YOLO V5 with OpenCV 4.5.4, C++ and Python

yolov5-opencv-cpp-python Example of performing inference with ultralytics YOLO V

183 Jan 09, 2023
Split your patch similarly to `git add -p` but supporting multiple buckets

split-patch.py This is git add -p on steroids for patches. Given a my.patch you can run ./split-patch.py my.patch You can choose in which bucket to p

102 Oct 06, 2022
Multi-label Co-regularization for Semi-supervised Facial Action Unit Recognition (NeurIPS 2019)

MLCR This is the source code for paper Multi-label Co-regularization for Semi-supervised Facial Action Unit Recognition. Xuesong Niu, Hu Han, Shiguang

Edson-Niu 60 Nov 29, 2022
git《Commonsense Knowledge Base Completion with Structural and Semantic Context》(AAAI 2020) GitHub: [fig1]

Commonsense Knowledge Base Completion with Structural and Semantic Context Code for the paper Commonsense Knowledge Base Completion with Structural an

AI2 96 Nov 05, 2022
Empowering journalists and whistleblowers

Onymochat Empowering journalists and whistleblowers Onymochat is an end-to-end encrypted, decentralized, anonymous chat application. You can also host

Samrat Dutta 19 Sep 02, 2022
This code is for our paper "VTGAN: Semi-supervised Retinal Image Synthesis and Disease Prediction using Vision Transformers"

ICCV Workshop 2021 VTGAN This code is for our paper "VTGAN: Semi-supervised Retinal Image Synthesis and Disease Prediction using Vision Transformers"

Sharif Amit Kamran 25 Dec 08, 2022
Dcf-game-infrastructure-public - Contains all the components necessary to run a DC finals (attack-defense CTF) game from OOO

dcf-game-infrastructure All the components necessary to run a game of the OOO DC

Order of the Overflow 46 Sep 13, 2022
一个目标检测的通用框架(不需要cuda编译),支持Yolo全系列(v2~v5)、EfficientDet、RetinaNet、Cascade-RCNN等SOTA网络。

一个目标检测的通用框架(不需要cuda编译),支持Yolo全系列(v2~v5)、EfficientDet、RetinaNet、Cascade-RCNN等SOTA网络。

Haoyu Xu 203 Jan 03, 2023
Code for You Only Cut Once: Boosting Data Augmentation with a Single Cut

You Only Cut Once (YOCO) YOCO is a simple method/strategy of performing augmenta

88 Dec 28, 2022
Equivariant GNN for the prediction of atomic multipoles up to quadrupoles.

Equivariant Graph Neural Network for Atomic Multipoles Description Repository for the Model used in the publication 'Learning Atomic Multipoles: Predi

16 Nov 22, 2022
This app is a simple example of using Strealit to create a financial data web app.

Streamlit Demo: Finance Chart This app is a simple example of using Streamlit to create a financial data web app. This demo use streamlit, pandas and

91 Jan 02, 2023
Awesome Deep Graph Clustering is a collection of SOTA, novel deep graph clustering methods

ADGC: Awesome Deep Graph Clustering ADGC is a collection of state-of-the-art (SOTA), novel deep graph clustering methods (papers, codes and datasets).

yueliu1999 297 Dec 27, 2022