Implementation for the EMNLP 2021 paper "Interactive Machine Comprehension with Dynamic Knowledge Graphs".

Overview

Interactive Machine Comprehension with Dynamic Knowledge Graphs


Implementation for the EMNLP 2021 paper.

Dependencies

apt-get -y update
apt-get install -y unzip zip parallel
conda create -p /tmp/imrc python=3.6 numpy scipy cython nltk
conda activate /tmp/imrc
pip install --upgrade pip
pip install numpy==1.16.2
pip install gym==0.15.4
pip install tqdm pipreqs pyyaml pytz visdom
conda install pytorch torchvision cudatoolkit=9.2 -c pytorch
pip install transformers
pip install allennlp

Data Preparation

Split SQuAD 1.1 and preprocess

The original SQuAD dataset does not provide its test set, we take 23 wiki articles from its training set as our validation set. We then use the SQuAD dev set as our test set.

# download SQuAD from official website, then
python utils/split_original_squad.py

To speed up training, we parse (tokenization and SRL) the dataset in advance.

python utils/preproc_squad.py

This will result squad_split/processed_squad.1.1.split.[train/valid/test].json, which are used in iMRC tasks.

Preprocess Wikipedia data for self-supervised learning

python utils/get_wiki_filter_squad.py
python utils/split_wiki_data.py

This will result wiki_without_squad/wiki_without_squad_[train/valid/test].json, which are used to pre-train the continuous belief graph generator.

Training

To train the agent equipped with different types of graphs, run:

# without graph
python main.py configs/imrc_none.yaml

# co-occurrence graph
python main.py configs/imrc_cooccur.yaml

# relative position graph
python main.py configs/imrc_rel_pos.yaml

# SRL graph
python main.py configs/imrc_srl.yaml

# continuous belief graph
# in this setting, we need a pre-trained graph generator.
# we provide our pre-trained graph generator at
# https://drive.google.com/drive/folders/1zZ7C_-xaYsfg2Ms7_BO5n3Qzx69UqMKD?usp=sharing

# one can choose to train their own version by:
python pretrain_observation_infomax.py configs/pretrain_cont_bnelief.yaml
# then using the downloaded/saved model checkpoint
python main.py configs/imrc_cont_belief.yaml

To change the task settings/configurations:

general:
  naozi_capacity: 1  # capacity of agent's external memory queue (1, 3, 5)
  generate_or_point: "point"  # "qmpoint": q+o_t, "point": q, "generate": vocab
  disable_prev_next: False  # False: Easy Mode, True: Hard Mode

model:
  recurrent: True  # recurrent component described in Section 3.3 and Section 4.Additional Results

Citation

@inproceedings{Yuan2021imrc_graph,
  title={Interactive Machine Comprehension with Dynamic Knowledge Graphs},
  author={Xingdi Yuan},
  year={2021},
  booktitle="EMNLP",
}
Owner
Xingdi (Eric) Yuan
Senior Research Engineer at Microsoft Research, Montréal
Xingdi (Eric) Yuan
Using machine learning to predict and analyze high and low reader engagement for New York Times articles posted to Facebook.

How The New York Times can increase Engagement on Facebook Using machine learning to understand characteristics of news content that garners "high" Fa

Jessica Miles 0 Sep 16, 2021
Minimal implementation of Denoised Smoothing: A Provable Defense for Pretrained Classifiers in TensorFlow.

Denoised-Smoothing-TF Minimal implementation of Denoised Smoothing: A Provable Defense for Pretrained Classifiers in TensorFlow. Denoised Smoothing is

Sayak Paul 19 Dec 11, 2022
PyTorch implementation of Advantage async actor-critic Algorithms (A3C) in PyTorch

Advantage async actor-critic Algorithms (A3C) in PyTorch @inproceedings{mnih2016asynchronous, title={Asynchronous methods for deep reinforcement lea

LEI TAI 111 Dec 08, 2022
Code repository for "Free View Synthesis", ECCV 2020.

Free View Synthesis Code repository for "Free View Synthesis", ECCV 2020. Setup Install the following Python packages in your Python environment - num

Intelligent Systems Lab Org 253 Dec 07, 2022
A novel framework to automatically learn high-quality scanning of non-planar, complex anisotropic appearance.

appearance-scanner About This repository is an implementation of the neural network proposed in Free-form Scanning of Non-planar Appearance with Neura

Xiaohe Ma 14 Oct 18, 2022
Code of PVTv2 is released! PVTv2 largely improves PVTv1 and works better than Swin Transformer with ImageNet-1K pre-training.

Updates (2020/06/21) Code of PVTv2 is released! PVTv2 largely improves PVTv1 and works better than Swin Transformer with ImageNet-1K pre-training. Pyr

1.3k Jan 04, 2023
CLUES: Few-Shot Learning Evaluation in Natural Language Understanding

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding This repo contains the data and source code for baseline models in the NeurIPS 2

Microsoft 29 Dec 29, 2022
Modelisation on galaxy evolution using PEGASE-HR

model_galaxy Modelisation on galaxy evolution using PEGASE-HR This is a labwork done in internship at IAP directed by Damien Le Borgne (https://github

Adrien Anthore 1 Jan 14, 2022
PyTorch implementation of ARM-Net: Adaptive Relation Modeling Network for Structured Data.

A ready-to-use framework of latest models for structured (tabular) data learning with PyTorch. Applications include recommendation, CRT prediction, healthcare analytics, and etc.

48 Nov 30, 2022
Project Aquarium is a SUSE-sponsored open source project aiming at becoming an easy to use, rock solid storage appliance based on Ceph.

Project Aquarium Project Aquarium is a SUSE-sponsored open source project aiming at becoming an easy to use, rock solid storage appliance based on Cep

Aquarist Labs 73 Jul 21, 2022
Resources for the Ki testnet challenge

Ki Testnet Challenge This repository hosts ki-testnet-challenge. A set of scripts and resources to be used for the Ki Testnet Challenge What is the te

Ki Foundation 23 Aug 08, 2022
[CVPR 2022 Oral] Crafting Better Contrastive Views for Siamese Representation Learning

Crafting Better Contrastive Views for Siamese Representation Learning (CVPR 2022 Oral) 2022-03-29: The paper was selected as a CVPR 2022 Oral paper! 2

249 Dec 28, 2022
PyTorch implementation of Weak-shot Fine-grained Classification via Similarity Transfer

SimTrans-Weak-Shot-Classification This repository contains the official PyTorch implementation of the following paper: Weak-shot Fine-grained Classifi

BCMI 60 Dec 02, 2022
A machine learning project which can detect and predict the skin disease through image recognition.

ML-Project-2021 A machine learning project which can detect and predict the skin disease through image recognition. The dataset used for this is the H

Debshishu Ghosh 1 Jan 13, 2022
Code for the Paper "Diffusion Models for Handwriting Generation"

Code for the Paper "Diffusion Models for Handwriting Generation"

62 Dec 21, 2022
Implementation of 'X-Linear Attention Networks for Image Captioning' [CVPR 2020]

Introduction This repository is for X-Linear Attention Networks for Image Captioning (CVPR 2020). The original paper can be found here. Please cite wi

JDAI-CV 240 Dec 17, 2022
FairyTailor: Multimodal Generative Framework for Storytelling

FairyTailor: Multimodal Generative Framework for Storytelling

Eden Bens 172 Dec 30, 2022
The Wearables Development Toolkit - a development environment for activity recognition applications with sensor signals

Wearables Development Toolkit (WDK) The Wearables Development Toolkit (WDK) is a framework and set of tools to facilitate the iterative development of

Juan Haladjian 114 Nov 27, 2022
Genshin-assets - 👧 Public documentation & static assets for Genshin Impact data.

genshin-assets This repo provides easy access to the Genshin Impact assets, primarily for use on static sites. Sources Genshin Optimizer - An Artifact

Zerite Development 5 Nov 22, 2022
ADSPM: Attribute-Driven Spontaneous Motion in Unpaired Image Translation

ADSPM: Attribute-Driven Spontaneous Motion in Unpaired Image Translation This repository provides a PyTorch implementation of ADSPM. Requirements Pyth

24 Jul 24, 2022