Code for ACL 21: Generating Query Focused Summaries from Query-Free Resources

Overview


This repository releases the code for Generating Query Focused Summaries from Query-Free Resources.

Please cite the following paper if you use this code:

Xu, Yumo, and Mirella Lapata. "Generating Query Focused Summaries from Query-Free Resources." In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 6096–6109. 2021.

The availability of large-scale datasets has driven the development of neural models that create generic summaries from single or multiple documents. In this work we consider query focused summarization (QFS), a task for which training data in the form of queries, documents, and summaries is not readily available. We propose to decompose QFS into (1) query modeling (i.e., finding supportive evidence within a set of documents for a query) and (2) conditional language modeling (i.e., summary generation). We introduce MaRGE, a Masked ROUGE Regression framework for evidence estimation and ranking which relies on a unified representation for summaries and queries, so that summaries in generic data can be converted into proxy queries for learning a query model. Experiments across QFS benchmarks and query types show that our model achieves state-of-the-art performance despite learning from weak supervision.

If you have any questions, please contact me at [email protected].

Preliminary setup

Project structure

marge
└───requirements.txt
└───README.md
└───log        # logging files
└───run        # scripts for MaRGE training
└───src        # source files
└───data       # generic data for training; qfs data for test/dev
└───graph      # graph components for query expansion
└───model      # MaRGE models for inference
└───rank       # ranking results
└───text       # summarization results
└───unilm_in   # input files to UniLM
└───unilm_out  # output files from UniLM

After cloning this project, use the following command to initialize the structure:

mkdir log data graph model rank text unilm_in unilm_out

Creating environment

cd ..
virtualenv -p python3.6 marge
cd marge
. bin/activate
pip install -r requirements.txt

You need to install apex:

cd ..
git clone https://www.github.com/nvidia/apex
cd apex
python3 setup.py install

You also need to set up ROUGE evaluation if you have not done so already. Please refer to this repository. After finishing the setup, specify the ROUGE path in frame/utils/config_loader.py as an attribute of PathParser:

self.rouge_dir = '~/ROUGE-1.5.5/data'  # specify your ROUGE dir
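A path starting with `~` is not expanded automatically by every tool, so it can help to resolve and verify the directory before running evaluation. The following is a minimal sketch, not code from this repository: the helper name `resolve_rouge_dir` is hypothetical, and whether PathParser expands `~` itself is not stated here.

```python
import os


def resolve_rouge_dir(rouge_dir="~/ROUGE-1.5.5/data"):
    """Expand '~' and return the absolute ROUGE data directory.

    Hypothetical helper for illustration; it is not part of the MaRGE code base.
    Raises FileNotFoundError if the directory does not exist.
    """
    path = os.path.abspath(os.path.expanduser(rouge_dir))
    if not os.path.isdir(path):
        raise FileNotFoundError(f"ROUGE data directory not found: {path}")
    return path
```

The returned value could then be assigned to `self.rouge_dir`, assuming an absolute path is acceptable there.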

Preparing benchmark data

We are not allowed to distribute DUC clusters and summaries, but you can request DUC 2005-2007 from NIST. After acquiring the data, gather each year's clusters and summaries under data/duc_cluster and data/duc_summary, respectively. For instance, DUC 2006's clusters and summaries should be found under data/duc_cluster/2006/ and data/duc_summary/2006/, respectively. You do not need to prepare DUC queries yourself: we provide three JSON files for DUC 2005-2007 under data/masked_query, each containing a raw query and a masked query for every cluster. Queries will be fetched from these files at test time.

TD-QFS data can be downloaded from here. You can also use the processed version here.

After data preparation, you should have the following directory structure with the right files under each folder:

marge
└───data
│   └───duc_clusters   # DUC clusters 
│   └───duc_summaries  # DUC reference summaries 
│   └───masked_query   # DUC queries (raw and masked)
│   └───tdqfs          # TD-QFS clusters, queries and reference summaries
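Before training or evaluation, it may be worth checking that the folders listed above are in place. The sketch below is an illustration only: the function name `missing_data_dirs` is hypothetical, and the folder names are taken from the tree above.

```python
from pathlib import Path

# Folder names copied from the directory tree above
EXPECTED_DATA_DIRS = ("duc_clusters", "duc_summaries", "masked_query", "tdqfs")


def missing_data_dirs(project_root="."):
    """Return the expected sub-folders of <project_root>/data that do not exist."""
    data_root = Path(project_root) / "data"
    return [name for name in EXPECTED_DATA_DIRS if not (data_root / name).is_dir()]
```

Calling `missing_data_dirs(".")` from the project root returns an empty list when every folder is present.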

MaRGE: query modeling

Preparing training data

Source files for building training data are under src/sripts. For each dataset (Multi-News or CNN/DM), there are three steps to create MaRGE training data.

A training sample for MaRGE can be represented as {sentence, masked summary}->ROUGE(sentence, summary). So we need to compute the ROUGE scores for all sentences (step 1) and create masked summaries (step 2). Then we put them together (step 3).

  1. Calculate ROUGE scores for all sentences:
python src/sripts/dump_sentence_rouge_mp.py
  2. Build masked summaries:
python src/sripts/mask_summary_with_ratio.py
  3. Build train/val/test datasets:
python src/sripts/build_marge_dataset_mn.py
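To make the sample format {sentence, masked summary}->ROUGE(sentence, summary) concrete, here is a minimal sketch. It rests on several assumptions: the score is a simplified unigram recall rather than the official ROUGE toolkit used by the scripts above, the masking rule (replace a given set of words) is only a stand-in for the repository's masking procedure, and all function names and the `[MASK]` token are hypothetical.

```python
from collections import Counter


def unigram_rouge_recall(sentence, summary):
    """Simplified ROUGE-1 recall: clipped unigram overlap / number of summary tokens."""
    sent_counts = Counter(sentence.lower().split())
    summ_counts = Counter(summary.lower().split())
    if not summ_counts:
        return 0.0
    overlap = sum((sent_counts & summ_counts).values())
    return overlap / sum(summ_counts.values())


def mask_summary(summary, words_to_mask, mask_token="[MASK]"):
    """Replace the chosen words with a mask token (illustrative masking rule)."""
    targets = {w.lower() for w in words_to_mask}
    return " ".join(
        mask_token if tok.lower() in targets else tok for tok in summary.split()
    )


def build_sample(sentence, summary, words_to_mask):
    """Pair a sentence with a masked summary; the target uses the unmasked summary."""
    return {
        "sentence": sentence,
        "masked_summary": mask_summary(summary, words_to_mask),
        "rouge": unigram_rouge_recall(sentence, summary),
    }


sample = build_sample("the cat sat on the mat", "a cat sat down", ["cat"])
print(sample)
```

Note that the regression target is computed against the original summary, while the model input contains only the masked version.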

In our experiments, MaRGE trained on data from Multi-News yielded the best performance in query modeling. If you want to build training data from CNN/DM:

  1. Use the function gathered_mp_dump_sentence_cnndm() in the first step (otherwise, use the function gathered_mp_dump_sentence_mn())
  2. Set dataset='cnndm' in the second step (otherwise, dataset='mn')
  3. Use build_marge_dataset_cnndm.py instead for the last step

Model training

Depending on which training data you have built, you can run either one of the following two scripts:

. ./run/run_rr_cnndm.sh   # train MaRGE with data from CNN/DM
. ./run/run_rr_mn.sh  # train MaRGE with data from Multi-News

Configs specified in these two files are used in our experiments, but feel free to change them for further experimentation.

Inference and evaluation

Use src/frame/rr/main.py for DUC evaluation and src/frame/rr/main_tdqfs.py for TD-QFS evaluation. We take DUC evaluation as an example.

In src/frame/rr/main.py, run the following methods in order (or at once):

init()
dump_rel_scores()  # inference with MaRGE
rel_scores2rank()  # turn sentence scores to sentence rank
rr_rank2records()  # take top sentences
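Conceptually, the last two steps turn per-sentence relevance scores into a ranking and keep the top sentences. The sketch below illustrates that idea only; it is not the implementation behind `rel_scores2rank()` or `rr_rank2records()`, and the function names, the dictionary-based data layout, and the fixed `top_k` cut-off are assumptions.

```python
def scores_to_rank(scores):
    """Return sentence ids sorted by descending score (ties keep insertion order)."""
    return sorted(scores, key=lambda sid: scores[sid], reverse=True)


def top_records(scores, sentences, top_k):
    """Return (id, text, score) triples for the top_k highest-scoring sentences."""
    rank = scores_to_rank(scores)
    return [(sid, sentences[sid], scores[sid]) for sid in rank[:top_k]]


scores = {"s0": 0.12, "s1": 0.57, "s2": 0.33}
sentences = {"s0": "First sentence.", "s1": "Second sentence.", "s2": "Third sentence."}
for record in top_records(scores, sentences, top_k=2):
    print(record)
```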

To evaluate evidence rank, in src/frame/rr/main.py, run:

select_e2e()

MaRGESum: summary generation

Prepare training data from Multi-News

To train a controllable generator, we make the following three changes to the input from Multi-News (and CNN/DM):

  1. Re-order input sentences according to their ROUGE scores, so that the generator is biased towards the top-ranked ones:
python scripts/selector_for_train.py
  2. Prepend a summary-length token
  3. Prepend a masked summary (UMR-S)
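The three changes above can be sketched as a single input-building function. This is an illustration under assumptions, not the repository's code: the function name, the `[LEN=...]` token format, the separator, and the relative order of the masked summary and the length token are all hypothetical.

```python
def build_generator_input(sentences, rouge_scores, masked_summary, summary_len,
                          len_token_template="[LEN={}]", sep=" "):
    """Assemble one generator input string.

    1. Re-order sentences by descending ROUGE score.
    2. Prepend a summary-length token (token format is an assumption).
    3. Prepend the masked summary (placement is an assumption).
    """
    ranked = [
        sent
        for sent, _ in sorted(
            zip(sentences, rouge_scores), key=lambda pair: pair[1], reverse=True
        )
    ]
    length_token = len_token_template.format(summary_len)
    return sep.join([masked_summary, length_token] + ranked)


print(build_generator_input(
    sentences=["a b", "c d", "e f"],
    rouge_scores=[0.1, 0.9, 0.5],
    masked_summary="[MASK] x",
    summary_len=250,
))
```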

Prepare training data from CNN/DM

Our best generation result is obtained with CNN/DM data. To train MaRGESum on CNN/DM data, apart from the three customizations mentioned above, we need an extra step: build a multi-document version of CNN/DM.

This is mainly because the summaries in the original CNN/DM are fairly short, while testing on QFS requires 250 words as output. To fix this issue, we concatenate summaries from a couple of relevant samples to get a long enough summary. Therefore, the input is now a cluster of the documents from these relevant samples.

This involves using DrQA to index all summaries in CNN/DM. After indexing, you can use the following script to cluster samples by retrieving similar summaries:

python scripts/build_cnndm_clusters.py
  • TODO: upload the training data, so that you can use this multi-document CNN/DM without building it from scratch.
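The clustering idea described above (concatenate summaries of related samples until the summary is long enough, and use the corresponding documents as the input cluster) can be sketched as follows. This is a simplified illustration, not what `build_cnndm_clusters.py` does: word-level Jaccard similarity stands in for retrieval over an index, and the function names, the greedy stopping rule, and the sample layout are assumptions.

```python
def jaccard(text_a, text_b):
    """Word-level Jaccard similarity; a simple stand-in for index-based retrieval."""
    set_a, set_b = set(text_a.lower().split()), set(text_b.lower().split())
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def build_cluster(seed_idx, samples, min_summary_words=250):
    """Greedily add the most similar samples until the joint summary is long enough.

    samples: list of dicts with 'document' and 'summary' keys (assumed layout).
    """
    seed_summary = samples[seed_idx]["summary"]
    neighbours = sorted(
        (i for i in range(len(samples)) if i != seed_idx),
        key=lambda i: jaccard(seed_summary, samples[i]["summary"]),
        reverse=True,
    )
    members = [seed_idx]
    n_words = len(seed_summary.split())
    for i in neighbours:
        if n_words >= min_summary_words:
            break
        members.append(i)
        n_words += len(samples[i]["summary"].split())
    return {
        "documents": [samples[i]["document"] for i in members],
        "summary": " ".join(samples[i]["summary"] for i in members),
    }
```

With this rule, a cluster may still end up shorter than `min_summary_words` if the collection runs out of samples.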

Inference and evaluation

Setting up UniLM environment

To evaluate abstractive summarization, you need to set up a UniLM environment following the instructions here.

After setting up UniLM, in src/frame/rr/main.py, run:

build_unilm_input(src='rank')

This turns ranked evidence from MaRGE into MaRGESum input files.

Now you can evaluate the trained UniLM model for development and testing. Go to the UniLM project root, set the correct input directory, and decode the summaries.

  • TODO: add detailed documentation for setting up UniLM.
  • TODO: add detailed documentation for decoding.

To evaluate the output, use the following function in src/frame/rr/main.py:

eval_unilm_out()

You can specify inference configs in src/frame/rr/rr_config.py.

Owner
Yumo Xu
PhD student @EdinburghNLP.