INSPIRED: A Transparent Dialogue Dataset for Interactive Semantic Parsing

Existing studies on semantic parsing focus primarily on mapping a natural-language utterance to a corresponding logical form in a single turn. Because natural language is often ambiguous and highly variable, single-turn parsing is difficult to get right. In this work, we investigate an interactive semantic parsing framework that explains the predicted logical form step by step in natural language and lets the user correct individual steps through natural-language feedback. We focus on question answering over knowledge bases (KBQA) as an instantiation of our framework, aiming to increase the transparency of the parsing process and help the user appropriately trust the final answer. To do so, we construct INSPIRED, a crowdsourced dialogue dataset derived from the ComplexWebQuestions (CWQ) dataset.

This repository will contain the dataset and code for our paper "Towards Transparent Interactive Semantic Parsing via Step-by-Step Correction".

Data

Dataset Download

The dataset is available in this repository at the following path: ./data/dataset.jsonl

Data Structure

In the dataset file, each line is a JSON object with the following keys:

{
    "id": "ID number",
    "cwq_question": "Original complex question in the CWQ dataset",
    "rephrased_question": "Complex question as rephrased by workers",
    "rephrased_question_label": "'Replacement' or 'Alternative'",
    "question": "Same as rephrased_question if rephrased_question_label is 'Replacement'; otherwise, same as cwq_question",
    "final_answer": "Final answer to the complex question",
    "gold_parse": "Gold SPARQL query for the complex question",
    "preprocessed_gold_parse": "Preprocessed gold parse with entities and prefixes replaced",
    "predicted_parse": "SPARQL query predicted by the initial semantic parser",
    "gold_sub_lfs": "A list of gold sub-logical forms after decomposition",
    "pred_sub_lfs": "A list of predicted sub-logical forms after decomposition",
    "gold_sub_qs": [
        {
            "sub_id": "ID of the sub-question",
            "sub_question": "Rephrased sub-question",
            "temp_sub_question": "Templated sub-question for the gold sub-logical form",
            "answer": "Answer to the sub-question"
        }, "..."],
    "pred_sub_qs": [
        {
            "sub_id": "ID of the sub-question",
            "sub_question": "Rephrased sub-question",
            "temp_sub_question": "Templated sub-question for the predicted sub-logical form",
            "answer": "Answer to the sub-question"
        }, "..."],
    "feedback": "A list of human feedback"
}
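
As a rough illustration of how this structure might be consumed, the following Python sketch reads one JSON object per line and re-derives the "question" field from the rule described above. The helper names (parse_jsonl_lines, iter_records, resolve_question) are hypothetical and not part of this repository, and the sketch assumes the label is stored as the plain string "Replacement" or "Alternative", possibly with surrounding whitespace.

```python
import json
from typing import Iterable, Iterator


def parse_jsonl_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line, assuming each line is a JSON object."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


def iter_records(path: str) -> Iterator[dict]:
    """Stream records from a JSONL file such as ./data/dataset.jsonl."""
    with open(path, encoding="utf-8") as f:
        yield from parse_jsonl_lines(f)


def resolve_question(record: dict) -> str:
    """Re-derive the "question" field from the documented rule.

    Assumption: the label is the plain string 'Replacement' or 'Alternative';
    surrounding whitespace is stripped defensively.
    """
    label = str(record.get("rephrased_question_label", "")).strip()
    if label == "Replacement":
        return record["rephrased_question"]
    return record["cwq_question"]
```

Under these assumptions, iterating with `for record in iter_records("./data/dataset.jsonl")` and comparing `resolve_question(record)` against `record["question"]` would be one way to sanity-check a downloaded copy of the file.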
