Answer a series of contextually dependent questions as they may occur in natural human-to-human conversations.

Overview

SCAI-QReCC-21

  • Submission deadline: September 15, 2021 (extended from September 8, 2021)
  • Results announcement: September 30, 2021
  • Workshop presentations: October 8, 2021

Data

[Zenodo] [original]

File names here refer to the respective files hosted on [Zenodo].

The passage collection (passages.zip) is 27.5GB with 54M passages!

The input format for the task (scai-qrecc21-[toy,training,test]-questions[,-rewritten].json) is a JSON file:

[
  {
    "Conversation_no": <number>,
    "Turn_no": X,
    "Question": "<question text>"
  }, ...
]

Here, X is the number of the question within its conversation. Questions with the same Conversation_no are from the same conversation.

The questions-rewritten.json files contain human-rewritten questions that can be used by systems that do not want to participate in question rewriting.
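
As a minimal sketch of how the input format could be consumed, the following Python snippet groups question records by Conversation_no and orders them by Turn_no. The helper name group_by_conversation and the sample questions are made up for illustration, and the sketch assumes that Conversation_no and Turn_no are integers; in practice the records would be loaded from one of the questions files rather than from an inline string.

```python
import json
from collections import defaultdict


def group_by_conversation(records):
    """Group question records by Conversation_no, ordered by Turn_no."""
    conversations = defaultdict(list)
    for record in records:
        conversations[record["Conversation_no"]].append(record)
    for turns in conversations.values():
        turns.sort(key=lambda r: r["Turn_no"])
    return dict(conversations)


# Tiny inline sample in the documented format (the questions are made up).
sample = json.loads("""
[
  {"Conversation_no": 1, "Turn_no": 2, "Question": "Where was she born?"},
  {"Conversation_no": 1, "Turn_no": 1, "Question": "Who is Marie Curie?"},
  {"Conversation_no": 2, "Turn_no": 1, "Question": "What is TIRA?"}
]
""")

conversations = group_by_conversation(sample)
for number, turns in conversations.items():
    print(number, [t["Question"] for t in turns])
```

Keeping the turns of a conversation together and in order matters because later questions depend on the context of earlier ones.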

Submission

Register for the task using this form. We will then send you your TIRA login once it is ready.

The challenge is hosted on TIRA. Participants are encouraged to upload their code and run the evaluation on the VMs provided by the platform to ensure reproducibility of the results. It is also possible to upload the submission as a single JSON file.

The submission format for the task is a JSON file similar to the input. All Model_xxx fields are optional and can be omitted from the submission; for example, provide only Conversation_no, Turn_no and Model_answer to get the EM and F1 scores for the generated answers:

[
  {
    "Conversation_no": <number>,
    "Turn_no": X,
    "Model_rewrite": "<rewritten question>",
    "Model_passages": {
      "<passage ID>": <score>, ...
    },
    "Model_answer": "<answer text>"
  }, ...
]

Example: scai-qrecc21-naacl-baseline.zip

You can use the code of our simple baseline to get started.
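
A submission entry in this format could be assembled along the following lines. This is only a sketch: make_record is a hypothetical helper, the answer, rewrite and passage ID are placeholder values, and the snippet assumes that Model_passages maps each passage ID to a numeric score.

```python
import json


def make_record(conversation_no, turn_no, answer=None, rewrite=None, passages=None):
    """Build one submission entry; Model_* fields left as None are omitted."""
    record = {"Conversation_no": conversation_no, "Turn_no": turn_no}
    if rewrite is not None:
        record["Model_rewrite"] = rewrite
    if passages is not None:
        # Assumption: a mapping from passage ID to a numeric score.
        record["Model_passages"] = dict(passages)
    if answer is not None:
        record["Model_answer"] = answer
    return record


run = [
    # Answer-only entry, enough for the question answering scores.
    make_record(1, 1, answer="hypothetical answer text"),
    # Entry with a rewrite and retrieved passages but no answer.
    make_record(1, 2, rewrite="hypothetical rewritten question",
                passages={"hypothetical-passage-id": 12.5}),
]
serialized = json.dumps(run, indent=2)
print(serialized)
```

Omitting unused fields entirely, rather than writing empty strings, follows the statement above that all Model_xxx fields are optional.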

Software Submission

We recommend that participants upload (through SSH or RDP) their software/system to their dedicated TIRA virtual machine (assigned after registration), so that their runs can be reproduced and easily applied to different data (of the same format) in the future. The email sent to you after registration contains the credentials for the TIRA web interface and your VM. If you cannot connect to your VM, ensure it is powered on in the TIRA web interface.

Your software is expected to accept two arguments:

  • An input directory (named $inputDataset in TIRA) that contains the questions.json input file and the passages-index-anserini directory. The latter contains a full Anserini index of the passage collection. Note that you need to install openjdk-11-jdk-headless to use it. We may be able to add more such indices on request.
  • An output directory (named $outputDir in TIRA) into which your software needs to place the submission as run.json.
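
The two-argument interface described above might be implemented as in the following sketch. It is not the simple baseline: answer_question is a hypothetical stand-in for an actual system, the argument names input_dir and output_dir are chosen for illustration, and the Anserini index is not used. The demo at the end runs the entry point on a temporary directory with a made-up question so that the snippet is self-contained.

```python
import argparse
import json
import os
import tempfile


def answer_question(question):
    """Hypothetical placeholder for your system; returns a fixed string."""
    return "placeholder answer"


def run(input_dir, output_dir):
    """Read questions.json from input_dir and write run.json to output_dir."""
    with open(os.path.join(input_dir, "questions.json"), encoding="utf-8") as f:
        questions = json.load(f)
    results = [
        {
            "Conversation_no": q["Conversation_no"],
            "Turn_no": q["Turn_no"],
            "Model_answer": answer_question(q["Question"]),
        }
        for q in questions
    ]
    os.makedirs(output_dir, exist_ok=True)
    output_path = os.path.join(output_dir, "run.json")
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(results, f)
    return output_path


def main(argv=None):
    parser = argparse.ArgumentParser(description="Sketch of a TIRA entry point")
    parser.add_argument("input_dir", help="directory containing questions.json")
    parser.add_argument("output_dir", help="directory to write run.json into")
    args = parser.parse_args(argv)
    return run(args.input_dir, args.output_dir)


# Demo on a temporary directory with one made-up question.
with tempfile.TemporaryDirectory() as tmp:
    input_dir = os.path.join(tmp, "input")
    output_dir = os.path.join(tmp, "output")
    os.makedirs(input_dir)
    with open(os.path.join(input_dir, "questions.json"), "w", encoding="utf-8") as f:
        json.dump([{"Conversation_no": 1, "Turn_no": 1, "Question": "A made-up question?"}], f)
    output_path = main([input_dir, output_dir])
    with open(output_path, encoding="utf-8") as f:
        written = json.load(f)
print(written)
```

In a real script, the demo would be replaced by a plain call to main(), so that the two directories are taken from the command line, with $inputDataset and $outputDir passed as the two positional arguments in the command specified in TIRA.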

Install your software to your VM. Then go to the TIRA web interface and click "Add software". Specify the command to run your software (see the image for the simple baseline).

IMPORTANT: To ensure reproducibility, create a "Software" in the TIRA web interface for each parameter setting that you consider a submission to the challenge.

Click on "Run" to execute your software for the selected input dataset. While your system is running, your VM will not be accessible and will be detached from the internet (to ensure your software is fully installed in your virtual machine); afterwards, it will be restored to its state before the run. Since the test set is rather large (the simple baseline takes nearly 11 hours to complete), we highly recommend that you first test your software on the scai-qrecc21-toy-dataset-2021-07-20 input dataset, which contains only the first conversation (6 turns/questions). For the test dataset, send us a mail at [email protected] so that we unblind your results.

TIRA Interface: VM status and submission

Then go to the "Runs" section below and click on the blue (i)-icon of the software run to check the software output. You can also download the run from there.

NOTE: By submitting your software you retain full copyrights. You agree to grant us usage rights for evaluation of the corresponding data generated by your software. We agree not to share your software with a third party or use it for any purpose other than research.

Run Submission

You can upload a JSON file as a submission at https://www.tira.io/run-upload-scai-qrecc21.

TIRA Interface: VM status and submission

Please specify the name and a description of your run in the form. After a successful upload, the page will redirect you to the overview of all your submissions where you should evaluate your run to verify that your run is valid. At the "Runs" section, you can click on the blue (i)-icon to double-check your upload. You can also download the run from there.

Evaluation

[script]

Once you have run your software or uploaded your run, "Run" the evaluator on that run through the TIRA web interface (below the software; works out-of-the-box).

TIRA Interface: Evaluation

Then go to the "Runs" section below and click on the blue (i)-icon of the evaluator run to see your scores.

Ground truth

We use the QReCC paper annotations in the initial phase, and will update them with alternative answer spans and passages by pooling and crowdsourcing the relevance judgements over the results submitted by the challenge participants (similar to the TREC evaluation setup).

Metrics

We use the same metrics as the QReCC paper, but may add more for the final evaluation: ROUGE1-R for question rewriting, Mean Reciprocal Rank (MRR) for passage retrieval, and F1 and Exact Match for question answering.
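
To make the question answering and retrieval metrics concrete, here is a minimal sketch of Exact Match, token-level F1 and MRR. It is not the official evaluation script: the SQuAD-style normalization (lowercasing, stripping punctuation and English articles), whitespace tokenization, and ranking passages by descending score are all assumptions, and the official scores may differ in these details.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction, truth):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    truth_tokens = normalize(truth).split()
    if not pred_tokens or not truth_tokens:
        return float(pred_tokens == truth_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


def reciprocal_rank(scored_passages, relevant_ids):
    """scored_passages maps passage ID to score; higher scores rank first."""
    ranked = sorted(scored_passages, key=scored_passages.get, reverse=True)
    for rank, passage_id in enumerate(ranked, start=1):
        if passage_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """runs is a list of (scored_passages, relevant_ids) pairs."""
    return sum(reciprocal_rank(s, r) for s, r in runs) / len(runs)


# Made-up examples; passage IDs and answers are placeholders.
em = exact_match("The Eiffel Tower", "eiffel tower!")
f1 = f1_score("in Paris France", "Paris")
mrr = mean_reciprocal_rank([
    ({"p1": 3.0, "p2": 1.0}, {"p2"}),  # relevant passage at rank 2
    ({"p3": 2.0}, {"p3"}),             # relevant passage at rank 1
])
print(em, f1, mrr)
```

The reciprocal rank rewards only the first relevant passage, which is why the order implied by the scores in Model_passages matters more than the score values themselves.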

Baselines

We provide the following baselines for comparison:

  • scai-qrecc21-simple-baseline: BM25 baseline for passage retrieval using the original conversational questions without rewriting. We recommend using this code as boilerplate to kickstart your own submission using the VM.
  • scai-qrecc21-naacl-baseline: results for the end-to-end approach using supervised question rewriting and QA models reported in the QReCC paper (accepted at NAACL'21). This sample run is available on Zenodo as scai-qrecc21-naacl-baseline.zip.

Note that the baseline results differ from the ones reported in the paper since we made several corrections to the evaluation script and the ground truth annotations:

  • We excluded from the evaluation the samples for which the ground truth is missing (i.e., no relevant passages, no answer text, or no rewrite provided by the human annotators).
  • We removed 5,251 passage judgements that the heuristic had annotated as relevant for short answers with lengths <= 5, since these matches are often trivial and unrelated, e.g., the same noun phrase appearing in different contexts.

Resources

Some useful links to get you started on a new conversational open-domain QA system:

Conversational Passage Retrieval

Answer Generation

Passage Retrieval

Conversational Question Reformulation
