A Survey of Natural Language Generation in Task-Oriented Dialogue System (TOD): Recent Advances and New Frontiers

Overview

This repository contains a carefully and comprehensively organized list of papers, open-source code, datasets, and leaderboards in the NLG field. If you find any errors, please don't hesitate to open an issue or pull request.

Contributors

Contributed by Libo Qin, Zhouyang Li, Jieming Lou, Qiying Yu, Wanxiang Che.

Thanks to our adviser Wanxiang Che for his support!

Introduction

Natural Language Generation (NLG) is a critical component of task-oriented dialogue systems and has attracted increasing research attention.

NLG aims to convert dialogue acts into natural language responses. In the example in the table below, the input consists of the dialogue act inform and the slot-value pairs (name=Blue Spice, priceRange=low, familyFriendly=yes), and the task of NLG is to transform this input into the corresponding natural language reply: "The Blue Spice is a low cost venue. It is a family friendly location."

| Input | inform (name = Blue Spice, priceRange = low, familyFriendly = yes) |
| --- | --- |
| Output | The Blue Spice is a low cost venue. It is a family friendly location. |
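
As a rough illustration of the input/output mapping above, the following sketch parses a dialogue act string and realizes it with hand-written sentence patterns. The function names (`parse_dialogue_act`, `realize`) and the patterns are hypothetical and chosen only to reproduce this one example; they are not taken from any of the systems listed in this repository.

```python
import re
from typing import Dict, Tuple


def parse_dialogue_act(text: str) -> Tuple[str, Dict[str, str]]:
    """Split 'act(slot=value, ...)' into the act type and a slot-value dict."""
    match = re.fullmatch(r"\s*(\w+)\s*\((.*)\)\s*", text)
    if match is None:
        raise ValueError(f"not a dialogue act: {text!r}")
    act, body = match.group(1), match.group(2)
    slots: Dict[str, str] = {}
    for pair in body.split(","):
        if not pair.strip():
            continue
        slot, _, value = pair.partition("=")
        slots[slot.strip()] = value.strip()
    return act, slots


def realize(act: str, slots: Dict[str, str]) -> str:
    """Toy template-based realizer (hypothetical patterns, 'inform' only)."""
    if act != "inform":
        raise ValueError(f"unsupported dialogue act: {act}")
    sentences = []
    name = slots.get("name", "venue")
    if "priceRange" in slots:
        sentences.append(f"The {name} is a {slots['priceRange']} cost venue.")
    if slots.get("familyFriendly") == "yes":
        sentences.append("It is a family friendly location.")
    elif slots.get("familyFriendly") == "no":
        sentences.append("It is not a family friendly location.")
    return " ".join(sentences)


act, slots = parse_dialogue_act(
    "inform (name = Blue Spice, priceRange =low, familyFriendly = yes)"
)
print(realize(act, slots))
```

Neural approaches listed below replace the hand-written patterns with a learned model, but the interface (dialogue act in, sentence out) stays the same.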

To ease the burden of collecting papers and datasets, this project organizes the relevant datasets, papers, code, and leaderboards for NLG.

The project is now fully open source and includes:

  1. NLG dataset table: we catalog the datasets used in the NLG field. For each dataset you can look up its approximate size, basic structure, content, characteristics, source, and how to obtain it.
  2. Papers and resources by research direction: we classify the papers according to the current mainstream frontiers. Each entry lists the paper title, publication year, venue, paper link, and code link for quick reference, as well as the datasets used.
  3. Leaderboards on mainstream NLG datasets: we compile leaderboards for the mainstream datasets. In addition to the paper/model/method name and scores, each row includes the year and links to the paper and code where available.

The taxonomy of deep-learning-based models for NLG in TOD is summarized in the figure below.

NLG-taxonomy

Resources

1. Classic methods of task-oriented natural language generation

1.1 Template-based

[1] NLG vs. Templates. Computational Linguistics 1995 [pdf]

[2] A versatile system for language generation in conversational system application. ICSLP 2000 [pdf]

[3] Natural language generation in the IBM flight information system. ANLP-NAACL 2000 Workshop [pdf]

1.2 Plan-based

[1] SPoT: A trainable sentence planner. NAACL 2001 [pdf]

[2] Training a sentence planner for spoken dialogue using boosting. Computer Speech & Language 2002 [pdf]

[3] Response planning and generation in the MERCURY flight reservation system. Computer Speech & Language 2002 [pdf]

[4] Acquiring correct knowledge for natural language generation. JAIR 2003 [pdf]

[5] Trainable sentence planning for complex information presentation in spoken dialog systems. ACL 2004 [pdf]

[6] Trainable sentence planning system [pdf]

[7] A probabilistic framework for dialog simulation and optimal strategy learning. IEEE Transactions on Audio, Speech and Language Processing 2006 [pdf]

[8] An investigation into the validity of some metrics for automatically evaluating natural language generation systems. Computational Linguistics 2009 [pdf]

[9] Individual and domain adaptation in sentence planning for dialogue. arXiv 2011 [pdf]

[10] Controlling user perceptions of linguistic style: Trainable generation of personality traits. ACL 2011 [pdf]

[11] Towards personality-based user adaptation: psychologically informed stylistic language generation [pdf]

1.3 Class-based

[1] Stochastic language generation for spoken dialogue systems. NAACL 2000 [pdf]

[2] Bootstrapping lexical choice via multiple-sequence alignment. EMNLP 2002 [pdf]

[3] Automatic generation of weather forecast texts using comprehensive probabilistic generation-space models. Natural Language Engineering 2008 [pdf]

1.4 Phrase-based

[1] Phrase-based statistical language generation using graphical models and active learning. ACL 2010 [pdf]

[2] Training a natural language generator from unaligned data. ACL 2015 [pdf]

[3] Imitation learning for language generation from unaligned data. COLING 2016 [pdf]

2. Deep-learning-based methods of task-oriented natural language generation

2.1 RNN-based

[1] Stochastic Language Generation in Dialogue using RNN with Convolutional Sentence Reranking (Restaurant dataset). SIGDIAL 2015 [pdf]

[2] Semantically Conditioned LSTM-based NLG for spoken dialogue systems (Restaurant/Hotel dataset). EMNLP 2015 [pdf]

[3] What to talk about and how? Selective Generation using LSTMs with Coarse-to-Fine Alignment (WeatherGov/RoboCup dataset) [pdf]

[4] Toward multi-domain language generation using recurrent neural networks (Restaurant/Hotel dataset). NIPS Workshop 2015 [pdf]

[5] Multi-domain Neural Network Language Generation for Spoken Dialogue Systems (Restaurant/Hotel/Television/Laptop dataset). NAACL-HLT 2016 [pdf]

[6] Neural-based Natural Language Generation in Dialogue using RNN Encoder-Decoder with Semantic Aggregation (Restaurant/Hotel/Television/Laptop dataset). SIGDIAL 2017 [pdf]

2.2 Seq2Seq-based

[1] Sequence-to-Sequence Generation for Spoken Dialogue via Deep Syntax Trees and Strings (Restaurant dataset). ACL 2016 [pdf] [code]

[2] A Context-aware Natural Language Generator for Dialogue Systems (alex context nlg dataset). SIGDIAL 2016 [pdf] [code]

[3] A Network-based End-to-End Trainable Task-oriented Dialogue System (Woz dataset). EACL 2017 [pdf]

[4] Adversarial Domain Adaptation for Variational Neural Language Generation in Dialogue Systems (Restaurant/Hotel/Television/Laptop dataset). COLING 2018 [pdf]

[5] Dual Latent Variable Model for Low-Resource Natural Language Generation in Dialogue Systems (Restaurant/Hotel/Television/Laptop dataset). CoNLL 2018 [pdf]

[6] Variational Cross-domain Natural Language Generation for Spoken Dialogue Systems (Restaurant/Hotel/Television/Laptop dataset). SIGDIAL 2018 [pdf]

[7] A Deep Ensemble Model with Slot Alignment for Sequence-to-Sequence Natural Language Generation (Restaurant/Television/Laptop dataset). NAACL 2018 [pdf]

[8] A Simple Recipe towards Reducing Hallucination in Neural Surface Realisation (E2E challenge dataset). ACL 2018 [pdf]

[9] Char2char Generation with Reranking for the E2E NLG Challenge (E2E challenge dataset). cs.CL 2018 [pdf]

[10] Multi-task Learning for Natural Language Generation in Task-Oriented Dialogue (E2E challenge dataset). EMNLP-IJCNLP 2019 [pdf]

[11] Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue (E2E challenge dataset). ACL 2019 [pdf]

[12] Retrospective and Prospective Mixture-of-Generators for Task-Oriented Dialogue Response Generation (Multi-Domain-Woz dataset). arXiv 2020 [pdf] [code]

[13] Template Guided Text Generation for Task-Oriented Dialogue (E2E challenge/SGD/Multi-Domain-Woz/Multi-Domain-Woz-2.1 dataset). EMNLP 2020 [pdf] [code]

[14] How to Make Neural Natural Language Generation as Reliable as Templates in Task-Oriented Dialogue (E2E challenge dataset). EMNLP 2020 | ACL 2020 [pdf] [code]

[15] Interpretable NLG for Task-oriented Dialogue Systems with Heterogeneous Rendering Machines (Restaurant/Hotel/Television/Laptop/E2E challenge dataset). AAAI 2021 [pdf]

2.3 Transformer-based

[1] Semantically Conditioned Dialog Response Generation via Hierarchical Disentangled Self-Attention (Multi-Domain-Woz dataset). ACL 2019 [pdf] [code]

[2] Few-shot Natural Language Generation for Task-Oriented Dialog (Few-Shot-Woz/Multi-Domain-Woz dataset). arXiv 2020 [pdf] [code]

[3] Efficient Retrieval Augmented Generation from Unstructured Knowledge for Task-Oriented Dialog (Multi-Domain-Woz-2.1 dataset). Workshop of AAAI 2021 [pdf] [code]

[4] Unstructured Knowledge Access in Task-oriented Dialog Modeling using Language Inference, Knowledge Retrieval and Knowledge-Integrative Response Generation (Multi-Domain-Woz-2.1 dataset). arXiv 2021 [pdf]

Dataset

| Name | Intro | Links | Detail | Size & Stats |
| --- | --- | --- | --- | --- |
| Restaurant dataset | Collected by a spoken dialogue system providing information about certain venues in San Francisco | Download: https://github.com/shawnwun/RNNLG Paper: https://arxiv.org/pdf/1508.01745.pdf | Restaurant information; Dialogue act types: 8; Slots: 12; Dialogue acts: 248 | Dialogues: around 1k; Utterances: around 5192; Train/validation/test ratio: 3:1:1 |
| Hotel dataset | Collected by a spoken dialogue system providing information about certain venues in San Francisco | Download: https://github.com/shawnwun/RNNLG Paper: https://arxiv.org/pdf/1508.01745.pdf | Hotel information; Dialogue act types: 8; Slots: 12; Dialogue acts: 164 | Dialogues: around 1k; Utterances: 5373; Train/validation/test ratio: 3:1:1 |
| Laptop dataset | 1. Created by workers recruited through Amazon Mechanical Turk (AMT), who were asked to propose an appropriate natural language realisation for each system dialogue act actually generated by a dialogue system 2. Enumerated all possible combinations of dialogue act types and slots based on the ontology | Download: https://github.com/shawnwun/RNNLG Paper: https://arxiv.org/pdf/1603.01232.pdf | Laptop information; Dialogue act types: 14; Slots: 19; Dialogue acts: about 13k | Dialogues: around 3k; Utterances: 13242; Train/validation/test ratio: 3:1:1 |
| Television dataset | 1. Created by workers recruited through Amazon Mechanical Turk (AMT), who were asked to propose an appropriate natural language realisation for each system dialogue act actually generated by a dialogue system 2. Enumerated all possible combinations of dialogue act types and slots based on the ontology | Download: https://github.com/shawnwun/RNNLG Paper: https://arxiv.org/pdf/1603.01232.pdf | Television information; Dialogue act types: 14; Slots: 16; Dialogue acts: about 7k | Dialogues: around 2k; Utterances: 7035; Train/validation/test ratio: 3:1:1 |
| E2E challenge dataset | 1. Collected using the CrowdFlower platform 2. For training end-to-end, data-driven natural language generation systems in the restaurant domain 3. Homepage: http://www.macs.hw.ac.uk/InteractionLab/E2E/ | Download: data (original version): https://github.com/tuetschek/e2e-dataset data (cleaned version): https://github.com/tuetschek/e2e-cleaning evaluator: https://github.com/tuetschek/e2e-metrics Paper: https://arxiv.org/pdf/1706.09254.pdf | Restaurant information; this dataset shows more lexical richness and syntactic variation, including discourse phenomena, and generating from it requires content selection; Dialogue act types: 1; Slots: 8 | Train: 4862 MRs, 42061 references; Dev: 547 MRs, 4672 references; Test: 630 MRs, 4693 references |
| Multi-Domain-Woz dataset | 1. A fully-labeled multi-domain collection of human-human written conversations 2. For task-oriented dialogue modelling | Download: https://github.com/budzianowski/multiwoz/blob/master/data/MultiWOZ_2.0.zip Paper: https://aclanthology.org/D18-1547.pdf | Dialogues span 7 domains (Attraction, Hospital, Police, Hotel, Restaurant, Taxi, Train) and several topics. Each dialogue is annotated with a sequence of dialogue states and corresponding system dialogue acts | Domains: 7; Dialogues: 8438; Turns: 115424; Slots: 25; Values: 4510 |
| Multi-Domain-Woz-2.1 dataset | 1. Corrects some mistakes in the Multi-Domain-Woz dataset 2. Re-annotates states and utterances based on the original utterances in the dataset | Download: https://github.com/budzianowski/multiwoz/blob/master/data/MultiWOZ_2.1.zip Paper: https://arxiv.org/pdf/1907.01669.pdf | 1. The correction process results in changes to over 32% of state annotations across 40% of the dialogue turns 2. Fixes 146 dialogue utterances by canonicalizing slot values in the utterances to the values in the dataset ontology | Domains: 7; Dialogues: 10438; Slots: 25; Values: 4510 |
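
Several rows above give a 3:1:1 ratio of training, validation, and testing data. The sketch below shows one way such a split could be produced from a list of examples. The function name `split_3_1_1` and the seeded shuffle are assumptions made for illustration; the released data may already ship with fixed splits, in which case those should be used instead.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_3_1_1(examples: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle deterministically and split into train/valid/test at a 3:1:1 ratio."""
    items = list(examples)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 3 // 5  # three fifths for training
    n_valid = len(items) // 5      # one fifth for validation
    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]  # the remainder goes to testing
    return train, valid, test


train, valid, test = split_3_1_1(range(10))
print(len(train), len(valid), len(test))
```

When the number of examples is not divisible by five, the leftover examples end up in the test portion.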

LeaderBoard

Restaurant dataset

| Model | Slot-err rate | BLEU | Paper / Source | Code link | Conference |
| --- | --- | --- | --- | --- | --- |
| IRN(+KNN) | 0.11 | 0.807 | Slot-consistent NLG for Task-oriented Dialogue Systems with Iterative Rectification Network [[pdf]](https://aclanthology.org/2020.acl-main.10.pdf) | - | SIGDIAL |
| RALSTM | 0.16 | 0.779 | Natural Language Generation for Spoken Dialogue System using RNN Encoder-Decoder Networks [[pdf]](https://arxiv.org/pdf/1706.00139.pdf) | - | CoNLL |
| Softmax | 0.20 | 0.812 | Interpretable NLG for Task-oriented Dialogue Systems with Heterogeneous Rendering Machines [[pdf]](https://www.aaai.org/AAAI21Papers/AAAI-3767.LiY.pdf) | - | AAAI |
| VQ-VAE | 0.18 | 0.789 | Interpretable NLG for Task-oriented Dialogue Systems with Heterogeneous Rendering Machines [[pdf]](https://www.aaai.org/AAAI21Papers/AAAI-3767.LiY.pdf) | - | AAAI |
| NLG-LM | - | 0.795 | Multi-task Learning for Natural Language Generation in Task-Oriented Dialogue [[pdf]](https://aclanthology.org/D19-1123.pdf) | - | EMNLP-IJCNLP |
| Gumbel-softmax | 0.21 | 0.776 | Interpretable NLG for Task-oriented Dialogue Systems with Heterogeneous Rendering Machines [[pdf]](https://www.aaai.org/AAAI21Papers/AAAI-3767.LiY.pdf) | - | AAAI |
| ARoA | 0.30 | 0.776 | Neural-based Natural Language Generation in Dialogue using RNN Encoder-Decoder with Semantic Aggregation [[pdf]](https://arxiv.org/pdf/1706.06714.pdf) | - | SIGDIAL |
| SCLSTM | 0.38 | 0.753 | Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems [[pdf]](https://arxiv.org/pdf/1508.01745.pdf) | https://github.com/shawnwun/RNNLG | EMNLP |
| HLSTM | 0.74 | 0.747 | Stochastic Language Generation in Dialogue using Recurrent Neural Networks with Convolutional Sentence Reranking [[pdf]](https://arxiv.org/pdf/1508.01755.pdf) | https://github.com/shawnwun/RNNLG | SIGDIAL |

Hotel dataset

| Model | Slot-err rate | BLEU | Paper / Source | Code link | Conference |
| --- | --- | --- | --- | --- | --- |
| NLG-LM | - | 0.939 | Multi-task Learning for Natural Language Generation in Task-Oriented Dialogue [[pdf]](https://aclanthology.org/D19-1123.pdf) | - | EMNLP-IJCNLP |
| Softmax | 0.46 | 0.923 | Interpretable NLG for Task-oriented Dialogue Systems with Heterogeneous Rendering Machines [[pdf]](https://www.aaai.org/AAAI21Papers/AAAI-3767.LiY.pdf) | - | AAAI |
| VQ-VAE | 0.55 | 0.921 | Interpretable NLG for Task-oriented Dialogue Systems with Heterogeneous Rendering Machines [[pdf]](https://www.aaai.org/AAAI21Papers/AAAI-3767.LiY.pdf) | - | AAAI |
| IRN(+KNN) | 0.32 | 0.911 | Slot-consistent NLG for Task-oriented Dialogue Systems with Iterative Rectification Network [[pdf]](https://aclanthology.org/2020.acl-main.10.pdf) | - | SIGDIAL |
| Gumbel-softmax | 0.77 | 0.903 | Interpretable NLG for Task-oriented Dialogue Systems with Heterogeneous Rendering Machines [[pdf]](https://www.aaai.org/AAAI21Papers/AAAI-3767.LiY.pdf) | - | AAAI |
| RALSTM | 0.43 | 0.898 | Natural Language Generation for Spoken Dialogue System using RNN Encoder-Decoder Networks [[pdf]](https://arxiv.org/pdf/1706.00139.pdf) | - | CoNLL |
| ARoA | 1.13 | 0.892 | Neural-based Natural Language Generation in Dialogue using RNN Encoder-Decoder with Semantic Aggregation [[pdf]](https://arxiv.org/pdf/1706.06714.pdf) | - | SIGDIAL |
| HLSTM | 2.67 | 0.850 | Stochastic Language Generation in Dialogue using Recurrent Neural Networks with Convolutional Sentence Reranking [[pdf]](https://arxiv.org/pdf/1508.01755.pdf) | https://github.com/shawnwun/RNNLG | SIGDIAL |
| SCLSTM | 3.07 | 0.848 | Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems [[pdf]](https://arxiv.org/pdf/1508.01745.pdf) | https://github.com/shawnwun/RNNLG | EMNLP |

Laptop dataset

| Model | Slot-err rate | BLEU | Paper / Source | Code link | Conference |
| --- | --- | --- | --- | --- | --- |
| Softmax | 0.31 | 0.591 | Interpretable NLG for Task-oriented Dialogue Systems with Heterogeneous Rendering Machines [[pdf]](https://www.aaai.org/AAAI21Papers/AAAI-3767.LiY.pdf) | - | AAAI |
| NLG-LM | - | 0.586 | Multi-task Learning for Natural Language Generation in Task-Oriented Dialogue [[pdf]](https://aclanthology.org/D19-1123.pdf) | - | EMNLP-IJCNLP |
| Gumbel-softmax | 0.63 | 0.561 | Interpretable NLG for Task-oriented Dialogue Systems with Heterogeneous Rendering Machines [[pdf]](https://www.aaai.org/AAAI21Papers/AAAI-3767.LiY.pdf) | - | AAAI |
| VQ-VAE | 0.65 | 0.554 | Interpretable NLG for Task-oriented Dialogue Systems with Heterogeneous Rendering Machines [[pdf]](https://www.aaai.org/AAAI21Papers/AAAI-3767.LiY.pdf) | - | AAAI |
| IRN(+KNN) | 0.29 | 0.537 | Slot-consistent NLG for Task-oriented Dialogue Systems with Iterative Rectification Network [[pdf]](https://aclanthology.org/2020.acl-main.10.pdf) | - | SIGDIAL |
| RALSTM | 0.42 | 0.525 | Natural Language Generation for Spoken Dialogue System using RNN Encoder-Decoder Networks [[pdf]](https://arxiv.org/pdf/1706.00139.pdf) | - | CoNLL |
| Slug2Slug | 1.55 | 0.524 | Slug2Slug: A Deep Ensemble Model with Slot Alignment for Sequence-to-Sequence Natural Language Generation [[pdf]](http://www.macs.hw.ac.uk/InteractionLab/E2E/final_papers/E2E-Slug2Slug.pdf) | - | - |
| ARoA | 0.50 | 0.522 | Neural-based Natural Language Generation in Dialogue using RNN Encoder-Decoder with Semantic Aggregation [[pdf]](https://arxiv.org/pdf/1706.06714.pdf) | - | SIGDIAL |
| HLSTM | 1.10 | 0.513 | Stochastic Language Generation in Dialogue using Recurrent Neural Networks with Convolutional Sentence Reranking [[pdf]](https://arxiv.org/pdf/1508.01755.pdf) | https://github.com/shawnwun/RNNLG | SIGDIAL |
| SCLSTM | 0.79 | 0.512 | Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems [[pdf]](https://arxiv.org/pdf/1508.01745.pdf) | https://github.com/shawnwun/RNNLG | EMNLP |

Television dataset

| Model | Slot-err rate | BLEU | Paper / Source | Code link | Conference |
| --- | --- | --- | --- | --- | --- |
| NLG-LM | - | 0.617 | Multi-task Learning for Natural Language Generation in Task-Oriented Dialogue [[pdf]](https://aclanthology.org/D19-1123.pdf) | - | EMNLP-IJCNLP |
| Softmax | 0.51 | 0.610 | Interpretable NLG for Task-oriented Dialogue Systems with Heterogeneous Rendering Machines [[pdf]](https://www.aaai.org/AAAI21Papers/AAAI-3767.LiY.pdf) | - | AAAI |
| VQ-VAE | 0.70 | 0.598 | Interpretable NLG for Task-oriented Dialogue Systems with Heterogeneous Rendering Machines [[pdf]](https://www.aaai.org/AAAI21Papers/AAAI-3767.LiY.pdf) | - | AAAI |
| Gumbel-softmax | 0.79 | 0.581 | Interpretable NLG for Task-oriented Dialogue Systems with Heterogeneous Rendering Machines [[pdf]](https://www.aaai.org/AAAI21Papers/AAAI-3767.LiY.pdf) | - | AAAI |
| IRN(+KNN) | 0.35 | 0.559 | Slot-consistent NLG for Task-oriented Dialogue Systems with Iterative Rectification Network [[pdf]](https://aclanthology.org/2020.acl-main.10.pdf) | - | SIGDIAL |
| RALSTM | 0.63 | 0.541 | Natural Language Generation for Spoken Dialogue System using RNN Encoder-Decoder Networks [[pdf]](https://arxiv.org/pdf/1706.00139.pdf) | - | CoNLL |
| ARoA | 0.60 | 0.539 | Neural-based Natural Language Generation in Dialogue using RNN Encoder-Decoder with Semantic Aggregation [[pdf]](https://arxiv.org/pdf/1706.06714.pdf) | - | SIGDIAL |
| SCLSTM | 2.31 | 0.527 | Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems [[pdf]](https://arxiv.org/pdf/1508.01745.pdf) | https://github.com/shawnwun/RNNLG | EMNLP |
| HLSTM | 2.50 | 0.525 | Stochastic Language Generation in Dialogue using Recurrent Neural Networks with Convolutional Sentence Reranking [[pdf]](https://arxiv.org/pdf/1508.01755.pdf) | https://github.com/shawnwun/RNNLG | SIGDIAL |
| Slug2Slug | 1.67 | 0.523 | Slug2Slug: A Deep Ensemble Model with Slot Alignment for Sequence-to-Sequence Natural Language Generation [[pdf]](http://www.macs.hw.ac.uk/InteractionLab/E2E/final_papers/E2E-Slug2Slug.pdf) | - | - |
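
The slot-err rate column in the four leaderboards above measures how faithfully a generated utterance covers the slots of its dialogue act. A common formulation is ERR = (p + q) / N, where N is the number of slots in the dialogue act and p and q are the numbers of missing and redundant slots in the output. The sketch below assumes delexicalised outputs in which slot values are replaced by placeholder tokens starting with `SLOT_`; that prefix, the token names, and the function name are illustrative assumptions, and individual papers may compute the metric with their own scripts.

```python
from collections import Counter
from typing import Iterable


def slot_error_rate(
    required_slots: Iterable[str],
    generated_tokens: Iterable[str],
    slot_prefix: str = "SLOT_",
) -> float:
    """Return (missing + redundant) / total for one delexicalised utterance."""
    required = Counter(required_slots)
    produced = Counter(t for t in generated_tokens if t.startswith(slot_prefix))
    total = sum(required.values())
    if total == 0:
        raise ValueError("the dialogue act has no slots")
    missing = sum((required - produced).values())    # p: required but absent
    redundant = sum((produced - required).values())  # q: extra or repeated
    return (missing + redundant) / total


required = ["SLOT_NAME", "SLOT_PRICERANGE", "SLOT_FAMILYFRIENDLY"]
output = "SLOT_NAME is a SLOT_PRICERANGE cost venue near SLOT_NAME".split()
print(slot_error_rate(required, output))
```

In this example one slot is missing and one is repeated, so two errors are counted against three required slots.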

E2E Challenge dataset

| Model | BLEU | NIST | Paper / Source | Code link | Conference |
| --- | --- | --- | --- | --- | --- |
| OpenNMT | 0.681 | 8.748 | How to Make Neural Natural Language Generation as Reliable as Templates in Task-Oriented Dialogue [[pdf]](http://doras.dcu.ie/25957/1/2020.emnlp-main.230%20(1).pdf) | https://github.com/Henry-E/reliable_neural_nlg | EMNLP / ACL |
| TUDA(Model-D) | 0.713 | 8.502 | E2E NLG Challenge: Neural Models vs. Templates [[pdf]](https://aclanthology.org/W18-6557.pdf) | https://github.com/UKPLab/e2e-nlg-challenge-2017 | ACL |
| Softmax | 0.697 | - | Interpretable NLG for Task-oriented Dialogue Systems with Heterogeneous Rendering Machines [[pdf]](https://www.aaai.org/AAAI21Papers/AAAI-3767.LiY.pdf) | - | AAAI |
| Slug2Slug | 0.662 | 8.613 | Slug2Slug: A Deep Ensemble Model with Slot Alignment for Sequence-to-Sequence Natural Language Generation [[pdf]](http://www.macs.hw.ac.uk/InteractionLab/E2E/final_papers/E2E-Slug2Slug.pdf) | - | - |
| NLG-LM | 0.684 | - | Multi-task Learning for Natural Language Generation in Task-Oriented Dialogue [[pdf]](https://aclanthology.org/D19-1123.pdf) | - | EMNLP-IJCNLP |
| VQ-VAE | 0.681 | - | Interpretable NLG for Task-oriented Dialogue Systems with Heterogeneous Rendering Machines [[pdf]](https://www.aaai.org/AAAI21Papers/AAAI-3767.LiY.pdf) | - | AAAI |
| TGen | 0.659 | 8.609 | Sequence-to-Sequence Generation for Spoken Dialogue via Deep Syntax Trees and Strings [[pdf]](https://arxiv.org/pdf/1606.05491.pdf) | https://github.com/UFAL-DSG/tgen | ACL |
| OpenNMT+Surface Forms | 0.628 | 8.311 | How to Make Neural Natural Language Generation as Reliable as Templates in Task-Oriented Dialogue [[pdf]](http://doras.dcu.ie/25957/1/2020.emnlp-main.230%20(1).pdf) | https://github.com/Henry-E/reliable_neural_nlg | EMNLP / ACL |
| Gumbel-softmax | 0.667 | - | Interpretable NLG for Task-oriented Dialogue Systems with Heterogeneous Rendering Machines [[pdf]](https://www.aaai.org/AAAI21Papers/AAAI-3767.LiY.pdf) | - | AAAI |
| TUDA(Model-T) | 0.605 | 7.526 | E2E NLG Challenge: Neural Models vs. Templates [[pdf]](https://aclanthology.org/W18-6557.pdf) | https://github.com/UKPLab/e2e-nlg-challenge-2017 | ACL |

Multi-Domain-Woz dataset

| Model | Entity F1 | BLEU | Paper / Source | Code link | Conference |
| --- | --- | --- | --- | --- | --- |
| SC-GPT | 88.37 | 30.76 | Few-shot Natural Language Generation for Task-Oriented Dialog [[pdf]](https://arxiv.org/pdf/2002.12328.pdf) | https://github.com/pengbaolin/SC-GPT | arXiv |
| GPT-2 | 87.70 | 30.71 | Few-shot Natural Language Generation for Task-Oriented Dialog [[pdf]](https://arxiv.org/pdf/2002.12328.pdf) | https://github.com/pengbaolin/SC-GPT | arXiv |
| HDSA | 87.30 | 26.48 | Semantically Conditioned Dialog Response Generation via Hierarchical Disentangled Self-Attention [[pdf]](https://arxiv.org/pdf/1905.12866.pdf) | https://github.com/wenhuchen/HDSA-Dialog | ACL |
| SCLSTM | 80.42 | 21.6 | Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems [[pdf]](https://arxiv.org/pdf/1508.01745.pdf) | https://github.com/shawnwun/RNNLG | EMNLP |
| MoGNet | 78.85 | 20.13 | Retrospective and Prospective Mixture-of-Generators for Task-oriented Dialogue Response Generation [[pdf]](https://arxiv.org/pdf/1911.08151.pdf) | https://github.com/Jiahuan-Pei/multiwoz-mdrg | arXiv |
| LaRLAttnGRU | 80.95 | 12.80 | Global-locally self-attentive encoder for dialogue state tracking [[pdf]](https://aclanthology.org/P18-1135.pdf) | - | ACL |
| Structured Fusion | 77.04 | 16.34 | Structured fusion networks for dialog [[pdf]](https://arxiv.org/pdf/1907.10016.pdf) | - | SIGDIAL |

Multi-Domain-Woz-2.1 dataset

| Model | Entity F1 | BLEU | Paper / Source | Code link | Conference |
| --- | --- | --- | --- | --- | --- |
| HDNO | 87.63 | 18.97 | Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System [[pdf]](https://arxiv.org/pdf/2006.06814.pdf) | https://github.com/mikezhang95/HDNO | ICLR |
| MarCo | 84.51 | 19.54 | Multi-Domain Dialogue Acts and Response Co-Generation [[pdf]](https://arxiv.org/pdf/2004.12363.pdf) | https://github.com/InitialBug/MarCo-Dialog | ACL |
| LAVA | 89.52 | 14.02 | LAVA: Latent Action Spaces via Variational Auto-encoding for Dialogue Policy Optimization [[pdf]](https://www.aclweb.org/anthology/2020.coling-main.41.pdf) | https://github.com/budzianowski/multiwoz | COLING |
| UBAR | 86.45 | 16.70 | UBAR: Towards Fully End-to-End Task-Oriented Dialog System with GPT-2 [[pdf]](https://arxiv.org/pdf/2012.03539.pdf) | https://github.com/TonyNemo/UBAR-MultiWOZ | AAAI |
| DoTs | 79.93 | 15.9 | Domain State Tracking for a Simplified Dialogue System [[pdf]](https://arxiv.org/pdf/2103.06648.pdf) | - | arXiv |
| SimpleTOD | 78.87 | 16.22 | A Simple Language Model for Task-Oriented Dialogue [[pdf]](https://arxiv.org/pdf/2005.00796.pdf) | https://github.com/salesforce/simpletod | arXiv |
| LABES-S2S | 72.14 | 18.3 | A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning [[pdf]](https://arxiv.org/pdf/2009.08115.pdf) | https://github.com/thu-spmi/LABES | EMNLP |
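
The Entity F1 column in the two Multi-Domain-Woz leaderboards scores how well the entities mentioned in generated responses match those in the reference responses. The sketch below computes a micro-averaged F1 over sets of entity strings. It assumes the entities have already been extracted from each response; the function name and the micro-averaging choice are assumptions for illustration, and the cited papers may use different extraction or averaging procedures.

```python
from typing import List, Set


def entity_f1(predicted: List[Set[str]], gold: List[Set[str]]) -> float:
    """Micro-averaged F1 between predicted and gold entity sets, per response."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    tp = fp = fn = 0
    for pred_entities, gold_entities in zip(predicted, gold):
        tp += len(pred_entities & gold_entities)  # correctly generated entities
        fp += len(pred_entities - gold_entities)  # generated but not in reference
        fn += len(gold_entities - pred_entities)  # in reference but not generated
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


predicted = [{"blue spice", "low"}, {"taxi"}]
gold = [{"blue spice", "low", "yes"}, {"train"}]
print(round(100 * entity_f1(predicted, gold), 2))
```

The final line scales the score to a percentage, which appears to be the scale used in the leaderboard tables.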

Owner

Libo Qin, Ph.D. Candidate at Harbin Institute of Technology @HIT-SCIR. Homepage: http://ir.hit.edu.cn/~lbqin/