Source code for AAAI20 "Generating Persona Consistent Dialogues by Exploiting Natural Language Inference".

Overview

Generating Persona Consistent Dialogues by Exploiting Natural Language Inference

Source code for RCDG model in AAAI20 Generating Persona Consistent Dialogues by Exploiting Natural Language Inference, a natural language inference (NLI) enhanced reinforcement learning dialogue model.

Requirements:

The code is tested under the following env:

  • Python 3.6
  • Pytorch 0.3.1

Install with conda: conda install pytorch==0.3.1 torchvision cudatoolkit=7.5 -c pytorch

This released code has been tested on a Titan-XP 12G GPU.

Data

We have provided some data samples in ./data to show the format. For downloading the full datasets, please refer to the following papers:

How to Run:

For a easier way to run the code, here the NLI model is GRU+MLP, i.e. RCDG_base, and we remove the time-consuming MC search.

Here are a few steps to run this code:

0. Prepare Data

python preprocess.py -train_src data/src-train.txt -train_tgt data/tgt-train.txt -train_per data/per-train.txt -valid_src data/src-val.txt -valid_tgt data/tgt-val.txt -valid_per data/per-val.txt -train_nli data/nli-train.txt -valid_nli data/nli-valid.txt -save_data data/nli_persona -src_vocab_size 18300 -tgt_vocab_size 18300 -share_vocab

And as introduced in the paper, there are different training stages:

1. NLI model Pretrain

cd NLI_pretrain/

python train.py -data ../data/nli_persona -batch_size 32 -save_model saved_model/consistent_dialogue -rnn_size 500 -word_vec_size 300 -dropout 0.2 -epochs 5 -learning_rate_decay 1 -gpu 0

And you should see something like:

Loading train dataset from ../data/nli_persona.train.1.pt, number of examples: 1
31432
Epoch  1, nli_step     1/ 4108; nli: 0.28125
Epoch  1, nli_step    11/ 4108; nli: 0.38125
Epoch  1, nli_step    21/ 4108; nli: 0.43438
Epoch  1, nli_step    31/ 4108; nli: 0.48125
Epoch  1, nli_step    41/ 4108; nli: 0.53750
Epoch  1, nli_step    51/ 4108; nli: 0.56250
Epoch  1, nli_step    61/ 4108; nli: 0.49062
...

2. Generator G Pretrain

cd ../G_pretrain/

python train.py -data ../data/nli_persona -batch_size 32 -rnn_size 500 -word_vec_size 300  -dropout 0.2 -epochs 15 -g_optim adam -g_learning_rate 1e-3 -learning_rate_decay 1 -train_from PATH_TO_PRETRAINED_NLI -gpu 0

Here the PATH_TO_PRETRAINED_NLI should be replaced by your model path, e.g., ../NLI_pretrain/saved_model/consistent_dialogue_e3.pt.

If , you should see the ppl comes down during training, which means the dialogue model is in training:

Loading train dataset from ../data/nli_persona.train.1.pt, number of examples: 131432
Epoch  4, teacher_force     1/ 4108; acc:   0.00; ppl: 18619.76; 125 src tok/s; 162 tgt tok/s;      3 s elapsed
Epoch  4, teacher_force    11/ 4108; acc:   9.69; ppl: 2816.01; 4159 src tok/s; 5468 tgt tok/s;      3 s elapsed
Epoch  4, teacher_force    21/ 4108; acc:   9.78; ppl: 550.46; 5532 src tok/s; 6116 tgt tok/s;      4 s elapsed
Epoch  4, teacher_force    31/ 4108; acc:  11.15; ppl: 383.06; 5810 src tok/s; 6263 tgt tok/s;      5 s elapsed
...
Epoch  4, teacher_force   941/ 4108; acc:  25.40; ppl:  90.18; 5993 src tok/s; 6645 tgt tok/s;     63 s elapsed
Epoch  4, teacher_force   951/ 4108; acc:  27.49; ppl:  77.07; 5861 src tok/s; 6479 tgt tok/s;     64 s elapsed
Epoch  4, teacher_force   961/ 4108; acc:  26.24; ppl:  83.17; 5473 src tok/s; 6443 tgt tok/s;     64 s elapsed
Epoch  4, teacher_force   971/ 4108; acc:  24.33; ppl:  97.14; 5614 src tok/s; 6685 tgt tok/s;     65 s elapsed
...

3. Discriminator D Pretrain

cd ../D_pretrain/

python train.py -epochs 20 -d_optim adam -d_learning_rate 1e-4 -data ../data/nli_persona -train_from PATH_TO_PRETRAINED_G -batch_size 32 -learning_rate_decay 0.99 -gpu 0

Similarly, replace PATH_TO_PRETRAINED_G with the G Pretrain model path.

The acc of D will be displayed during training:

Loading train dataset from ../data/nli_persona.train.1.pt, number of examples: 131432
Epoch  5, d_step     1/ 4108; d: 0.49587
Epoch  5, d_step    11/ 4108; d: 0.51580
Epoch  5, d_step    21/ 4108; d: 0.49853
Epoch  5, d_step    31/ 4108; d: 0.55248
Epoch  5, d_step    41/ 4108; d: 0.55168
...

4. Reinforcement Training

cd ../reinforcement_train/

python train.py -epochs 30 -batch_size 32 -d_learning_rate 1e-4 -g_learning_rate 1e-4 -learning_rate_decay 0.9 -data ../data/nli_persona -train_from PATH_TO_PRETRAINED_D -gpu 0

Remember to replace PATH_TO_PRETRAINED_D with the D Pretrain model path.

Note that all the -epochs are global among all stages, if you want to tune this parameter. Actually, there are 30 - 20 = 10 training epochs in this Reinforcement Training stage if the D Pretrain model was trained 20 epochs in total.

Loading train dataset from ../data/nli_persona.train.1.pt, number of examples: 131432
Epoch  7, self_sample     1/ 4108; acc:   2.12; ppl:   0.28; 298 src tok/s; 234 tgt tok/s;      2 s elapsed
Epoch  7, teacher_force    11/ 4108; acc:   3.32; ppl:   0.53; 2519 src tok/s; 2772 tgt tok/s;      3 s elapsed
Epoch  7, d_step    21/ 4108; d: 0.98896
Epoch  7, d_step    31/ 4108; d: 0.99906
Epoch  7, self_sample    41/ 4108; acc:   0.00; ppl:   0.27; 1769 src tok/s; 260 tgt tok/s;      7 s elapsed
Epoch  7, teacher_force    51/ 4108; acc:   2.83; ppl:   0.43; 2368 src tok/s; 2910 tgt tok/s;      9 s elapsed
Epoch  7, d_step    61/ 4108; d: 0.75311
Epoch  7, d_step    71/ 4108; d: 0.83919
Epoch  7, self_sample    81/ 4108; acc:   6.20; ppl:   0.33; 1791 src tok/s; 232 tgt tok/s;     12 s elapsed
...

5. Testing Trained Model

Now we have a trained dialogue model, we can test by:

Still in ./reinforcement_train/

python predict.py -model TRAINED_MODEL_PATH  -src ../data/src-val.txt -tgt ../data/tgt-val.txt -replace_unk -verbose -output ./results.txt -per ../data/per-val.txt -nli nli-val.txt -gpu 0

MISC

  • Initializing Model Seems Slow?

    This is a legacy problem due to pytorch < 0.4, not brought by this project. And the training efficiency will not be affected.

  • BibTex

     @article{Song_RCDG_2020,
     	title={Generating Persona Consistent Dialogues by Exploiting Natural Language Inference},
     	volume={34},
     	DOI={10.1609/aaai.v34i05.6417},
     	number={05},
     	journal={Proceedings of the AAAI Conference on Artificial Intelligence},
     	author={Song, Haoyu and Zhang, Wei-Nan and Hu, Jingwen and Liu, Ting},
     	year={2020},
     	month={Apr.},
     	pages={8878-8885}
     	}
    
Modular and extensible speech recognition library leveraging pytorch-lightning and hydra.

Lightning ASR Modular and extensible speech recognition library leveraging pytorch-lightning and hydra What is Lightning ASR • Installation • Get Star

Soohwan Kim 40 Sep 19, 2022
Plugin repository for Macast

Macast-plugins Plugin repository for Macast. How to use third-party player plugin Download Macast from GitHub Release. Download the plugin you want fr

109 Jan 04, 2023
Model parallel transformers in JAX and Haiku

Table of contents Mesh Transformer JAX Updates Pretrained Models GPT-J-6B Links Acknowledgments License Model Details Zero-Shot Evaluations Architectu

Ben Wang 4.9k Jan 04, 2023
RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with different pairs (images, texts).

RuCLIPtiny Zero-shot image classification model for Russian language RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network

Shahmatov Arseniy 26 Sep 20, 2022
Simple program that translates the name of files into English

Simple program that translates the name of files into English. Useful for when editing/inspecting programs that were developed in a foreign language.

0 Dec 22, 2021
PyTorch implementation and pretrained models for XCiT models. See XCiT: Cross-Covariance Image Transformer

Cross-Covariance Image Transformer (XCiT) PyTorch implementation and pretrained models for XCiT models. See XCiT: Cross-Covariance Image Transformer L

Facebook Research 605 Jan 02, 2023
Lyrics generation with GPT2-based Transformer

HuggingArtists - Train a model to generate lyrics Create AI-Artist in just 5 minutes! 🚀 Run the demo notebook to train 🚀 Run the GUI demo to test Di

Aleksey Korshuk 65 Dec 19, 2022
Create a semantic search engine with a neural network (i.e. BERT) whose knowledge base can be updated

Create a semantic search engine with a neural network (i.e. BERT) whose knowledge base can be updated. This engine can later be used for downstream tasks in NLP such as Q&A, summarization, generation

Diego 1 Mar 20, 2022
An official repository for tutorials of Probabilistic Modelling and Reasoning (2021/2022) - a University of Edinburgh master's course.

PMR computer tutorials on HMMs (2021-2022) This is a repository for computer tutorials of Probabilistic Modelling and Reasoning (2021/2022) - a Univer

Vaidotas Šimkus 10 Dec 06, 2022
Simple Speech to Text, Text to Speech

Simple Speech to Text, Text to Speech 1. Download Repository Opsi 1 Download repository ini, extract di lokasi yang diinginkan Opsi 2 Jika sudah famil

Habib Abdurrasyid 5 Dec 28, 2021
The SVO-Probes Dataset for Verb Understanding

The SVO-Probes Dataset for Verb Understanding This repository contains the SVO-Probes benchmark designed to probe for Subject, Verb, and Object unders

DeepMind 20 Nov 30, 2022
Share constant definitions between programming languages and make your constants constant again

Introduction Reconstant lets you share constant and enum definitions between programming languages. Constants are defined in a yaml file and converted

Natan Yellin 47 Sep 10, 2022
glow-speak is a fast, local, neural text to speech system that uses eSpeak-ng as a text/phoneme front-end.

Glow-Speak glow-speak is a fast, local, neural text to speech system that uses eSpeak-ng as a text/phoneme front-end. Installation git clone https://g

Rhasspy 8 Dec 25, 2022
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

CRNN paper:An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition 1. create your ow

Tsukinousag1 3 Apr 02, 2022
A complete NLP guideline for enthusiasts

NLP-NINJA A complete guide for Natural Language Processing in Python Table of Contents S.No. Topic Level Meaning 1 Tokenization 🤍 Beginner 2 Stemming

MAINAK CHAUDHURI 22 Dec 27, 2022
Smart discord chatbot integrated with Dialogflow to manage different classrooms and assist in teaching!

smart-school-chatbot Smart discord chatbot integrated with Dialogflow to interact with students naturally and manage different classes in a school. De

Tom Huynh 5 Oct 24, 2022
Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding

Wav2Vec2CTC With KenLM Using KenLM ARPA language model with beam search to decode audio files and show the most probable transcription. Assuming you'v

farisalasmary 65 Sep 21, 2022
Text Classification in Turkish Texts with Bert

You can watch the details of the project on my youtube channel Project Interface Project Second Interface Goal= Correctly guessing the classification

42 Dec 31, 2022
A framework for evaluating Knowledge Graph Embedding Models in a fine-grained manner.

A framework for evaluating Knowledge Graph Embedding Models in a fine-grained manner.

NEC Laboratories Europe 13 Sep 08, 2022