Large-scale open domain KNOwledge grounded conVERsation system based on PaddlePaddle

Last update: Dec 28, 2022

Knover

Knover is a toolkit for knowledge-grounded dialogue generation based on PaddlePaddle. It lets researchers and developers train and run inference on large-scale dialogue generation models efficiently.

What's New:

- December 2021: We are releasing the PLATO-XL dialogue generation model, with up to 11 billion parameters.
- October 2021: We are releasing AG-DST, an amendable generation approach for dialogue state tracking.
- February 2021: We are releasing our implementation (Team 19) for DSTC9-Track1.
- July 2020: We are releasing PLATO-2, a large-scale generative model with latent space for open-domain dialogue systems.

Requirements and Installation

- Python >= 3.7
- paddlepaddle-gpu >= 2.0.0
  - You can install PaddlePaddle by following its installation instructions.
  - The PaddlePaddle build you need depends on your CUDA version (recommended: 10.1) and cuDNN version (recommended: 7.6). See the PaddlePaddle documentation on GPU support for more information.
- sentencepiece
- termcolor
- NCCL, if you want to run distributed training

Install Knover locally:

git clone https://github.com/PaddlePaddle/Knover.git
cd Knover
pip3 install -e .

Alternatively, you can set PYTHONPATH only:

export PYTHONPATH=/abs/path/to/Knover:$PYTHONPATH
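
After either installation route, it can help to confirm that the requirements listed above are actually met. The following is a minimal sketch, not part of Knover: the helper names (`parse_version`, `missing_modules`, `check_environment`) are hypothetical, and it assumes that paddlepaddle-gpu is imported as `paddle` and that Knover is importable as `knover`.

```python
import importlib.util
import sys

# Minimum versions taken from the requirements list above.
MIN_PYTHON = (3, 7)
MIN_PADDLE = (2, 0, 0)
# Import names are assumptions: "paddle" for paddlepaddle-gpu, "knover" for this toolkit.
REQUIRED_MODULES = ("paddle", "sentencepiece", "termcolor", "knover")


def parse_version(text):
    """Turn '2.0.0' or '2.3.2.post101' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'post101'
        parts.append(int(digits))
    return tuple(parts)


def missing_modules(names=REQUIRED_MODULES):
    """Return the modules that cannot be located, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_environment():
    """Collect human-readable problems; an empty list means every check passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python >= 3.7 is required")
    missing = missing_modules()
    problems.extend(f"module not found: {name}" for name in missing)
    if "paddle" not in missing:
        import paddle  # imported lazily; only reached when paddle is installed

        if parse_version(paddle.__version__) < MIN_PADDLE:
            problems.append("paddlepaddle-gpu >= 2.0.0 is required")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Running the script prints one line per detected problem and nothing if all checks pass. It does not verify CUDA, cuDNN, or NCCL, which still need to be checked against the PaddlePaddle GPU documentation.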

Basic usage

See the usage document.

Disclaimer

This project aims to facilitate further research progress in dialogue generation. Baidu is not responsible for content generated by third parties using the pre-trained system.

Contact information

For help or issues using Knover, please submit a GitHub issue.
