The Easy-to-use Dialogue Response Selection Toolkit for Researchers

Overview

Easy-to-use toolkit for retrieval-based Chatbot

Our released data can be found at this link. Make sure the following steps are adopted to use our codes.

How to Use

  1. Init the repo

    Before using the repo, please run the following command to init:

    # create the necessay folders
    python init.py
    
    # prepare the environment
    # if some package cannot be installed, just google and install it from other ways
    pip install -r requirements.txt
  2. train the model

    ./scripts/train.sh <dataset_name> <model_name> <cuda_ids>
  3. test the model [rerank]

    ./scripts/test_rerank.sh <dataset_name> <model_name> <cuda_id>
  4. test the model [recal]

    # different recall_modes are available: q-q, q-r
    ./scripts/test_recall.sh <dataset_name> <model_name> <cuda_id>
  5. inference the responses and save into the faiss index

    Somethings inference will missing data samples, please use the 1 gpu (faiss-gpu search use 1 gpu quickly)

    It should be noted that: 1. For writer dataset, use extract_inference.py script to generate the inference.txt 2. For other datasets(douban, ecommerce, ubuntu), just cp train.txt inference.txt. The dataloader will automatically read the test.txt to supply the corpus.

    # work_mode=response, inference the response and save into faiss (for q-r matching) [dual-bert/dual-bert-fusion]
    # work_mode=context, inference the context to do q-q matching
    # work_mode=gray, inference the context; read the faiss(work_mode=response has already been done), search the topk hard negative samples; remember to set the BERTDualInferenceContextDataloader in config/base.yaml
    ./scripts/inference.sh <dataset_name> <model_name> <cuda_ids>

    If you want to generate the gray dataset for the dataset:

    # 1. set the mode as the **response**, to generate the response faiss index; corresponding dataset name: BERTDualInferenceDataset;
    ./scripts/inference.sh <dataset_name> response <cuda_ids>
    
    # 2. set the mode as the **gray**, to inference the context in the train.txt and search the top-k candidates as the gray(hard negative) samples; corresponding dataset name: BERTDualInferenceContextDataset
    ./scripts/inference.sh <dataset_name> gray <cuda_ids>
    
    # 3. set the mode as the **gray-one2many** if you want to generate the extra positive samples for each context in the train set, the needings of this mode is the same as the **gray** work mode
    ./scripts/inference.sh <dataset_name> gray-one2many <cuda_ids>

    If you want to generate the pesudo positive pairs, run the following commands:

    # make sure the dual-bert inference dataset name is BERTDualInferenceDataset
    ./scripts/inference.sh <dataset_name> unparallel <cuda_ids>
  6. deploy the rerank and recall model

    # load the model on the cuda:0(can be changed in deploy.sh script)
    ./scripts/deploy.sh <cuda_id>

    at the same time, you can test the deployed model by using:

    # test_mode: recall, rerank, pipeline
    ./scripts/test_api.sh <test_mode> <dataset>
  7. test the recall performance of the elasticsearch

    Before testing the es recall, make sure the es index has been built:

    # recall_mode: q-q/q-r
    ./scripts/build_es_index.sh <dataset_name> <recall_mode>
    # recall_mode: q-q/q-r
    ./scripts/test_es_recall.sh <dataset_name> <recall_mode> 0
  8. simcse generate the gray responses

    # train the simcse model
    ./script/train.sh <dataset_name> simcse <cuda_ids>
    # generate the faiss index, dataset name: BERTSimCSEInferenceDataset
    ./script/inference_response.sh <dataset_name> simcse <cuda_ids>
    # generate the context index
    ./script/inference_simcse_response.sh <dataset_name> simcse <cuda_ids>
    # generate the test set for unlikelyhood-gen dataset
    ./script/inference_simcse_unlikelyhood_response.sh <dataset_name> simcse <cuda_ids>
    # generate the gray response
    ./script/inference_gray_simcse.sh <dataset_name> simcse <cuda_ids>
    # generate the test set for unlikelyhood-gen dataset
    ./script/inference_gray_simcse_unlikelyhood.sh <dataset_name> simcse <cuda_ids>
Owner
GMFTBY
Those who are crazy enough to think they can change the world are the ones who can.
GMFTBY
A simple anti-ghostping python bot made using diskord.

Anti Ghostping A simple Anti-Ghostping python bot made with ❤ using Diskord Requirements No one will use this but, all you need for this bot is: Pytho

RyZe 2 Sep 12, 2022
An API serving data on all creatures, monsters, materials, equipment, and treasure in The Legend of Zelda: Breath of the Wild

Hyrule Compendium API An API serving data on all creatures, monsters, materials, equipment, and treasure in The Legend of Zelda: Breath of the Wild. B

Aarav Borthakur 116 Dec 01, 2022
Protection-UB - Simple Group Protection userbot running on python3 with ARQ

Protection-UB Simple Group Protection userbot running on python3 with ARQ ⚠️ Not

szsupunma 1 Feb 06, 2022
light wrapper for indeed.com api

Simple wrapper for indeed api. go to indeed.com - register for api publisher token example from indeed import IndeedApi token = 'your token' api =

16 Sep 21, 2022
Spore REST API asyncio client

Spore REST API asyncio client

LEv145 16 Aug 02, 2022
Modified Version Of Media Search bot

Modified Version Of Media Search bot

1 Oct 09, 2021
QuickStart specific rules for cfn-python-lint

AWS Quick Start cfn-lint rules This repo provides CloudFormation linting rules specific to AWS Quick Start guidelines, for more information see the Co

AWS Quick Start 12 Jul 30, 2022
A multipurpose bot designed to make Discord better for everyone, written in Python.

Hadum A multipurpose bot that makes Discord better for everyone Features A Fully Functional Moderation component: manage your staff, members and permi

1 Jan 25, 2022
Scratch2py or S2py is a easy to use, versatile tool to communicate with the Scratch API Based of Scratch2py

Scratch2py Scratch2py or S2py is a easy to use, versatile tool to communicate with the Scratch API Based of Scratch2py Installation Run this command i

2 Jan 13, 2022
Powerful Url uploader bot With Mongodb support 🔥

Uploader X Bot Telegram RoBot to Upload Links. Features: 👉 Upload YouTube-dl Supported Links to Telegram. 👉 Upload HTTP/HTTPS as File/Video to Teleg

C͡linton Abraꫝam 250 Jan 06, 2023
An API or getting Optifine VersionsList/Version/Download-URL.

Optifine-API An API for getting Optifine VersionsList/Versions/Download-URL. Table of contents Get Versions List Get Specify Versions Download Optifin

2 Dec 04, 2022
This is Telegram Files Store Bot by @AbirHasan2005

PyroFilesStoreBot This is Telegram Parmanent Files Store Bot by @AbirHasan2005. Language: Python3 Library: Pyrogram Features: In PM Just Forward or Se

Abir Hasan 168 Dec 19, 2022
Useful tools for building interactions in Python

discord-interactions-python Types and helper functions for Discord Interactions webhooks. Installation Available via pypi: pip install discord-interac

Discord 77 Dec 07, 2022
Python client for the iNaturalist APIs

pyinaturalist Introduction iNaturalist is a community science platform that helps people get involved in the natural world by observing and identifyin

Nicolas Noé 79 Dec 22, 2022
A Advanced Powerful, Smart And Intelligent Group Management Bot With New And Powerful Features

Vegeta Robot A Advanced Powerful, Smart And Intelligent Group Management Bot With New And Powerful Features ... Written with Pyrogram and Telethon...

⚡ CT_PRO ⚡ 9 Nov 16, 2022
Prime Mega is a modular bot running on python3 with autobots theme and have a lot features.

PRIME MEGA Prime Mega is a modular bot running on python3 with autobots theme and have a lot features. Easiest Way To Deploy On Heroku This Bot is Cre

『TØNIC』 乂 ₭ILLΣR 45 Dec 15, 2022
Project for QVault Hackathon which plays sounds based on the letters of a user's name

virtual_instrument Project for QVault Hackathon which plays sounds based on the letters of a user's name I created a virtual instrument using Python a

Paolo Sidera 2 Feb 11, 2022
Automated JSON API based communication with Fronius Symo

PyFronius - a very basic Fronius python bridge A package that connects to a Fronius device in the local network and provides data that is provided via

Niels Mündler 10 Dec 30, 2022
Simple Discord bot which logs several events in your server

logging-bot Simple Discord bot which logs several events in your server, including: Message Edits Message Deletes Role Adds Role Removes Member joins

1 Feb 14, 2022
Web3 Pancakeswap Sniper & honeypot detector Take Profit/StopLose bot written in python3, For ANDROID WIN MAC & LINUX

🏆 Pancakeswap BSC Sniper Bot web3 with honeypot detector (ANDROID WINDOWS MAC LINUX) 🥇 ⭐️ ⭐️ ⭐️ First SNIPER BOT for ANDROID & WINDOWS with honeypot

HYDRA 2 Dec 24, 2021