A high-level yet extensible library for fast language model tuning via automatic prompt search

Last update: Dec 07, 2022

Related tags

Overview

ruPrompts

ruPrompts is a high-level yet extensible library for fast language model tuning via automatic prompt search, featuring integration with HuggingFace Hub, configuration system powered by Hydra, and command line interface.

Prompt is a text instruction for language model, like

Translate English to French:
cat =>

For some tasks the prompt is obvious, but for some it isn't. With ruPrompts you can define only the prompt format, like {text}, and train it automatically for any task, if you have a training dataset.

You can currently use ruPrompts for text-to-text tasks, such as summarization, detoxification, style transfer, etc., and for styled text generation, as a special case of text-to-text.

Features

Modular structure for convenient extensibility
Integration with HF Transformers, support for all models with LM head
Integration with HF Hub for sharing and loading pretrained prompts
CLI and configuration system powered by Hydra
Pretrained prompts for ruGPT-3

Installation

ruPrompts can be installed with pip:

pip install ruprompts[hydra]

See Installation for other installation options.

Usage

Loading a pretrained prompt for styled text generation:

>> ppln_joke("Говорит кружка ложке") [{"generated_text": 'Говорит кружка ложке: "Не бойся, не утонешь!".'}]">

>>> import ruprompts
>>> from transformers import pipeline

>>> ppln_joke = pipeline("text-generation-with-prompt", prompt="konodyuk/prompt_rugpt3large_joke")
>>> ppln_joke("Говорит кружка ложке")
[{"generated_text": 'Говорит кружка ложке: "Не бойся, не утонешь!".'}]

For text2text tasks:

>> ppln_detox("Опять эти тупые дятлы все испортили, чтоб их черти взяли") [{"generated_text": 'Опять эти люди все испортили'}]">

>>> ppln_detox = pipeline("text2text-generation-with-prompt", prompt="konodyuk/prompt_rugpt3large_detox_russe")
>>> ppln_detox("Опять эти тупые дятлы все испортили, чтоб их черти взяли")
[{"generated_text": 'Опять эти люди все испортили'}]

Proceed to Quick Start for a more detailed introduction or start using ruPrompts right now with our Colab Tutorials.

License

ruPrompts is Apache 2.0 licensed. See the LICENSE file for details.

A high-level yet extensible library for fast language model tuning via automatic prompt search

Related tags

Overview

ruPrompts

Features

Installation

Usage

License

Owner

Sber AI

Twitter-Sentiment-Analysis - Twitter sentiment analysis for india's top online retailers(2019 to 2022)

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

Need: Image Search With Python

My Implementation for the paper EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks using Tensorflow

BMInf (Big Model Inference) is a low-resource inference package for large-scale pretrained language models (PLMs).

PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.

OpenChat: Opensource chatting framework for generative models

TunBERT is the first release of a pre-trained BERT model for the Tunisian dialect using a Tunisian Common-Crawl-based dataset.

Speech Recognition for Uyghur using Speech transformer

Shirt Bot is a discord bot which uses GPT-3 to generate text

AI_Assistant - This is a Python based Voice Assistant.

Transformers and related deep network architectures are summarized and implemented here.

code for "AttentiveNAS Improving Neural Architecture Search via Attentive Sampling"

Behavioral Testing of Clinical NLP Models

This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference"

This repository collects together basic linguistic processing data for using dataset dumps from the Common Voice project

Final Project for the Intel AI Readiness Boot Camp NLP (Jan)

Python library for Serbian Natural language processing (NLP)

SHAS: Approaching optimal Segmentation for End-to-End Speech Translation

Data loaders and abstractions for text and NLP