mkultra

mkultra is a prompt tuning toolkit for GPT-2 and GPT-Neo.

Prompt tuning injects a sequence of 20-100 special tokens into the context to steer text generation. These tokens are trained on a corpus much like a finetune, but take up a fraction of the space: the Neuromancer example is only 401 KB for 100 tokens.
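As a rough illustration of the size difference, the arithmetic below compares the parameter count of a 100-token soft prompt with that of the full model. It assumes GPT-2 small (hidden size 768, roughly 124 million parameters) and 32-bit floats. The on-disk size of a saved soft prompt also depends on the file format, which is not modeled here.

```python
# Back-of-the-envelope size comparison.
# Assumptions: GPT-2 small, hidden size 768, ~124M parameters,
# 4 bytes per float32 value. File-format overhead is ignored.
HIDDEN_SIZE = 768
MODEL_PARAMS = 124_000_000  # approximate
BYTES_PER_FLOAT = 4


def soft_prompt_params(n_tokens, hidden_size=HIDDEN_SIZE):
    """Each soft token is one trainable embedding vector."""
    return n_tokens * hidden_size


n_tokens = 100
sp_params = soft_prompt_params(n_tokens)
sp_bytes = sp_params * BYTES_PER_FLOAT
fraction = sp_params / MODEL_PARAMS

print(f"soft prompt parameters: {sp_params:,}")
print(f"raw float32 size: {sp_bytes / 1024:.0f} KiB")
print(f"share of model parameters: {fraction:.4%}")
```

Under these assumptions the soft prompt holds well under a tenth of a percent of the model's parameters, which is why it is so much cheaper to store and share than a finetuned checkpoint.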

Read the original paper: https://arxiv.org/abs/2104.08691

Text Generation

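from transformers import pipeline
# GPT2SoftPromptLM, GPT2SPTokenizerFast and SoftPrompt are provided by mkultra;
# see the text generation notebook for the exact import paths.

# Load a soft-prompt-aware model and tokenizer, then build a generation pipeline
model = GPT2SoftPromptLM.from_pretrained("gpt2")
tokenizer = GPT2SPTokenizerFast.from_pretrained("gpt2")
generator = pipeline('text-generation', model=model, tokenizer=tokenizer)

# Load a trained soft prompt and prepend it to ordinary text
sp = SoftPrompt.from_file("sample_sps/finetune/neuromancer_gpt2.json")
prompt = sp + "The sky over the port"
output = generator(prompt)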

SoftPrompts can be concatenated at any point into your context as if they were strings. When the context is printed, SoftPrompts show up as human-readable tags for debugging. They also tokenize to the underlying number of tokens for easy budgeting.
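The behaviour described above can be pictured with a small stand-in class. This is a hypothetical sketch, not mkultra's implementation: `ToySoftPrompt`, its tag format, and `count_tokens` are invented for illustration, and whitespace splitting stands in for a real tokenizer.

```python
# Hypothetical illustration only; none of these names come from mkultra.
class ToySoftPrompt:
    def __init__(self, name, n_tokens):
        self.name = name
        self.n_tokens = n_tokens

    def tag(self):
        # Human-readable marker that appears when the context is printed.
        return f"<{self.name}-{self.n_tokens}>"

    def __str__(self):
        return self.tag()

    def __add__(self, other):
        # soft_prompt + "text"
        return str(self) + str(other)

    def __radd__(self, other):
        # "text" + soft_prompt
        return str(other) + str(self)


def count_tokens(context, prompts):
    """Count whitespace-separated words, expanding each tag to the
    number of tokens its soft prompt occupies."""
    total = 0
    for word in context.split():
        for sp in prompts:
            if word == sp.tag():
                total += sp.n_tokens
                break
        else:
            total += 1
    return total


sp = ToySoftPrompt("neuromancer", 100)
context = sp + " The sky over the port"
print(context)
print(count_tokens(context, [sp]))
```

The point of the design is that the context stays an ordinary, printable string, while the token count still reflects the full length of every soft prompt embedded in it.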

See the text generation notebook for pointers on adding mkultra to your generator.

Training

For finetune-like soft prompts, the finetune notebook demonstrates training on a corpus.

For AI text adventures or writing, the World Info notebook demonstrates tuning a soft prompt to describe a character or setting. This is highly experimental.

Limitations (for now)

The Huggingface Trainer class should work as long as you set params=[model.get_soft_params()] on the optimizer, but it will still save full model checkpoints.
mkultra syncs a set of special tokens between its tokenizers behind the scenes. Adding your own tokens may result in unexpected behaviour.
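The optimizer note above amounts to training only the soft prompt while the base model stays frozen. The sketch below shows that pattern in plain PyTorch on a toy model. It does not use mkultra: `embed`, `head`, `soft_prompt`, and the pooling step are stand-ins chosen for illustration. With mkultra, the parameter list would instead come from model.get_soft_params(), as described above.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy stand-in for a pretrained model: an embedding plus a linear head.
vocab_size, hidden = 50, 16
embed = nn.Embedding(vocab_size, hidden)
head = nn.Linear(hidden, vocab_size)
for p in list(embed.parameters()) + list(head.parameters()):
    p.requires_grad_(False)  # base weights stay frozen

# Trainable soft prompt: n_tokens embedding vectors prepended to the input.
n_tokens = 4
soft_prompt = nn.Parameter(torch.randn(n_tokens, hidden) * 0.02)

# Only the soft prompt is handed to the optimizer.
optimizer = torch.optim.AdamW([soft_prompt], lr=1e-2)

input_ids = torch.randint(0, vocab_size, (1, 8))
targets = torch.randint(0, vocab_size, (1, 8))
head_before = head.weight.clone()
prompt_before = soft_prompt.detach().clone()

# Prepend the soft prompt, then mean-pool so it influences every
# prediction in this toy model (a real LM would use attention instead).
token_embeds = embed(input_ids)
inputs = torch.cat([soft_prompt.unsqueeze(0), token_embeds], dim=1)
pooled = inputs.mean(dim=1, keepdim=True)
logits = head(token_embeds + pooled)

loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()
optimizer.step()

print("base weights unchanged:", torch.equal(head.weight, head_before))
print("soft prompt updated:", not torch.equal(soft_prompt.detach(), prompt_before))
```

Because the optimizer never sees the base weights, only the soft prompt needs to be saved after training. That is the reason full model checkpoints written by a generic trainer are larger than necessary.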
