mkultra

mkultra is a prompt tuning toolkit for GPT-2 and GPT-Neo.

Prompt tuning injects a sequence of 20-100 special tokens into the context to steer text generation. The embeddings of these tokens are trained on a corpus much like a finetune, but take up a fraction of the space: the Neuromancer example is only 401 KB for 100 tokens.
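To give a rough sense of why a soft prompt is so small, the sketch below counts the trainable values in a 100-token prompt. It assumes GPT-2 small's embedding width of 768 and an approximate total of 124 million model parameters. The function and constant names are made up for illustration and are not part of mkultra. The size of a saved file also depends on how it is serialized, which is not covered here.

```python
def soft_prompt_param_count(n_tokens: int, n_embd: int) -> int:
    """Number of trainable values in a soft prompt of n_tokens embeddings."""
    return n_tokens * n_embd


# Assumptions for illustration (not taken from the mkultra source):
N_TOKENS = 100                    # soft prompt length used by the Neuromancer example
GPT2_SMALL_N_EMBD = 768           # embedding width of GPT-2 small
GPT2_SMALL_PARAMS = 124_000_000   # approximate parameter count of GPT-2 small

params = soft_prompt_param_count(N_TOKENS, GPT2_SMALL_N_EMBD)
raw_float32_bytes = params * 4    # 4 bytes per value if stored as raw float32
fraction = params / GPT2_SMALL_PARAMS

print(f"trainable values: {params}")
print(f"raw float32 size: {raw_float32_bytes} bytes")
print(f"share of full model parameters: {fraction:.4%}")
```

A full finetune has to store every model weight, while the soft prompt stores only these embeddings.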

Read the original paper: https://arxiv.org/abs/2104.08691

Text Generation

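# pipeline is from Hugging Face Transformers; GPT2SoftPromptLM,
# GPT2SPTokenizerFast and SoftPrompt must be imported from mkultra.
from transformers import pipeline

model = GPT2SoftPromptLM.from_pretrained("gpt2")
tokenizer = GPT2SPTokenizerFast.from_pretrained("gpt2")
generator = pipeline('text-generation', model=model, tokenizer=tokenizer)

# Load a trained soft prompt and prepend it to ordinary text.
sp = SoftPrompt.from_file("sample_sps/finetune/neuromancer_gpt2.json")
prompt = sp + "The sky over the port"
output = generator(prompt)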

SoftPrompts can be concatenated at any point into your context as if they were strings. When the context is printed, SoftPrompts show up as human-readable tags for debugging. They also tokenize to the underlying number of tokens for easy budgeting.
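Because a soft prompt occupies real positions in the context, it has to be counted against the model's context window. The sketch below shows that arithmetic, assuming GPT-2's 1024-token window. The helper name and the example token counts are hypothetical and are not part of mkultra.

```python
def remaining_tokens(context_window: int, soft_prompt_tokens: int, text_tokens: int) -> int:
    """Tokens left for generation once the soft prompt and the text are in the context."""
    used = soft_prompt_tokens + text_tokens
    if used > context_window:
        raise ValueError(f"context overflow: {used} tokens used, {context_window} available")
    return context_window - used


GPT2_CONTEXT_WINDOW = 1024  # GPT-2's maximum sequence length

# Hypothetical example: a 100-token soft prompt plus 200 tokens of ordinary text.
left = remaining_tokens(GPT2_CONTEXT_WINDOW, soft_prompt_tokens=100, text_tokens=200)
print(left)
```

In practice the token counts would come from tokenizing the full context, soft prompts included.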

See the text generation notebook for pointers on adding mkultra to your generator.

Training

For finetune-like soft prompts, the finetune notebook demonstrates training on a corpus.

For AI text adventures or writing, the World Info notebook demonstrates tuning a soft prompt to describe a character or setting. This is highly experimental.

Limitations (for now)

The Hugging Face Trainer class should work as long as you set params=[model.get_soft_params()] on the optimizer, but it will still save full model checkpoints.
mkultra syncs a set of special tokens between its tokenizers behind the scenes. Adding your own tokens may result in unexpected behaviour.
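The optimizer setup from the first limitation can be sketched with a toy PyTorch model. Everything here is an assumption for illustration: ToySoftPromptModel is not mkultra's class, its get_soft_params only mimics the method named above, and the causal running mean stands in for a real transformer. The point is only that the optimizer receives the soft prompt and nothing else, so the base weights stay fixed.

```python
import torch
from torch import nn


class ToySoftPromptModel(nn.Module):
    """Tiny stand-in for a soft-prompt language model (hypothetical, not mkultra's class)."""

    def __init__(self, vocab_size: int = 50, n_embd: int = 8, n_soft_tokens: int = 4):
        super().__init__()
        self.wte = nn.Embedding(vocab_size, n_embd)
        self.head = nn.Linear(n_embd, vocab_size)
        self.soft_prompt = nn.Parameter(torch.randn(n_soft_tokens, n_embd) * 0.02)
        # Freeze everything except the soft prompt.
        for p in list(self.wte.parameters()) + list(self.head.parameters()):
            p.requires_grad_(False)

    def get_soft_params(self) -> nn.Parameter:
        return self.soft_prompt

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        tok = self.wte(input_ids)  # (batch, seq, n_embd)
        soft = self.soft_prompt.unsqueeze(0).expand(tok.size(0), -1, -1)
        x = torch.cat([soft, tok], dim=1)  # soft prompt goes in front of the text
        # Causal running mean so that later positions depend on the soft prompt.
        steps = torch.arange(1, x.size(1) + 1, dtype=x.dtype).view(1, -1, 1)
        return self.head(torch.cumsum(x, dim=1) / steps)


torch.manual_seed(0)
model = ToySoftPromptModel()
# Only the soft prompt is handed to the optimizer.
optimizer = torch.optim.Adam([model.get_soft_params()], lr=1e-2)

input_ids = torch.randint(0, 50, (2, 5))
soft_before = model.soft_prompt.detach().clone()
wte_before = model.wte.weight.detach().clone()

logits = model(input_ids)
n_soft = model.soft_prompt.size(0)
pred = logits[:, n_soft:-1, :]   # predictions made at text positions
target = input_ids[:, 1:]        # next-token targets
loss = nn.functional.cross_entropy(pred.reshape(-1, pred.size(-1)), target.reshape(-1))

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

With the Hugging Face Trainer, an optimizer built this way can be supplied through the Trainer's optimizers argument. As noted above, the checkpoints it writes will still contain the full model.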
