PyTorch implementation of CoCon: A Self-Supervised Approach for Controlled Text Generation

Last update: Dec 18, 2022

Overview

COCON_ICLR2021

This is our PyTorch implementation of COCON.

CoCon: A Self-Supervised Approach for Controlled Text Generation (ICLR 2021)
Alvin Chan, Yew-Soon Ong, Bill Pung, Aston Zhang, Jie Fu
https://arxiv.org/abs/2010.02684

TL;DR: We propose CoCon to control the content of text generation from LMs by conditioning on content inputs at an interleave layer.

Requirements

Python 3.7.6 on Linux
PyTorch 1.4

Dependencies

Install dependencies with:

pip install -r requirements.txt

Dataset

Download COCON's training data from https://github.com/openai/gpt-2-output-dataset
Place the medium-345M-k40.${split}.jsonl files inside the data/gpt2output/ folder
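
Before training, it can help to confirm that the files landed where the training script expects them. The sketch below is only a convenience check and is not part of this codebase: the helper name missing_dataset_files is hypothetical, the split names train, valid, and test are an assumption (the text above only refers to ${split}), and data/gpt2output/ is assumed to be relative to the repository root.

```python
from pathlib import Path

# Assumed split names; the README only refers to them as ${split}.
SPLITS = ("train", "valid", "test")


def missing_dataset_files(data_dir="data/gpt2output", splits=SPLITS):
    """Return the expected medium-345M-k40 files that are not present."""
    data_dir = Path(data_dir)
    expected = [data_dir / f"medium-345M-k40.{split}.jsonl" for split in splits]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    missing = missing_dataset_files()
    if missing:
        print("Missing files:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All expected dataset files are present.")
```

Run it from the repository root; an empty result means every assumed split file was found.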

COCON Training

Train COCON with a GPT-2 language model, using the parameters reported in the paper:

sh train_cocon.sh

After training, the COCON block's weights will be saved as models/COCON/cocon_block_pytorch_model.bin.

Training Key Arguments

--do_train : whether to train COCON or not
--output_dir : directory of COCON weights
--model_name_or_path : type of language model to train COCON with
--output_hidden_for_cocon_after_block_ind : zero-based index of the transformer block whose hidden states are used as input to COCON for content conditioning. The value is 6 for the results reported in the paper, meaning that the output of GPT-2's 7th transformer block is used as the COCON block's input.
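
The zero-based indexing behind --output_hidden_for_cocon_after_block_ind can be illustrated with a toy sketch. It uses placeholder strings rather than real transformer blocks and assumes a 24-block stack (the size of GPT-2 medium). The function name split_blocks is hypothetical and does not appear in this repository, and treating the remaining blocks as running after the COCON block is an assumption drawn from the "interleave layer" description above.

```python
def split_blocks(blocks, after_block_ind):
    """Split a block list at a zero-based index.

    Returns the blocks up to and including `after_block_ind` (whose output
    would feed the COCON block) and the remaining blocks.
    """
    if not 0 <= after_block_ind < len(blocks):
        raise ValueError("after_block_ind must index an existing block")
    before = blocks[: after_block_ind + 1]
    after = blocks[after_block_ind + 1 :]
    return before, after


# Placeholder names standing in for transformer blocks (assumed count: 24).
blocks = [f"block_{i}" for i in range(24)]
before, after = split_blocks(blocks, after_block_ind=6)
print(len(before), before[-1])  # 7 block_6
print(len(after), after[0])     # 17 block_7
```

With a value of 6, seven blocks (block_0 through block_6) sit before the split, which is why the text refers to the 7th transformer block.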

Pretrained COCON weights

You can download COCON's pretrained weights (see the download link in the original repository) and save them in models/COCON/ to start generating with COCON.

COCON Controlled Generation

Sample script for generating COCON sentiment-controlled text:

sh generation/generate_cocon_sentiments.sh

Sample script for generating COCON topic-controlled text:

sh generation/generate_cocon_topics.sh

COCON-generated texts are stored under the cocon_output key in the output .jsonl files and labeled Cocon AR output in the output .txt files.
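
To pull the generated texts out of a .jsonl output file, something along these lines should work. This is a minimal sketch, assuming each non-empty line is a standalone JSON object; the helper name read_cocon_outputs and the example path are hypothetical, and only the cocon_output key comes from the description above.

```python
import json


def read_cocon_outputs(jsonl_path, key="cocon_output"):
    """Collect the value stored under `key` from each line of a .jsonl file."""
    outputs = []
    with open(jsonl_path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            if key in record:
                outputs.append(record[key])
    return outputs
```

For example, read_cocon_outputs("path/to/output.jsonl") returns a list with one entry per record that contains the key; substitute the path passed to --cocon_output_filename.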

Generation Key Arguments

--do_cocon_compute : whether to do COCON generation
--output_dir : directory of COCON block's weights
--model_name_or_path : type of language model
--cocon_output_filename : path of saved generation samples
--cocon_compute_history_source_data_file : filename of text file containing prompt texts for generation
--cocon_compute_context_source_data_file : filename of text file containing target content for generation

Summary of Key Folders/Files

transformers/: code for models and optimizers
transformers/modeling_gpt2.py: code for COCON block and GPT-2 language model
BOW/: target content tokens used for COCON topic control
attr_markers/: target content tokens used for COCON sentiment control
prompts/: prompt text used for text generation

Citation

If you find our repository useful, please consider citing our paper:

@inproceedings{
chan2021cocon,
title={CoCon: A Self-Supervised Approach for Controlled Text Generation},
author={Alvin Chan and Yew-Soon Ong and Bill Pung and Aston Zhang and Jie Fu},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=VD_ozqvBy4W}
}

Acknowledgements

Code is based largely on:

https://github.com/huggingface/transformers

Owner

alvinchangw
