PyTorch implementation of CoCon: A Self-Supervised Approach for Controlled Text Generation

Overview

COCON_ICLR2021

This is our PyTorch implementation of COCON.

CoCon: A Self-Supervised Approach for Controlled Text Generation (ICLR 2021)
Alvin Chan, Yew-Soon Ong, Bill Pung, Aston Zhang, Jie Fu
https://arxiv.org/abs/2010.02684

TL;DR: We propose CoCon to control the content of text generation from LMs by conditioning on content inputs at an interleave layer.

Requirements

  • Python 3.7.6 on Linux
  • PyTorch 1.4

Dependencies

Install dependencies with:

pip install -r requirements.txt

Dataset

  1. Download COCON's training data from https://github.com/openai/gpt-2-output-dataset
  2. Place the medium-345M-k40.${split}.jsonl files inside the data/gpt2output/ folder
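Before training, it can help to confirm that the files are where the training script expects them. The following is a minimal sketch and not part of the repository: `missing_splits` is a hypothetical helper, and the split names (train, valid, test) are an assumption based on the naming used in the gpt-2-output-dataset repository.

```python
from pathlib import Path

# Assumption: split names follow the gpt-2-output-dataset naming.
SPLITS = ("train", "valid", "test")


def missing_splits(data_dir="data/gpt2output", splits=SPLITS):
    """Return the expected dataset files that are not present in data_dir."""
    root = Path(data_dir)
    expected = [root / f"medium-345M-k40.{split}.jsonl" for split in splits]
    return [str(path) for path in expected if not path.is_file()]


if __name__ == "__main__":
    missing = missing_splits()
    if missing:
        print("Missing dataset files:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected dataset files found.")
```

Run it from the repository root so that the relative data/gpt2output/ path resolves correctly.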

COCON Training

Train COCON with a GPT-2 language model, with the parameters reported in the paper:

sh train_cocon.sh

After training, the COCON block's weights will be saved as models/COCON/cocon_block_pytorch_model.bin.

Training Key Arguments

--do_train : whether to train COCON or not
--output_dir : directory of COCON weights
--model_name_or_path : type of language model to train COCON with
--output_hidden_for_cocon_after_block_ind : zero-based index of the transformer block whose hidden states are used as input to COCON for content conditioning. The results reported in the paper use a value of 6, meaning that the output of GPT-2's 7th transformer block is used as the COCON block's input.
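
The zero-based indexing of --output_hidden_for_cocon_after_block_ind can be illustrated with a toy stand-in for the transformer stack. This is a sketch only, not the repository's code: `split_blocks` is a hypothetical helper, and the count of 24 blocks is assumed purely for illustration.

```python
def split_blocks(blocks, after_block_ind):
    """Split a block list at a zero-based index.

    Returns (blocks up to and including after_block_ind, remaining blocks).
    """
    if not 0 <= after_block_ind < len(blocks):
        raise ValueError("after_block_ind is out of range")
    return blocks[: after_block_ind + 1], blocks[after_block_ind + 1 :]


# Toy stand-in for a transformer stack; 24 blocks is an illustrative assumption.
blocks = [f"block_{i}" for i in range(24)]
before, after = split_blocks(blocks, after_block_ind=6)

print(len(before))  # 7: blocks 0..6 produce the hidden states fed to COCON
print(before[-1])   # block_6, i.e. the 7th transformer block
print(len(after))   # 17: the rest of the stack
```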

Pretrained COCON weights

You can download COCON's pretrained weights here and save them in models/COCON/ to start generating with COCON.

COCON Controlled Generation

Sample script for generating sentiment-controlled text with COCON:

sh generation/generate_cocon_sentiments.sh

Sample script for generating topic-controlled text with COCON:

sh generation/generate_cocon_topics.sh

COCON-generated texts correspond to the cocon_output key in the output .jsonl files and Cocon AR output in the output .txt files.
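
To pull the generated texts out of an output .jsonl file, a reader along the following lines may be enough. It is a minimal sketch that assumes each line of the file is one JSON object; `read_cocon_outputs` is a hypothetical helper, and only the cocon_output key comes from the description above.

```python
import json


def read_cocon_outputs(jsonl_path):
    """Collect the cocon_output field from each JSON line in the file."""
    outputs = []
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            if "cocon_output" in record:
                outputs.append(record["cocon_output"])
    return outputs
```

Pass it the path given to --cocon_output_filename (or wherever the generation script wrote its .jsonl output) and iterate over the returned list.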

Generation Key Arguments

--do_cocon_compute : whether to do COCON generation
--output_dir : directory of COCON block's weights
--model_name_or_path : type of language model
--cocon_output_filename : path of saved generation samples
--cocon_compute_history_source_data_file : filename of text file containing prompt texts for generation
--cocon_compute_context_source_data_file : filename of text file containing target content for generation

Summary of Key Folders/Files

  • transformers/: code for models and optimizers
  • transformers/modeling_gpt2.py: code for COCON block and GPT-2 language model
  • BOW/: target content tokens used for COCON topic control
  • attr_markers/: target content tokens used for COCON sentiment control
  • prompts/: prompt text used for text generation

Citation

If you find our repository useful, please consider citing our paper:

@inproceedings{
chan2021cocon,
title={CoCon: A Self-Supervised Approach for Controlled Text Generation},
author={Alvin Chan and Yew-Soon Ong and Bill Pung and Aston Zhang and Jie Fu},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=VD_ozqvBy4W}
}

Acknowledgements

Code is based largely on:
