Code and datasets for our paper "PTR: Prompt Tuning with Rules for Text Classification"

Last update: Dec 30, 2022

PTR

If you use the code, please cite the following paper:

@article{han2021ptr,
  title={PTR: Prompt Tuning with Rules for Text Classification},
  author={Han, Xu and Zhao, Weilin and Ding, Ning and Liu, Zhiyuan and Sun, Maosong},
  journal={arXiv preprint arXiv:2105.11259},
  year={2021}
}

Requirements

The model is implemented in PyTorch. The minimum package versions we used are listed below.

numpy>=1.18.0
scikit-learn>=0.22.1
scipy>=1.4.1
torch>=1.3.0
tqdm>=4.41.1
transformers>=4.0.0
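
One possible way to install these is to put the list above in a pip requirements file. This is a minimal sketch: the file name ptr-requirements.txt is a hypothetical choice made here, not something shipped with the repository, and the install line is left commented out because it needs network access.

```shell
# Write the version constraints listed above to a pip requirements file.
# NOTE: "ptr-requirements.txt" is a hypothetical name chosen for this sketch.
cat > ptr-requirements.txt <<'EOF'
numpy>=1.18.0
scikit-learn>=0.22.1
scipy>=1.4.1
torch>=1.3.0
tqdm>=4.41.1
transformers>=4.0.0
EOF

# Install when ready (requires network access):
# python -m pip install -r ptr-requirements.txt
```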

Baselines

Some baselines, especially those using entity markers, come from the RE_improved_baseline project.

Datasets

We provide all the datasets and prompts used in our experiments.

Run the experiments

(1) For TACRED

# Create the result directories (-p skips any that already exist), then launch the run.
mkdir -p results/tacred/train results/tacred/val results/tacred/test
cd code_script
bash run_large_tacred.sh

(2) For TACREV

# Create the result directories (-p skips any that already exist), then launch the run.
mkdir -p results/tacrev/train results/tacrev/val results/tacrev/test
cd code_script
bash run_large_tacrev.sh

(3) For RETACRED

# Create the result directories (-p skips any that already exist), then launch the run.
mkdir -p results/retacred/train results/retacred/val results/retacred/test
cd code_script
bash run_large_retacred.sh
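
The three setups above use the same directory layout under results, so the preparation step can be sketched as a single loop run from the repository root. This is a convenience sketch rather than part of the released scripts. The directory names are taken from the commands above, and the loop only creates directories, so each run_large_*.sh script still has to be started from code_script afterwards.

```shell
# Prepare results/<dataset>/{train,val,test} for all three benchmarks.
# Run from the repository root; mkdir -p is safe to repeat.
set -eu

for dataset in tacred tacrev retacred; do
  for split in train val test; do
    mkdir -p "results/${dataset}/${split}"
  done
done
```

After this, each experiment reduces to changing into code_script and running the matching script, for example bash run_large_tacred.sh.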

Owner

THUNLP