DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference

Last update: Nov 14, 2022

Overview

DeeBERT

This is the code base for the paper DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference.

Code in this repository is also available in the Hugging Face Transformers repo (with minor modifications for version compatibility). Check this page for models that we have trained in advance (the latest version of the Hugging Face Transformers library is needed).

Installation

This repo is tested on Python 3.7.5, PyTorch 1.3.1, and CUDA 10.1. Using a virtualenv or conda environment is recommended, for example:

conda install pytorch==1.3.1 torchvision cudatoolkit=10.1 -c pytorch

After installing the required environment, clone this repo, and install the following requirements:

git clone https://github.com/castorini/deebert
cd deebert
pip install -r ./requirements.txt
pip install -r ./examples/requirements.txt

Usage

There are four scripts in the scripts folder, which can be run from the repo root, e.g., scripts/train.sh.

In each script, there are several things to modify before running:

path to the GLUE dataset. Check this for more details.
path for saving fine-tuned models. Default: ./saved_models.
path for saving evaluation results. Default: ./plotting. Results are printed to stdout and also saved to npy files in this directory to facilitate plotting figures and further analyses.
model_type (bert or roberta)
model_size (base or large)
dataset (SST-2, MRPC, RTE, QNLI, QQP, or MNLI)

train.sh

This is for fine-tuning and evaluating models as in the original BERT paper.

train_highway.sh

This is for fine-tuning DeeBERT models.

eval_highway.sh

This is for evaluating each exit layer for fine-tuned DeeBERT models.
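As a rough illustration of what evaluating each exit layer can amount to, the sketch below computes classification accuracy separately for the predictions made at each exit layer. It is not the repository's implementation: the function name `accuracy_per_exit_layer` and the toy predictions and labels are hypothetical, chosen only to show the idea.

```python
def accuracy_per_exit_layer(per_layer_predictions, labels):
    """Return one accuracy value per exit layer.

    per_layer_predictions[i][j] is the class predicted for example j
    when the model is forced to exit at layer i (hypothetical layout).
    """
    accuracies = []
    for predictions in per_layer_predictions:
        correct = sum(int(p == y) for p, y in zip(predictions, labels))
        accuracies.append(correct / len(labels))
    return accuracies


# Toy data (made up for illustration): 4 examples, 3 exit layers.
toy_labels = [1, 0, 1, 1]
toy_predictions = [
    [0, 0, 1, 0],  # exit after layer 1
    [1, 0, 1, 0],  # exit after layer 2
    [1, 0, 1, 1],  # exit after layer 3
]

for layer, acc in enumerate(accuracy_per_exit_layer(toy_predictions, toy_labels), start=1):
    print(f"exit layer {layer}: accuracy = {acc:.2f}")
```

In this toy data the later exit layers are more accurate, which is the kind of quality versus depth trade-off a per-layer evaluation is meant to expose.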

eval_entropy.sh

This is for evaluating fine-tuned DeeBERT models, given a number of different early exit entropy thresholds.
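The idea behind an early exit entropy threshold can be sketched in a few lines: at each exit layer, compute the entropy of the predicted class distribution, and stop at the first layer whose entropy falls below the threshold. The code below is a minimal pure-Python sketch under that assumption, not the repository's code; the names `softmax`, `entropy`, and `early_exit` and the toy logits are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def early_exit(per_layer_logits, threshold):
    """Return (exit_layer, predicted_class), with exit_layer 1-indexed.

    Exits at the first layer whose output entropy is below `threshold`;
    if no layer is confident enough, the last layer's prediction is used.
    """
    num_layers = len(per_layer_logits)
    for layer, logits in enumerate(per_layer_logits, start=1):
        probs = softmax(logits)
        if entropy(probs) < threshold or layer == num_layers:
            return layer, probs.index(max(probs))


# Toy logits (made up for illustration): one example, 3 exit layers, 2 classes.
toy_logits = [[0.1, 0.2], [0.5, 1.5], [0.2, 4.0]]

for threshold in (0.0, 0.3, 0.6, 0.7):
    layer, label = early_exit(toy_logits, threshold)
    print(f"threshold={threshold}: exit at layer {layer}, predicted class {label}")
```

A larger threshold lets examples exit earlier (faster inference, possibly lower accuracy), while a threshold of 0 forces every example through all layers. Sweeping several thresholds shows how that trade-off changes.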

Citation

Please cite our paper if you find the repository useful:

@inproceedings{xin-etal-2020-deebert,
    title = "{D}ee{BERT}: Dynamic Early Exiting for Accelerating {BERT} Inference",
    author = "Xin, Ji  and
      Tang, Raphael  and
      Lee, Jaejun  and
      Yu, Yaoliang  and
      Lin, Jimmy",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.204",
    pages = "2246--2251",
}

Owner

Castorini
