EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks

Overview

EncT5

(Unofficial) PyTorch implementation of EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks.

About

  • Fine-tune the T5 model for classification and regression using only its encoder layers.
  • Implements the tokenizer and model for EncT5.
  • Adds a BOS token (<s>) to the tokenizer and uses this token for classification & regression.
    • The token embeddings must be resized because the vocabulary size changes (model.resize_token_embeddings()).
  • BOS and EOS tokens are added automatically (see the sketch after this list):
    • single sequence: <s> X </s>
    • pair of sequences: <s> A </s> B </s>
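
For illustration, the snippet below is a minimal sketch of how the tokenizer wraps inputs with BOS/EOS; the example sentences are made up, and the exact token strings printed depend on the checkpoint and the tokenizer's added-token setup.

from enc_t5 import EncT5Tokenizer

tokenizer = EncT5Tokenizer.from_pretrained("t5-base")

# Single sequence -> <s> X </s>
single = tokenizer("EncT5 uses only the encoder.")
print(tokenizer.convert_ids_to_tokens(single["input_ids"]))

# Pair of sequences -> <s> A </s> B </s>
pair = tokenizer("First sentence.", "Second sentence.")
print(tokenizer.convert_ids_to_tokens(pair["input_ids"]))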

Requirements

It is highly recommended to use the same versions of the packages below, especially transformers.

transformers==4.15.0
torch==1.8.1
sentencepiece==0.1.96
datasets==1.17.0
scikit-learn==0.24.2

How to Use

from enc_t5 import EncT5ForSequenceClassification, EncT5Tokenizer

model = EncT5ForSequenceClassification.from_pretrained("t5-base")
tokenizer = EncT5Tokenizer.from_pretrained("t5-base")

# Resize the token embeddings because a BOS token was added to the vocabulary
if model.config.vocab_size < len(tokenizer.get_vocab()):
    model.resize_token_embeddings(len(tokenizer.get_vocab()))
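
A hypothetical end-to-end inference call could then look like the following sketch. The num_labels value, the example text, and the assumption that the model returns a standard SequenceClassifierOutput with .logits are illustrative only, not fixed by this repo.

import torch
from enc_t5 import EncT5ForSequenceClassification, EncT5Tokenizer

# num_labels=2 is an assumed example value
model = EncT5ForSequenceClassification.from_pretrained("t5-base", num_labels=2)
tokenizer = EncT5Tokenizer.from_pretrained("t5-base")

# Resize the token embeddings because a BOS token was added to the vocabulary
if model.config.vocab_size < len(tokenizer.get_vocab()):
    model.resize_token_embeddings(len(tokenizer.get_vocab()))

inputs = tokenizer("This movie was great!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Assumes a standard classification head output with .logits
predicted_class = outputs.logits.argmax(dim=-1).item()
print(predicted_class)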

Finetune on GLUE

Setup

  • Use T5 1.1 base for fine-tuning.
  • Evaluate on TPU. See run_glue_tpu.sh for more details.
  • Use the AdamW optimizer instead of Adafactor.
  • Select the best checkpoint at every epoch using EarlyStoppingCallback (a rough Trainer sketch follows this list).
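
The setup above maps roughly onto the standard transformers Trainer API, which uses AdamW by default and keeps the best epoch checkpoint via load_best_model_at_end plus EarlyStoppingCallback. The sketch below is illustrative only: the checkpoint name, the SST-2 task, and all hyperparameters are assumptions, and run_glue_tpu.sh remains the authoritative reference for the actual settings.

from datasets import load_dataset
from transformers import Trainer, TrainingArguments, EarlyStoppingCallback
from enc_t5 import EncT5ForSequenceClassification, EncT5Tokenizer

# Illustrative checkpoint / task / hyperparameters; see run_glue_tpu.sh for the real settings
model = EncT5ForSequenceClassification.from_pretrained("google/t5-v1_1-base", num_labels=2)
tokenizer = EncT5Tokenizer.from_pretrained("google/t5-v1_1-base")
if model.config.vocab_size < len(tokenizer.get_vocab()):
    model.resize_token_embeddings(len(tokenizer.get_vocab()))

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True, max_length=128)

dataset = load_dataset("glue", "sst2").map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="outputs",
    evaluation_strategy="epoch",    # evaluate and save once per epoch
    save_strategy="epoch",
    load_best_model_at_end=True,    # keep the best checkpoint, as in the Setup above
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,            # enables dynamic padding via DataCollatorWithPadding
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],  # Trainer's default optimizer is AdamW
)
trainer.train()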

Results

Task    Metric            Result (Paper)   Result (Implementation)
CoLA    Matthews corr.    53.1             52.4
SST-2   Accuracy          94.0             94.5
MRPC    F1 / Accuracy     91.5 / 88.3      91.7 / 88.0
STS-B   PCC / SCC         80.5 / 79.3      88.0 / 88.3
QQP     F1 / Accuracy     72.9 / 89.8      88.4 / 91.3
MNLI    Mis/Matched       88.0 / 86.7      87.5 / 88.1
QNLI    Accuracy          93.3             93.2
RTE     Accuracy          67.8             69.7
Comments
  • Enable tokenizer to be loaded by sentence-transformer

    🚀 Feature Request

    Integration with the sentence-transformers library.

    📎 Additional context

    I tried to load this tokenizer with the sentence-transformers library, but it failed: AutoTokenizer couldn't load it. So I simply added code overriding save_pretrained and its dependencies so that this tokenizer is saved as T5Tokenizer, its superclass.

        def save_pretrained(
            self,
            save_directory,
            legacy_format: Optional[bool] = None,
            filename_prefix: Optional[str] = None,
            push_to_hub: bool = False,
        ):
            if os.path.isfile(save_directory):
                logger.error(f"Provided path ({save_directory}) should be a directory, not a file")
                return
    
            if push_to_hub:
                commit_message = kwargs.pop("commit_message", None)
                repo = self._create_or_get_repo(save_directory, **kwargs)
    
            os.makedirs(save_directory, exist_ok=True)
    
            special_tokens_map_file = os.path.join(
                save_directory, (filename_prefix + "-" if filename_prefix else "") + SPECIAL_TOKENS_MAP_FILE
            )
            tokenizer_config_file = os.path.join(
                save_directory, (filename_prefix + "-" if filename_prefix else "") + TOKENIZER_CONFIG_FILE
            )
    
            tokenizer_config = copy.deepcopy(self.init_kwargs)
            if len(self.init_inputs) > 0:
                tokenizer_config["init_inputs"] = copy.deepcopy(self.init_inputs)
            for file_id in self.vocab_files_names.keys():
                tokenizer_config.pop(file_id, None)
    
            # Sanitize AddedTokens
            def convert_added_tokens(obj: Union[AddedToken, Any], add_type_field=True):
                if isinstance(obj, AddedToken):
                    out = obj.__getstate__()
                    if add_type_field:
                        out["__type"] = "AddedToken"
                    return out
                elif isinstance(obj, (list, tuple)):
                    return list(convert_added_tokens(o, add_type_field=add_type_field) for o in obj)
                elif isinstance(obj, dict):
                    return {k: convert_added_tokens(v, add_type_field=add_type_field) for k, v in obj.items()}
                return obj
    
            # add_type_field=True to allow dicts in the kwargs / differentiate from AddedToken serialization
            tokenizer_config = convert_added_tokens(tokenizer_config, add_type_field=True)
    
            # Add tokenizer class to the tokenizer config to be able to reload it with from_pretrained
            ############################################################################
            tokenizer_class = self.__class__.__base__.__name__
            ############################################################################
            # Remove the Fast at the end unless we have a special `PreTrainedTokenizerFast`
            if tokenizer_class.endswith("Fast") and tokenizer_class != "PreTrainedTokenizerFast":
                tokenizer_class = tokenizer_class[:-4]
            tokenizer_config["tokenizer_class"] = tokenizer_class
            if getattr(self, "_auto_map", None) is not None:
                tokenizer_config["auto_map"] = self._auto_map
            if getattr(self, "_processor_class", None) is not None:
                tokenizer_config["processor_class"] = self._processor_class
    
            # If we have a custom model, we copy the file defining it in the folder and set the attributes so it can be
            # loaded from the Hub.
            if self._auto_class is not None:
                custom_object_save(self, save_directory, config=tokenizer_config)
    
            with open(tokenizer_config_file, "w", encoding="utf-8") as f:
                f.write(json.dumps(tokenizer_config, ensure_ascii=False))
            logger.info(f"tokenizer config file saved in {tokenizer_config_file}")
    
            # Sanitize AddedTokens in special_tokens_map
            write_dict = convert_added_tokens(self.special_tokens_map_extended, add_type_field=False)
            with open(special_tokens_map_file, "w", encoding="utf-8") as f:
                f.write(json.dumps(write_dict, ensure_ascii=False))
            logger.info(f"Special tokens file saved in {special_tokens_map_file}")
    
            file_names = (tokenizer_config_file, special_tokens_map_file)
    
            save_files = self._save_pretrained(
                save_directory=save_directory,
                file_names=file_names,
                legacy_format=legacy_format,
                filename_prefix=filename_prefix,
            )
    
            if push_to_hub:
                url = self._push_to_hub(repo, commit_message=commit_message)
                logger.info(f"Tokenizer pushed to the hub in this commit: {url}")
    
            return save_files
    
    enhancement 
    opened by kwonmha 0
Releases (v1.0.0)
  • v1.0.0(Jan 22, 2022)

    What’s Changed

    🚀 Features

    • Add GLUE Trainer (#2) @monologg
    • Add Template & EncT5 model and tokenizer (#1) @monologg

    ✏️ Documentation

    • Add readme & script (#3) @monologg
Owner
Jangwon Park