A simple recipe for training Transformer models for multi-task learning on custom datasets and running inference with them. This repo contains two approaches.

Last update: Jan 02, 2023

Overview

multitask-learning-transformers

Colab Notebook

Trained Hugging Face Model

Install dependencies

pip install -r requirements.txt

Run training

python3 main.py \
        --model_name_or_path='roberta-base' \
        --per_device_train_batch_size=8 \
        --output_dir=output --num_train_epochs=1

Single Encoder Multiple Output Heads

A BERT-era multi-task model uses one shared BERT-style Transformer encoder with a separate output head for each task.
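The idea can be sketched in PyTorch as follows. This is a minimal illustration and not the repo's actual code: the class name `MultiHeadTaskModel`, the task names, and the tiny randomly initialised encoder (standing in for a pretrained model such as `roberta-base`) are all assumptions made to keep the example self-contained.

```python
import torch
from torch import nn


class MultiHeadTaskModel(nn.Module):
    """Hypothetical sketch: one shared encoder, one linear head per task."""

    def __init__(self, vocab_size, hidden_size, task_num_labels):
        super().__init__()
        # Tiny stand-in encoder (assumption); a real setup would load a
        # pretrained BERT-style encoder here instead.
        self.embed = nn.Embedding(vocab_size, hidden_size)
        layer = nn.TransformerEncoderLayer(
            d_model=hidden_size, nhead=2, dim_feedforward=64, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)
        # One classification head per task, keyed by task name.
        self.heads = nn.ModuleDict(
            {task: nn.Linear(hidden_size, n) for task, n in task_num_labels.items()}
        )

    def forward(self, input_ids, task):
        hidden = self.encoder(self.embed(input_ids))
        pooled = hidden[:, 0]  # use the first token as the sequence summary
        return self.heads[task](pooled)


# Illustrative task names and label counts (assumptions).
model = MultiHeadTaskModel(
    vocab_size=100, hidden_size=32, task_num_labels={"sentiment": 2, "topic": 5}
)
input_ids = torch.randint(0, 100, (4, 10))  # batch of 4 sequences, 10 tokens each
sentiment_logits = model(input_ids, task="sentiment")
topic_logits = model(input_ids, task="topic")
```

Every task passes through the same encoder weights, and only the final linear layer differs, so each head's output width matches its task's label count.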

Shared Encoder

Each task gets its own model, but all of the models share the same encoder.
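A rough PyTorch sketch of this second approach is shown here. The names `TinyEncoder` and `TaskModel`, the task names, and the small randomly initialised encoder are assumptions for illustration only; the repo's own implementation may differ.

```python
import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Stand-in for a pretrained encoder (assumption, for self-containment)."""

    def __init__(self, vocab_size=100, hidden_size=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        layer = nn.TransformerEncoderLayer(
            d_model=hidden_size, nhead=2, dim_feedforward=64, batch_first=True
        )
        self.layers = nn.TransformerEncoder(layer, num_layers=1)

    def forward(self, input_ids):
        # Return the first-token representation of each sequence.
        return self.layers(self.embed(input_ids))[:, 0]


class TaskModel(nn.Module):
    """A complete model for one task that wraps an externally owned encoder."""

    def __init__(self, encoder, hidden_size, num_labels):
        super().__init__()
        self.encoder = encoder  # the same object is passed to every task model
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, input_ids):
        return self.classifier(self.encoder(input_ids))


shared = TinyEncoder()
task_models = {
    "sentiment": TaskModel(shared, hidden_size=32, num_labels=2),
    "topic": TaskModel(shared, hidden_size=32, num_labels=5),
}
same_encoder = task_models["sentiment"].encoder is task_models["topic"].encoder

# Backpropagating through one task model produces gradients on the shared
# encoder, so training either task updates the weights that both tasks use.
input_ids = torch.randint(0, 100, (4, 10))
task_models["sentiment"](input_ids).sum().backward()
shared_grad = task_models["topic"].encoder.embed.weight.grad
```

Compared with a single model holding several heads, this layout keeps each task's model usable on its own while still tying the encoder parameters together.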

References: Multi-task Training with Transformers+NLP

Owner

Shahrukh Khan

CS Grad Student @ Saarland University
