In this workshop we will be exploring NLP state of the art transformers, with SOTA models like T5 and BERT, then build a model using HugginFace transformers framework.

Last update: Apr 13, 2022

Overview

Transformers are all you need

In this workshop we will be exploring NLP state of the art transformers, with SOTA models like T5 and BERT, then build a model using HugginFace transformers framework.

Table of Content

The workshop will be divided into four parts

Introduction to Transformers as a HYPE
Sneak peek to the theory behind Transfomers
Quick tour (Huggingface framework)
Lab
- fine tune a translation model

Note that you can always open the notebooks on Google Colab ( No need to install anything ) you just need a stable internet connection :

- fine tune a translation model

2. How to get started

Fork this repository
Create a branch by your name
Go through the notebook and complete all tasks
Submit a pull request

Homework exercise

Your task is to fine-tune a classification model

Using HuggingFace transformers and datasets.
fine tune it to one of the classification task of the GLUE Benchmark(CoLa to be specific).
Use a checkpoint from the Hub ("distilbert-base-uncased" for example)
Once finished submit a pull request to this repo, make sure to place your .ipynb file in the submissions folder (YOUR_NAME.ipynb)

Useful ressources : text_classification

In this workshop we will be exploring NLP state of the art transformers, with SOTA models like T5 and BERT, then build a model using HugginFace transformers framework.

Related tags

Overview

Transformers are all you need

Table of Content

Note that you can always open the notebooks on Google Colab ( No need to install anything ) you just need a stable internet connection :

2. How to get started

Homework exercise

Owner

Aymen Berriche

Wind Speed Prediction using LSTMs in PyTorch

PyTorch implementation of Microsoft's text-to-speech system FastSpeech 2: Fast and High-Quality End-to-End Text to Speech.

Generating Korean Slogans with phonetic and structural repetition

Edge-Augmented Graph Transformer

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

Words_And_Phrases - Just a repo for useful words and phrases that might come handy in some scenarios. Feel free to add yours

Spert NLP Relation Extraction API deployed with torchserve for inference

Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/

A PyTorch-based model pruning toolkit for pre-trained language models

This program do translate english words to portuguese

SimBERT升级版（SimBERTv2）！

TFIDF-based QA system for AIO2 competition

MiCECo - Misskey Custom Emoji Counter

Code for Discovering Topics in Long-tailed Corpora with Causal Intervention.

Product-Review-Summarizer - Created a product review summarizer which clustered thousands of product reviews and summarized them into a maximum of 500 characters, saving precious time of customers and helping them make a wise buying decision.

CDLA: A Chinese document layout analysis (CDLA) dataset

BeautyNet is an AI powered model which can tell you whether you're beautiful or not.

Google AI 2018 BERT pytorch implementation

A Python/Pytorch app for easily synthesising human voices

This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Combating Embedding Barrier in Multilingual Models for Low-Resource Language Understanding".