Natural language processing summarizer using three state-of-the-art Transformer models: BERT, GPT2, and T5

Last update: Feb 07, 2022

Overview

NLP-Summarizer

This project set out to explain the current limitations of Natural Language Processing (NLP) models by exploring the Transformer, the current state-of-the-art NLP architecture, and to discuss possible use cases for such tools at home and in the workplace. It gives an in-depth explanation of the architecture, the limitations it aims to address, and how it can be applied to various inference tasks. It also explores numerous NLP use cases and the impact such tools can have on today's society, both domestically and in the workplace. Three Transformer models (BERT, GPT2, and T5) were implemented behind a GUI to evaluate their effectiveness. The final artefact lets a user interact with the models to summarise documents at variable output lengths.

Working Example

The following example was created using another student's project introduction; the original word count was ~1000.

Initial GUI

After Summarization

Getting Started

All code was run using Python version 3.8.8.
Running the artefact in its entirety requires ~20GB of available disk space for downloading the pre-trained models.
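
Before downloading the models, it may help to confirm that enough disk space is free. The following is a minimal sketch using only the Python standard library; it is not part of the original project. The function names are hypothetical, and checking the user's home directory is an assumption (Hugging Face models are usually cached there by default, but the cache location can be changed).

```python
import shutil
from pathlib import Path

REQUIRED_GB = 20  # approximate figure quoted in this README


def free_space_gb(path=None):
    """Return the free disk space at `path` in gigabytes (10**9 bytes)."""
    # Assumption: models are cached under the home directory unless configured otherwise.
    target = Path(path) if path is not None else Path.home()
    return shutil.disk_usage(target).free / 1e9


def has_enough_space(path=None, required_gb=REQUIRED_GB):
    """Return True if at least `required_gb` gigabytes are free at `path`."""
    return free_space_gb(path) >= required_gb


if __name__ == "__main__":
    free = free_space_gb()
    status = "enough" if has_enough_space() else "NOT enough"
    print(f"{free:.1f} GB free: {status} for the ~{REQUIRED_GB} GB of model downloads")
```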

!pip install transformers
!pip install spacy==2.0.12
!pip install torch
!pip install tk

The runtime of each summarisation is displayed as output in the console.
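
As a rough illustration of how a summary of variable length could be produced and timed, here is a minimal sketch. It is not the project's actual code. The helper names `timed` and `summarize_t5` are hypothetical, the `t5-small` checkpoint and the default length limits are assumptions chosen to keep the download small, and it relies on the `summarization` pipeline that the `transformers` library provided at the time of this project.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn, print the elapsed wall-clock time, and return (result, seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"Runtime: {elapsed:.2f} s")
    return result, elapsed


def summarize_t5(text, max_length=150, min_length=40, model_name="t5-small"):
    """Hypothetical helper: summarise `text` with a T5 checkpoint.

    max_length and min_length bound the summary length in tokens,
    which is one way to offer variable output lengths.
    """
    # Imported lazily so that `timed` can be used without transformers installed.
    # The first call downloads the pre-trained model.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_name)
    output = summarizer(
        text, max_length=max_length, min_length=min_length, do_sample=False
    )
    return output[0]["summary_text"]
```

With `transformers` and `torch` installed, a call such as `timed(summarize_t5, document, max_length=60, min_length=20)` would print the runtime to the console and return the summary together with the elapsed seconds. The same wrapper could be reused for the BERT and GPT2 models, whatever summarisation function they are given.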

Owner

Samuel Sharkey
