nlp-tutorial

nlp-tutorial is a tutorial for anyone studying NLP (Natural Language Processing) with PyTorch. Most of the NLP models here are implemented in fewer than 100 lines of code (excluding comments and blank lines).

[08-14-2020] The old TensorFlow v1 code is archived in the archive folder. For beginner readability, only PyTorch version 1.0 or higher is supported.

Curriculum - (Example Purpose)

1. Basic Embedding Model

1-1. NNLM(Neural Network Language Model) - Predict Next Word
- Paper - A Neural Probabilistic Language Model(2003)
- Colab - NNLM.ipynb
1-2. Word2Vec(Skip-gram) - Embedding Words and Show Graph
- Paper - Distributed Representations of Words and Phrases and their Compositionality(2013)
- Colab - Word2Vec.ipynb
1-3. FastText(Application Level) - Sentence Classification
- Paper - Bag of Tricks for Efficient Text Classification(2016)
- Colab - FastText.ipynb
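
As a rough illustration of the skip-gram idea in 1-2, the sketch below builds (center, context) training pairs from a toy corpus. It is dependency-free Python, not the notebook's code: the function name `skipgram_pairs`, the two example sentences, and the window size of 1 are all assumptions made for illustration.

```python
def skipgram_pairs(sentences, window=1):
    """Return (center, context) word-index pairs and the word -> index vocabulary."""
    # Build the vocabulary in order of first appearance.
    vocab = {}
    for sentence in sentences:
        for word in sentence.split():
            vocab.setdefault(word, len(vocab))

    pairs = []
    for sentence in sentences:
        ids = [vocab[w] for w in sentence.split()]
        for i, center in enumerate(ids):
            # Context = every word within `window` positions of the center word.
            lo = max(0, i - window)
            hi = min(len(ids), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((center, ids[j]))
    return pairs, vocab


# Hypothetical toy corpus, chosen only to keep the output small.
pairs, vocab = skipgram_pairs(["apple banana fruit", "dog cat animal"], window=1)
print(pairs[:4])  # [(0, 1), (1, 0), (1, 2), (2, 1)]
```

In a skip-gram model, each pair would typically become one training example: the center word index is the input and the context word index is the prediction target.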

2. CNN(Convolutional Neural Network)

2-1. TextCNN - Binary Sentiment Classification
- Paper - Convolutional Neural Networks for Sentence Classification(2014)
- TextCNN.ipynb

3. RNN(Recurrent Neural Network)

3-1. TextRNN - Predict Next Step
- Paper - Finding Structure in Time(1990)
- Colab - TextRNN.ipynb
3-2. TextLSTM - Autocomplete
- Paper - Long Short-Term Memory(1997)
- Colab - TextLSTM.ipynb
3-3. Bi-LSTM - Predict Next Word in Long Sentence
- Colab - Bi_LSTM.ipynb

4. Attention Mechanism

4-1. Seq2Seq - Change Word
- Paper - Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation(2014)
- Colab - Seq2Seq.ipynb
4-2. Seq2Seq with Attention - Translate
- Paper - Neural Machine Translation by Jointly Learning to Align and Translate(2014)
- Colab - Seq2Seq(Attention).ipynb
4-3. Bi-LSTM with Attention - Binary Sentiment Classification
- Colab - Bi_LSTM(Attention).ipynb
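
The attention step shared by 4-2 and 4-3 can be sketched in a few lines of plain Python. This is a minimal sketch assuming dot-product scoring, which is one common choice; the notebooks may score differently, and the names `attend`, `decoder_state`, and `encoder_states` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(decoder_state, encoder_states):
    """Return attention weights and the weighted sum of encoder states."""
    # Score each encoder state against the current decoder state
    # (dot-product scoring is an assumption of this sketch).
    scores = [dot(decoder_state, h) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Context vector: encoder states averaged with the attention weights.
    context = [
        sum(w * h[k] for w, h in zip(weights, encoder_states))
        for k in range(dim)
    ]
    return weights, context


# Toy 2-dimensional states, invented for illustration.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([1.0, 1.0], encoder_states)
print(weights, context)
```

The third encoder state has the largest dot product with the decoder state, so it receives the largest weight and dominates the context vector.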

5. Model based on Transformer

5-1. The Transformer - Translate
- Paper - Attention Is All You Need(2017)
- Colab - Transformer.ipynb, Transformer(Greedy_decoder).ipynb
5-2. BERT - Classify Next Sentence & Predict Masked Tokens
- Paper - BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding(2018)
- Colab - BERT.ipynb
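
The "Predict Masked Tokens" task in 5-2 relies on corrupting the input before training. The sketch below follows the recipe described in the BERT paper: about 15% of positions are selected, and of those 80% become `[MASK]`, 10% become a random token, and 10% stay unchanged. The function name `mask_tokens`, the toy sentence, the fixed seed, and the rule of masking at least one token are assumptions of this sketch, not details taken from the notebook.

```python
import random


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Return the corrupted token list and a {position: original token} label map."""
    rng = random.Random(seed)
    # Special tokens are never selected for masking.
    candidates = [i for i, t in enumerate(tokens) if t not in ("[CLS]", "[SEP]")]
    # Assumption: always mask at least one token so short inputs still train.
    n_mask = max(1, round(len(candidates) * mask_prob))
    positions = sorted(rng.sample(candidates, n_mask))

    masked = list(tokens)
    labels = {}
    for pos in positions:
        labels[pos] = tokens[pos]  # the model must recover this token
        roll = rng.random()
        if roll < 0.8:
            masked[pos] = "[MASK]"           # 80%: replace with the mask token
        elif roll < 0.9:
            masked[pos] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: leave the token unchanged
    return masked, labels


# Hypothetical toy input.
tokens = ["[CLS]", "the", "cat", "sat", "on", "the", "mat", "[SEP]"]
vocab = ["the", "cat", "sat", "on", "mat"]
masked, labels = mask_tokens(tokens, vocab)
print(masked, labels)
```

Only the positions recorded in `labels` would contribute to the masked-token loss; every other position is passed through untouched.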

Dependencies

Python 3.5+
PyTorch 1.0.0+

Author

Tae Hwan Jung(Jeff Jung) @graykode
Author Email : [email protected]
Acknowledgements to mojitok for the NLP Research Internship.
