Awesome-NLP-Research (ANLP)

(Update on 2020-01-10: we have also added the presentations from the Fall 2020 installment of the course. Check for them under "slides2020".)

As part of the Fall 2018 course CPSC 677 "Advanced Natural Language Processing" at Yale, we developed, with the help of the students, a corpus of useful resources for NLP research. Bibliographies and PowerPoint presentations for each topic are listed below, along with several blog posts. We also asked the students to list relevant and prerequisite concepts for each topic; these keywords are found here.

If you have any questions, would like to contribute further to this project, or feel we are missing an important citation, please contact Alex Fabbri at alexander[dot]fabbri[at]yale.[first three letters of education]

Overview of papers presented in class

Capsule Networks for NLP by Will Merrill - BIB BLOG SLIDES
Commonsense Learning by Michihiro Yasunaga - BIB SLIDES
Dialogue Systems by Suyi Li - BIB SLIDES
Multilingual-Word-Embeddings by Davey Proctor - BIB SLIDES
Neural Embeddings by John Brandt - BIB SLIDES
Temporal and Dynamic Embeddings by Yavuz Nuzumlali - BIB SLIDES
NLP in Finance by Gaurav Pathak - BIB SLIDES
Natural Language Generation by Tianwei She - BIB SLIDES
Knowledge Graphs by Tomoe Mizutani - BIB SLIDES
Cross-Lingual Information Retrieval by Rui Zhang - BIB BLOG SLIDES
Neural Information Retrieval by Danny Keller - BIB SLIDES
Character-Level Language Modeling by Angus Fong - BIB SLIDES
Latent Variable Models in NLP by Brian Kitano - BIB SLIDES
Unsupervised Machine Translation by Yongjie Lin - BIB SLIDES
Neural Computational Morphology by Garrett Bingham - BIB SLIDES
Network Methods by Noah Amsel - BIB SLIDES
Neural Semi-Supervised Learning by Alex Fabbri - BIB SLIDES
Question Answering by Talley Amir - BIB SLIDES
Attribute-Level Sentiment Analysis by Ishita Chakraborty and Davey Proctor - BIB BLOG SLIDES
Semantic Parsing by Bo Pang - BIB SLIDES
Sequence2Sequence by Jack Koch - BIB SLIDES
Seq2SQL by Tao Yu - BIB SLIDES
Spectral Learning by Hannah Lawrence - BIB SLIDES
Single Document Summarization by Yi Chern Tan - BIB SLIDES
Transfer Learning by Irene Li - BIB SLIDES

Additionally, students from the class made blog posts on the following topics:

DARTS - BLOG
OpenAI Transformer - BLOG

Owner

Language, Information, and Learning at Yale
