Course materials for a 3-day seminar "Machine Learning and NLP: Advances and Applications" at New College of Florida

Overview

Machine Learning and NLP: Advances and Applications

This repository hosts the course materials used for a 3-day seminar "Machine Learning and NLP: Advances and Applications" as part of Independent Study Period 2020 at New College of Florida.

Note that the seminar was held in Jan 2020, and the content may be a little bit oudated (as of Feb 2022). Please also refer to a Fall 2021 full semester course "CIS6930 Topics in Computing for Data Science", which covers much wider (and a little bit newer) Deep Learning topics.

Syllabus

Course Description

This 3-day course provides students with an opportunity to learn Machine Learning and Natural Language Processing (NLP) from basics to applications. The course covers some state-of-the-art NLP techniques including Deep Learning. Each day consists of a lecture and a hands-on session to help students learn how to apply those techniques to real-world applications. During the hands-on session, students will be given assignments to develop programming code in Python. Three days are too short to fully understand the concepts that are covered by the course and learn to apply those techniques to actual problems. Students are strongly encouraged to complete reading assignments before the lecture to be ready for the course assignments, and bring a lot of questions to the course. :)

Learning Objectives

Students successfully completing the course will

  • demonstrate the ability to apply machine learning and natural language processing techniques to various types of problems.
  • demonstrate the ability to build their own machine learning models using Python libraries.
  • demonstrate the ability to read and understand research papers in ML and NLP.

Course Outline

  • Wed 1/22 Day 1: Machine Learning basics [Slides]

    • Machine learning examples
    • Problem formulation
    • Evaluation and hyper-parameter tuning
    • Data Processing basics with pandas
    • Machine Learning with scikit-learn
    • Hands-on material: [ipynb] Open In Colab
  • Thu 1/23 Day 2: NLP basics [Slides]

    • Unsupervised learning and visualization
    • Topic models
    • NLP basics with SpaCy and NLTK
    • Understanding NLP pipeline for feature extraction
    • Machine learning for NLP tasks (text classification, sequential tagging)
    • Hands-on material [ipynb] Open In Colab
    • Follow-up
      • Commonsense Reasoning (Winograd Schema Challenge)
  • Fri 1/24 Day 3: Advanced techniques and applications [Slides]

    • Basic Deep Learning techniques
    • Word embeddings
    • Advanced Deep Learning techniques for NLP
    • Problem formulation and applications to (non-)NLP tasks
    • Pre-training models: ELMo and BERT
    • Hands-on material: [ipynb] Open In Colab
    • Follow-up
      • The Illustrated Transformer – Jay Alammar – Visualizing machine learning one concept at a time
      • Cross-lingual word/sentence embeddings

Reading Assignments & Recommendations:

The following online tutorials for students who are not familiar with the Python libraries used in the course. Each day will have a hands-on session that requires those libraries. Please do not expect to have enough time to learn how to use those libraries during the lecture.

The following list is a good starting point.

The course will cover the following papers as examples of (non-NLP) applications (probably in Day 3.) Students who'd like to learn how to apply Deep Learning techniques to your own problems are encouraged to read the following papers.

  • [1] A. Asai, S. Evensen, B. Golshan, A. Halevy, V. Li, A. Lopatenko, D. Stepanov, Y. Suhara, W.-C. Tan, Y. Xu, "HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments" Proc LREC 18, 2018. [Paper] [Dataset]
  • [2] S. Evensen, Y. Suhara, A. Halevy, V. Li, W.-C. Tan, S. Mumick, "Happiness Entailment: Automating Suggestions for Well-Being," Proc. ACII 2019, 2019. [Paper]
  • [3] Y. Suhara, Y. Xu, A. Pentland, "DeepMood: Forecasting Depressed Mood Based on Self-Reported Histories via Recurrent Neural Networks," Proc. WWW '17, 2017. [Paper]
  • [4] N. Bhutani, Y. Suhara, W.-C. Tan, A. Halevy, H. V. Jagadish, "Open Information Extraction from Question-Answer Pairs," Proc. NAACL-HLT 2019, 2019. [Paper]

Computing Resources:

The course requires students to write code:

  • Students are expected to have a personal computer at their disposal. Students should have a Python interpreter and the listed libraries installed on their machines.

The hands-on sessions will require the following Python libraries. Please install those libraries on your computer prior to the course. See also the reading assignment section for the recommended tutorials.

  • pandas
  • scikit-learn
  • gensim
  • spacy
  • nltk
  • torch (PyTorch)
Owner
Yoshi Suhara
Yoshi Suhara
A working roblox account generator it doesnt bypass the capcha stuff cuz these didnt showed up in my test runs

A working roblox account generator (state 11.5.2021) it doesnt bypass the capcha stuff cuz these didnt showed up in my test runs

TerrificTable 22 Jan 03, 2023
Blender addon, import and update mixamo animation

This is a blender addon for import and update mixamo animations.

ywaby 7 Apr 19, 2022
An animal facts python module

An animal facts python module

Fayas Noushad 3 Dec 19, 2021
The Python Achievements Framework!

Pychievements: The Python Achievements Framework! Pychievements is a framework for creating and tracking achievements within a Python application. It

Brian 114 Jul 21, 2022
1000+ ready code templates to kickstart your next AI experiment

AI Seed Projects Start with ready code for your next AI experiment. Choose from 1000+ code templates, across a wide variety of use cases. All examples

BlobCity, Inc 98 Jan 03, 2023
Automated rop chain generation

This is the accompanying code to the blog post talking about automated rop chain generation. Build the test file with: make Install the dependencies:

Christopher Roberts 14 Nov 22, 2022
Notifies server owners of mod updates, also notifies of player deaths and player joins through Discord.

ProjectZomboid-ServerAssistant Notifies server owners of mod updates, also notifies of player deaths and player joins through Discord. A Python based

3 Sep 30, 2022
Pdraw - Generate Deterministic, Procedural Artwork from Arbitrary Text

pdraw.py: Generate Deterministic, Procedural Artwork from Arbitrary Text pdraw a

Brian Schrader 2 Sep 12, 2022
Procedural 3D data generation pipeline for architecture

Synthetic Dataset Generator Authors: Stanislava Fedorova Alberto Tono Meher Shashwat Nigam Jiayao Zhang Amirhossein Ahmadnia Cecilia bolognesi Dominik

Computational Design Institute 49 Nov 25, 2022
Customisable coding font with alternates, ligatures and contextual positioning

Guide Ligature Support Links Log License Guide Live Preview + Download larsenwork.com/monoid Install Quit your editor/program. Unzip and open the fold

Andreas Larsen 7.6k Dec 30, 2022
Senator Stock Trading Tester

Senator Stock Trading Tester Program to compare stock performance of Senator's transactions vs when the sale is disclosed. Using to find if tracking S

Cole Cestaro 1 Dec 07, 2021
Aplicação que envia regularmente um email ao utilizador com todos os filmes disponíveis no cartaz dos cinemas Nos.

Cartaz-Cinemas-Nos Aplicação que envia regularmente uma notificação ao utilizador com todos os filmes disponíveis no cartaz dos cinemas Nos. Só funcio

Cavalex 1 Jan 09, 2022
A tool to allow New World players to calculate the best place to put their Attribute Points for their build and level

New World Damage Simulator A tool designed to take a characters base stats including armor and weapons, level, and base damage of their items (slash d

Joseph P Langford 31 Nov 01, 2022
Example platform plugin that fixes fentry calls in Binja

Example Binja Platform Plugin This is an example Binja platform plugin which fixes up linux kernel module calls to __fentry__. __fentry__ is the linux

_yrp 2 Oct 07, 2021
Demo of a WAM Prolog implementation in Python

Prol: WAM demo This is a simplified Warren Abstract Machine (WAM) implementation for Prolog, that showcases the main instructions, compiling, register

Bruno Kim Medeiros Cesar 62 Dec 26, 2022
Ronin - Create Fud Meterpreter Payload To Hack Windows 11

Ronin - Create Fud Meterpreter Payload To Hack Windows 11

Dj4w3d H4mm4di 6 May 09, 2022
Deis v1, the CoreOS and Docker PaaS: Your PaaS. Your Rules.

This repository (deis/deis) is no longer developed or maintained. The Deis v1 PaaS based on CoreOS Container Linux and Fleet has been replaced by Deis

Deis 6.1k Jan 04, 2023
Бэкапалка таблиц mysql 8 через брокер сообщений nats

nats-mysql-tables-backup Бэкап таблиц mysql 8 через брокер сообщений nats (проверено и работает в ubuntu 20.04, при наличии python 3.8) ПРИМЕРЫ: Ниже

Constantine 1 Dec 13, 2021
A password genarator/manager for passwords uesing a pseudorandom number genarator

pseudorandom-password-genarator a password genarator/manager for passwords uesing a pseudorandom number genarator when you give the program a word eg

1 Nov 18, 2021
Linux Backlight Manager

Is a program to manage your laptop keyboard backlights in linux. Tested on Tuxedo / Clevo / Monste models. Must be tested on other devices

Arshia Ihammi 4 Jan 14, 2022