A repo of open resources & information to help people succeed in a PhD in CS & a career in AI / NLP

Overview

Resources to Help Global Equality for PhDs in NLP / AI

This repo originates with a wish to promote Global Equality for people who want to do a PhD in NLP, following the idea that mentorship programs are an effective way to fight against segregation, according to The Human Networks (Jackson, 2019). Specifically, we hope that people from all over the world and with all types of backgrounds can share the same source of information, so that success will be a reward to those who are determined and hardworking, regardless of external constraints.

One non-negligible reason for success is access to information, such as (1) knowing what a PhD in NLP is like, (2) knowing what top grad schools look for when reviewing PhD applications, (3) broadening your horizon of what is good work, (4) knowing what careers in NLP in both academia and industry are like, and many others.

Contributor: Zhijing Jin (PhD student in NLP at Max Planck Institute, co-organizer of the ACL Year-Round Mentorship Program).

You are welcome to be a collaborator: open an issue or pull request, and I can add you :).

Endorsers of this repo: Prof Rada Mihalcea (University of Michigan). Please add your name here (by a pull request) if you endorse this repo :).

Contents (Actively Updating)

Top Resources

  1. Online ACL Year-Round Mentorship Program: https://acl-mentorship.github.io (You can apply as a mentee, as a mentor, or as a volunteer. As a mentee, you will be able to attend monthly Zoom Q&A sessions hosted by senior researchers in NLP. You will also join a global Slack channel, where you can post your questions at any time, and we will collect answers from senior NLP researchers.)

Stage 1. (Non-PhD -> PhD) How to Apply for a PhD?

  1. (Prof Philip [email protected]) Finding CS Ph.D. programs to apply to. [Video]

  2. (Prof Mor Harchol-Balter@CMU) Applying to Ph.D. Programs in Computer Science (2014). [Guide]

  3. (Prof Jason [email protected]) Advice for Research Students (last updated: 2021). [List of suggestions]

  4. (CS Rankings) Advice on Applying to Grad School in Computer Science. [Pointers]

  5. (Nelson Liu, [email protected]) Student Perspectives on Applying to NLP PhD Programs (2019). [Suggestions Based on Surveys]

  6. A Princeton CS Major's Guide to Applying to Graduate School. [List of suggestions]

  7. (John Hewitt, [email protected]) Undergrad to PhD, or not - advice for undergrads interested in research (2018). [Suggestions]

  8. (Kalpesh Krishna, [email protected] Amherst) Grad School Resources (2018). [Article] (This lists lots of useful pointers!)

  9. (Prof Scott E. [email protected]) Quora answers on the LTI program at CMU (2017). [Article]

  10. (Albert Webson et al., [email protected] University) Resources for Underrepresented Groups, including Brown's Own Applicant Mentorship Program (2020, but we will keep updating it throughout the 2021 application season.) [List of Resources]

Specific Suggestions

  1. (Prof Nathan [email protected] University) Inside Ph.D. admissions: What readers look for in a Statement of Purpose. [Article]

Improve Your Proficiency with Tools

  1. (MIT 2020) The Missing Semester of Your CS Education (e.g., master the command-line, ssh into remote machines, use fancy features of version control systems).

Stage 2. (Doing PhD) How to Succeed in a PhD?

  1. (Maxwell Forbes, [email protected]) Every PhD Is Different. [Suggestions]

  2. (Prof Mark [email protected], Prof Hanna M. [email protected] Amherst) How to be a successful PhD student (in computer science (in NLP/ML)). [Suggestions]

  3. (Andrej Karpathy) A Survival Guide to a PhD (2016). [Suggestions]

  4. (Prof Kevin [email protected]) Kevin Gimpel's Advice to PhD Students. [Suggestions]

  5. (Prof Marie [email protected] University) How to Succeed in Graduate School: A Guide for Students and Advisors (1994). [Article] [Part II]

  6. (Prof Eric [email protected]) Syllabus for Eric’s PhD students (incl. Prof's expectation for PhD students). [syllabus]

  7. (Prof H.T. [email protected]) Useful Thoughts about Research (1987). [Suggestions]

  8. (Prof Phil [email protected]) Networking on the Network: A Guide to Professional Skills for PhD Students (last updated: 2015). [Suggestions]

  9. (Prof Stephen C. [email protected]) Some Modest Advice for Graduate Students. [Article]

  10. (Prof Tao [email protected]) Graduate Student Survival/Success Guide. [Slides]

  11. (Mu [email protected]) 博士这五年 (A Chinese article about five years in PhD at CMU). [Article]

  12. (Karl Stratos) A Note to a Prospective Student. [Suggestions]

What Are Weekly Meetings with Advisors Like?

  1. (Prof Jason [email protected]) What do PhD students talk about in their once-a-week meetings with their advisers during their first year? (2015). [Article]

  2. (Brown University) Guide to Meetings with Your Advisor. [Suggestions]

Practical Guides

  1. (Prof Srinivasan [email protected]) How to Read a Paper (2007). [Suggestions]

  2. (Prof Jason [email protected]) How to Read a Technical Paper (2009). [Suggestions]

  3. (Prof Jason [email protected]) How to write a paper? (2010). [Suggestions]

Memoir-Like Narratives

  1. (Prof Philip [email protected]) The Ph.D. Grind: A Ph.D. Student Memoir (last updated: 2015). [Video] (For the book, you have to dig deeply, and then you will find the book.)

  2. (Prof Tianqi [email protected]) 陈天奇:机器学习科研的十年 (2019) (A Chinese article about ten years of research in ML). [Article]

  3. (Jean Yang) What My PhD Was Like. [Article]

How to Excel in Your Research

  1. The most important step: (Prof Jason [email protected]) How to Find Research Problems (1997). [Suggestions]

Grad School Fellowships

  1. (List compiled by CMU) Graduate Fellowship Opportunities [link]
  2. CYD Fellowship for Grad Students in Switzerland [link]

Other Books

  1. The Craft of Research by Wayne Booth, Greg Colomb and Joseph Williams.

  2. How to Write a Better Thesis by Paul Gruba and David Evans.

  3. Helping Doctoral Students to Write by Barbara Kamler and Pat Thomson.

  4. The Unwritten Rules of PhD Research by Marian Petre and Gordon Rugg.

Stage 3. (After PhD -> Industry) How is life as an industry researcher?

  1. (Mu [email protected]) 工作五年反思 (A Chinese article reflecting on five years of working in industry). [Article]

Stage 4. (Being a Prof) How to get an academic position? And how to be a good prof?

  1. (Prof Jason [email protected]) How to write an academic research statement (when applying for a faculty job) (2017). [Article]

  2. (Prof Jason [email protected]) How to Give a Talk (2015). [Suggestions]

  3. (Prof Jason [email protected]) Teaching Philosophy. [Article]

Stage 5. (Whole Career Path) How to live out a lifelong career as an NLP researcher?

  1. (Prof Charles [email protected] University, Prof Qiang [email protected]) Crafting Your Research Future: A Guide to Successful Master's and Ph.D. Degrees in Science & Engineering. [Book]

Further Readings: Technical Materials to Improve Your NLP Research Skills

  1. (Prof Jason [email protected]) Technical Tutorials, Notes, and Suggested Reading (last updated: 2018) [Reading list]

Contributions

All types of contributions to this resource list are welcome. Feel free to open a Pull Request.

Contact: Zhijing Jin, PhD in NLP at Max Planck Institute for Intelligent Systems, working on NLP & Causality.

How to Cite This Repo

@misc{resources2021jin,
  author = {Zhijing Jin},
  title = {Resources to Help Global Equality for PhDs in NLP},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/zhijing-jin/nlp-phd-global-equality}}
}
Owner
PhD in NLP & Causality. Affiliated with Max Planck Institute, Germany & ETH & UMich. Supervised by Bernhard Schoelkopf, Rada Mihalcea, and Mrinmaya Sachan.