These are resources from the class.
Slides [Updated for 2020]
Note: Slides are provided under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 license (CC BY-NC-SA 4.0). Comments on the slides are encouraged so that they can be improved.
- Introduction to NLP for Data Scientists [Google Drive - Comments Open]
- Modern NLP Approaches [Google Drive - Comments Open]
- Data Science + NLP Use-Cases [Google Drive - Comments Open]
Presentations/Readings from other Researchers
- “How to do good research, get it published in SIGKDD and get it cited!”, Eamonn Keogh, SIGKDD 2009 Tutorial. URL
- Heuristics for Scientific Writing (a Machine Learning Perspective) - Zachary C. Lipton URL
- Developing Language Annotation for Machine Learning Algorithms - Marie Meteer URL
Tools
Datasets
COVID-19
Local Language NLP Tasks
Misinformation/Disinformation
- Credibility Corpus with several datasets (Twitter, Web database) in French and English
- Fake News Challenge
- Fake News
- Hyperpartisan News Detection
- RumourEval
Hate Speech
- Multilingual detection of hate speech against immigrants and women in Twitter (hatEval)
- OffensEval: Identifying and Categorizing Offensive Language in Social Media
- HateSpeech