CS546 gives a graduate-level introduction to the statistical and machine learning methods used in natural language processing (NLP). We will largely focus on neural approaches this year, but may also cover other kinds of approaches. Prerequisites are a basic understanding of NLP, probability and statistics, linear algebra, and machine learning, as well as solid programming skills. Students will learn to read the current literature and to apply these models to NLP problems. They will be required to do a research project, give class presentations, and write critical reviews of relevant papers.

Required textbook

Goldberg (2017) Neural Network Methods for Natural Language Processing (you can get the PDF for free through the University)


Grading

35% paper presentation
50% research project
10% paper reviews
5% class participation


Schedule

01/16 Introduction: Overview, Policies pdf
01/18 Motivation: What is NLP? Why neural models for NLP? pdf
01/23 More NLP basics pdf
01/25 NO CLASS
01/30 ML basics pdf
02/01 Neural Network training pdf
02/13 Word Embeddings pdf
Pennington et al. (2014) GloVe: Global Vectors for Word Representation, EMNLP. PDF Slides (Aming Ni)
Levy et al. (2015) Improving Distributional Similarity with Lessons Learned from Word Embeddings, TACL(3). PDF Slides (Collin Gress)
02/15 Language models
Elman (1990) Finding Structure in Time PDF Slides (Dominic Seyler)
Graves (2014) Generating Sequences With Recurrent Neural Networks PDF Slides (Yuning Mao)
A. Graves and J. Schmidhuber (2005) Framewise Phoneme Classification with Bidirectional LSTM Networks. PDF Slides (Chase Duncan)
02/20 More on RNNs for NLP
Ling et al. (2015) Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation (EMNLP) PDF Slides (Shibi He)
Greff et al. (2016) LSTM: A search space odyssey (IEEE Transactions on Neural Networks and Learning Systems) PDF Slides (Sidhartha Satapathy)
Chung et al. (2014) Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling PDF Slides (Shibi He)
Deep RNNs PDF Slides (Haroun Habeeb)
02/22 CNNs for NLP
Background reading: Understanding CNNs for NLP
Dauphin et al. (2017) Language Modeling with Gated Convolutional Networks, ICML PDF Slides (Jingfeng Xiao)
Kalchbrenner et al. (2014) A Convolutional Neural Network for Modelling Sentences, ACL PDF Slides (Sameer Manchanda)
Zhang and Wallace (2017) A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification (IJCNLP) PDF Slides (Ruichuan Zhang)
Johnson and Zhang (2015) Effective Use of Word Order for Text Categorization with Convolutional Neural Networks, NAACL. PDF Slides (Yi-Hsin Chen)
02/27 Multitask learning for NLP
Background reading: Multi-task learning for NLP
Collobert et al. (2011) Natural Language Processing (Almost) from Scratch (JMLR) (PDF, Slides (Sarah Schlieferstein and Shruti Bhargava))
Kaiser et al. (2017) One Model to Learn them All (ArXiv) (PDF, Slides (Shruti Bhargava))
Bingel et al. (2017) Identifying beneficial task relations for multi-task learning in deep neural networks (EACL) (PDF, Slides (Litian Ma))
03/01 Parsing
Chris Dyer, Miguel Ballesteros, Wang Ling, Austin Matthews, and Noah A. Smith. Transition-based dependency parsing with stack long short-term memory. ACL 2015 (PDF, Slides (Lavisha Aggarwal))
Eliyahu Kiperwasser and Yoav Goldberg. Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations, TACL 2016 (PDF, Slides (Yaoyang Zhang))
James Cross and Liang Huang. Incremental Parsing with Minimal Features Using Bi-Directional LSTM (PDF, Slides (Haocheng Zhang))
Chris Dyer et al. Recurrent Neural Network Grammars NAACL 2016 (PDF, Slides (Che-Lin Huang))
03/06 More Parsing
Socher et al. Parsing with Compositional Vector Grammars. ACL 2013 (PDF, Slides (Yuncheng Wu))
G. Durrett and D. Klein, Neural CRF Parsing. ACL 2015 (PDF, Slides (Yundi Fei))
Kuncoro et al. What Do Recurrent Neural Network Grammars Learn About Syntax? (PDF, Slides (Triveni Putti))
Stern et al. A Minimal Span-Based Neural Constituency Parser. ACL 2017 (PDF, Slides (Boyin Zhang))
03/08 Semantic Parsing
Buys et al. Robust Incremental Neural Semantic Graph Parsing, ACL 17 (PDF, Slides (Zihan Wang))
Su et al. Cross-domain Semantic Parsing via Paraphrasing, EMNLP 17 (PDF, Slides (Sha Li))
Cheng et al. Learning Structured Natural Language Representations for Semantic Parsing, ACL 17 (PDF, Slides (Rishika Agarwal))
Rabinovich et al. Abstract Syntax Networks for Code Generation and Semantic Parsing, ACL 17 (PDF, Slides (Patrick Crain))
Jia and Liang, Data Recombination for Neural Semantic Parsing, ACL 16 (PDF, Slides (Edward Xue))
03/13 Coreference Resolution
Clark et al. Deep Reinforcement Learning for Mention-Ranking Coreference Models, EMNLP 16 (PDF, Slides (Zubin Pahuja))
Kevin Clark and Christopher D. Manning. 2016. Improving Coreference Resolution by Learning Entity-Level Distributed Representations. ACL 16 (PDF, Slides (Ben Zhou))
Lee et al. End-to-end neural coreference resolution, EMNLP 17. (PDF, Slides (Wenxuan Hu))
03/15 Attention
Sutskever et al. 2014 Sequence to Sequence Learning with Neural Networks (NIPS) (PDF, Slides (Xinyu Zhou))
Bahdanau et al. Neural machine translation by jointly learning to align and translate. ICLR 15 (PDF, Slides (Jing Huang))
Luong et al. Effective Approaches to Attention-based Neural Machine Translation, EMNLP 2015 (PDF, Slides (Yunan Zhang))
Vinyals et al. Grammar as a Foreign Language NIPS 2015 (PDF, Slides (Ved Upadhyay))
Vaswani et al. Attention Is All You Need NIPS 2017 (PDF, Slides (Hsuan-Yu Chen))
03/27 Machine Translation
Johnson et al. (2017) Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation, TACL (PDF, Slides (Kejia Jiang))
Chen et al. (2017) Improved Neural Machine Translation with a Syntax-Aware Encoder and Decoder, ACL. (PDF, Slides (Ziqiao Ding))
Gehring et al. (2017) A Convolutional Encoder Model for Neural Machine Translation, ACL (PDF, Slides (Maghav Kumar))
Belinkov et al. (2017) What do Neural Machine Translation Models Learn about Morphology, ACL (PDF, Slides (Raghav Gurabxani))
Ding et al. (2017) Visualizing and Understanding Neural Machine Translation, ACL (PDF, Slides (Yuchen He))
03/29 Natural Language Generation
Kiddon et al. (2016) Globally Coherent Text Generation with Neural Checklist Models, EMNLP. (PDF, Slides (Webber Lee))
Lebret et al. (2016) Neural Text Generation from Structured Data with Application to the Biography Domain, EMNLP (PDF, Slides (Abhinav Kohar))
Gardent et al. (2017) Creating Training Corpora for NLG Micro-Planners, ACL (PDF, Slides (Omar Elabd))
Dong et al. (2017) Learning to Generate Product Reviews from Attributes, EACL. (PDF, Slides (Yimeng Zhou))
Konstas et al. (2017) Neural AMR: Sequence-to-Sequence Models for Parsing and Generation, ACL (PDF, Slides (Yuan Cheng))
04/03 Discourse
Ji and Smith (2017) Neural Discourse Structure for Text Categorization (ACL) (PDF, Slides (Ji Li))
Qin et al. (2017) Adversarial Connective-exploiting Networks for Implicit Discourse Relation Classification, ACL (PDF, Slides (Shubham Jain))
Rutherford et al. (2017) A Systematic Study of Neural Discourse Models for Implicit Discourse Relation, EACL (PDF, Slides (Dhruv Agarwal))
Lan et al. (2017) Multi-task Attention-based Neural Networks for Implicit Discourse Relationship Representation and Identification, EMNLP. (PDF, Slides (Aidan San))
04/05 Reinforcement Learning in NLP
Mnih et al. Human-level control through deep reinforcement learning (Nature) (PDF, Slides (Guanheng Luo))
He et al. (2016) Deep Reinforcement Learning with a Natural Language Action Space, ACL (PDF, Slides (Victor Ge))
Fang et al. (2017) Learning how to Active Learn: A Deep Reinforcement Learning Approach, EMNLP. (PDF, Slides (Jialin Song))
Nogueira et al. (2017) Task-Oriented Query Reformulation with Reinforcement Learning, EMNLP. (PDF, Slides (Chris Benson))
04/10 Dialog
Su et al. (2016) Active Reward Learning for Policy Optimization in Spoken Dialogue Systems, ACL (PDF, Slides (Juho Kim))
Wen et al. (2017) A Network-based End-to-End Trainable Task-oriented Dialogue System, EACL (PDF, Slides (Qihao Shao))
Eshghi et al. (2017) Bootstrapping incremental dialog systems from minimal data: the generalization power of dialogue grammars (EMNLP) (PDF, Slides (Prashant Jayannavar))
Li et al. (2017) Adversarial Learning for Neural Dialogue Generation, EMNLP (PDF, Slides (Yiren Wang))
Peng et al. (2017) Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning, EMNLP (PDF, Slides (Shan Zhou))
04/12 Multimodal NLP
Xu et al. 2015 Show, Attend and Tell: Neural Image Caption Generation with Visual Attention (ICML) (PDF, Slides (Sai Krishna Bollam))
Antol et al. 2015 VQA: Visual Question Answering (CVPR) (PDF, Slides (Yanan Liu))
Das et al. 2017 Visual Dialog (CVPR) (PDF, Slides (Wei-Chieh Wu))
Suhr et al. 2017 A Corpus of Natural Language for Visual Reasoning (ACL) (PDF, Slides (Tarek Elgamal))
She and Chai 2017 Interactive Learning of Grounded Verb Semantics towards Human-Robot Communication (PDF, Slides (Yuyang Rao))
04/17 Question Answering
Weston et al. 2015 Memory Networks (PDF, Slides (Dongming Lei))
Sukhbaatar et al. 2015 End-to-End Memory Networks (NIPS) (PDF, Slides (Rohan Gupta))
Rajpurkar et al. 2016 SQuAD: 100,000+ Questions for Machine Comprehension of Text (PDF, Slides (Jiaming Shen))
Joshi et al. 2017 TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension, ACL (PDF, Slides (Zhuolun Xiang))
Hao et al. 2017 An End-to-End Model for Question Answering over Knowledge Base with Cross-Attention Combining Global Knowledge, ACL (PDF, Slides (Shivank Mishra))
04/19 Entailment Recognition
Bowman et al. 2015 A large annotated corpus for learning natural language inference, EMNLP (PDF, Slides (Medhini Gulganjalli Narasimhan))
Rocktäschel et al. 2016 Reasoning about Entailment with Neural Attention (ICLR) (PDF, Slides (Gerui Wang))
Yin et al. 2016 ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs (TACL) (PDF, Slides (Ehsan Saleh))
Parikh et al. 2016 A Decomposable Attention Model for Natural Language Inference (EMNLP) (PDF, Slides (Xikun Zhang))
04/24 Reading Comprehension
Chen et al. (2016) A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task, ACL (PDF, Slides (Jianqiu Kong))
Sugawara et al. (2017) Evaluation Metrics for Machine Reading Comprehension: Prerequisite Skills and Readability, ACL (PDF, Slides (Shaima Abdul))
Yadav et al. (2017) Learning and Knowledge Transfer with Memory Networks for Machine Comprehension, EACL (PDF, Slides (Kyo Hyun Kim))
Levy et al. (2017) Zero-Shot Relation Extraction via Reading Comprehension, CoNLL (PDF, Slides (Xiaodong Yu))
Peters et al. (2018) Deep contextualized word representations, NAACL (PDF, Slides (Liyuan Liu))
04/26 Knowledge Graphs
Yaghoobzadeh and Schütze (2017) Multi-level Representations for Fine-Grained Typing of Knowledge Base Entities, EACL (PDF, Slides (Xiaotao Gu))
Liu et al. (2017) Heterogeneous Supervision for Relation Extraction: A Representation Learning Approach, EMNLP (PDF, Slides (Mao-Chuang Yeh))
Verga et al. (2017) Generalizing to Unseen Entities and Entity Pairs with Row-less Universal Schema, EACL (PDF, Slides (Ranran Li))
Das et al. (2017) Chains of reasoning over entities, relations and text (PDF, Slides (Assma Boughoula))
Xiong et al. (2017) DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning, EMNLP (PDF, Slides (Fang Guo))