CS546 is a graduate-level introduction to the statistical and machine learning methods used in natural language processing (NLP). We will focus largely on neural approaches this year, but may also cover other kinds of approaches. Prerequisites are a basic understanding of NLP, probability, statistics, linear algebra, and machine learning, as well as solid programming skills. Students will learn to read the current literature and to apply these models to NLP problems. They will be required to complete a research project, give class presentations, and write critical reviews of relevant papers.

Required textbook

Goldberg (2017) Neural Network Methods for Natural Language Processing (you can get the PDF for free through the University)


Grading

35% paper presentation
50% research project
10% paper reviews
5% class participation


Schedule

01/16 Introduction: Overview, Policies pdf
01/18 Motivation What is NLP? Why neural models for NLP? pdf
01/23 More NLP basics pdf
01/25 NO CLASS pdf
01/30 ML basics pdf
02/01 Neural Network training pdf
02/13 Word Embeddings pdf
Pennington et al. (2014) GloVe: Global Vectors for Word Representation, EMNLP. PDF Slides (Aming Ni)
Levy et al. (2015) Improving Distributional Similarity with Lessons Learned from Word Embeddings, TACL(3). PDF Slides (Collin Gress)
02/15 Language models
Elman (1990) Finding Structure in Time PDF Slides (Dominic Seyler)
Graves (2013) Generating Sequences With Recurrent Neural Networks (ArXiv) PDF Slides (Yuning Mao)
Graves and Schmidhuber (2005) Framewise Phoneme Classification with Bidirectional LSTM Networks. PDF Slides (Chase Duncan)
02/20 More on RNNs for NLP pdf
Ling et al. (2015) Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation (EMNLP) PDF
Greff et al. (2016) LSTM: A Search Space Odyssey (IEEE Transactions on Neural Networks and Learning Systems) PDF Slides (Sidhartha Satapathy)
Chung et al. (2014) Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling (ArXiv) PDF Slides (Shibi He)
Deep RNNs PDF Slides (Haroun Habeeb)
02/22 CNNs for NLP
Understanding CNNs for NLP
Dauphin et al. (2017) Language Modeling with Gated Convolutional Networks, ICML PDF Slides (Jingfeng Xiao)
Kalchbrenner et al. (2014) A Convolutional Neural Network for Modelling Sentences, ACL PDF Slides (Sameer Manchanda)
Zhang and Wallace (2017) A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification (IJCNLP) PDF Slides (Ruichuan Zhang)
Johnson and Zhang (2015) Effective Use of Word Order for Text Categorization with Convolutional Neural Networks, NAACL. PDF Slides (Yi-Hsin Chen)
02/27 Multitask learning for NLP pdf
Collobert et al. (2011) Natural Language Processing (Almost) from Scratch (JMLR) PDF
Kaiser et al. (2017) One Model To Learn Them All (ArXiv) PDF
Bingel et al. (2017) Identifying beneficial task relations for multi-task learning in deep neural networks (EACL) PDF