CS546 Machine Learning in NLP (Spring 2018)

Tue/Thu 3:30 PM – 4:45 PM, 0216 Siebel Center

Julia Hockenmaier (TAs: Chris Cervantes, Anjali Narayan-Chen)


CS546 is a graduate-level introduction to the statistical and machine learning methods used in natural language processing. This year we will focus largely on neural approaches, but may also cover other kinds of models. Prerequisites are a basic understanding of NLP, probability, statistics, linear algebra, and machine learning, as well as solid programming skills. Students will learn to read the current literature and to apply these models to NLP problems. They will be required to complete a research project, give class presentations, and write critical reviews of relevant papers.

Goldberg (2017) Neural Network Methods for Natural Language Processing (the PDF is available for free through the University)

35% paper presentation

50% research project

10% paper reviews

5% class participation

01/16 | Introduction | Overview, Policies

01/18 | Motivation | What is NLP? Why neural models for NLP?

01/23 | More NLP basics

01/25 | NO CLASS TODAY

01/30 | ML basics

02/01 | Neural Network training

02/13 | Word Embeddings

Pennington et al. (2014) GloVe: Global Vectors for Word Representation, EMNLP. PDF Slides (Aming Ni)

Levy et al. (2015) Improving Distributional Similarity with Lessons Learned from Word Embeddings, TACL(3). PDF Slides (Collin Gress)
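As a toy illustration (not from the course materials) of the distributional-similarity idea behind these papers: related words should have nearby vectors, which is typically measured with cosine similarity. The three-dimensional "embeddings" below are made-up values for the sketch.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional "embeddings" (invented for illustration only).
emb = {
    "king":  [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

# Semantically related words should score higher than unrelated ones.
print(cosine(emb["king"], emb["queen"]) > cosine(emb["king"], emb["apple"]))  # True
```

Real embedding models learn such vectors from co-occurrence statistics over large corpora; this sketch only shows how the resulting vectors are compared.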

02/15 | Language models

Elman (1990) Finding Structure in Time PDF Slides (Dominic Seyler)

Graves (2014) Generating Sequences With Recurrent Neural Networks PDF Slides (Yuning Mao)

A. Graves and J. Schmidhuber (2005) Framewise Phoneme Classification with Bidirectional LSTM Networks. PDF Slides (Chase Duncan)

02/20 | More on RNNs for NLP

Reading:

Ling et al. (2015) Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation (EMNLP) PDF

Greff et al. (2016) LSTM: A search space odyssey (IEEE Transactions on Neural Networks and Learning Systems) PDF Slides (Sidhartha Satapathy)

Chung et al. (2014) Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling PDF Slides (Shibi He)

Deep RNNs PDF Slides (Haroun Habeeb)
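As a minimal sketch of the recurrence these readings build on (an Elman-style simple RNN, with arbitrary illustrative weights rather than anything from the papers): the hidden state is updated as h_t = tanh(W_xh x_t + W_hh h_{t-1}).

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def rnn_step(x_t, h_prev, W_xh, W_hh):
    """One Elman RNN step: h_t = tanh(W_xh @ x_t + W_hh @ h_prev)."""
    pre = [a + b for a, b in zip(matvec(W_xh, x_t), matvec(W_hh, h_prev))]
    return [math.tanh(p) for p in pre]

# Arbitrary 2-dim input and hidden state (illustration only).
W_xh = [[0.5, -0.3], [0.2, 0.1]]
W_hh = [[0.1, 0.4], [-0.2, 0.3]]

h = [0.0, 0.0]
for x in [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]:  # toy input sequence
    h = rnn_step(x, h, W_xh, W_hh)
print(h)
```

LSTMs and GRUs (the subject of the Greff et al. and Chung et al. papers) replace this single tanh update with gated updates that make gradients easier to propagate over long sequences.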

02/22 | CNNs for NLP

Understanding CNNs for NLP

Dauphin et al. (2017) Language Modeling with Gated Convolutional Networks, ICML PDF Slides (Jingfeng Xiao)

Kalchbrenner et al. (2014) A Convolutional Neural Network for Modelling Sentences, ACL PDF Slides (Sameer Manchanda)

Zhang and Wallace (2017) A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification (IJCNLP) PDF Slides (Ruichuan Zhang)

Johnson and Zhang (2015) Effective Use of Word Order for Text Categorization with Convolutional Neural Networks, NAACL. PDF Slides (Yi-Hsin Chen)
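To make the convolution-over-text idea in these readings concrete (a toy sketch with made-up vectors, not code from any of the papers): a filter of width n slides over adjacent word vectors, and max-pooling collapses the variable-length sequence into a fixed-size feature.

```python
def conv1d_max(seq, filt):
    """Width-n 1D convolution over a list of word vectors, then max-pool.

    seq:  list of d-dim word vectors (the sentence)
    filt: list of n d-dim weight vectors (one filter of width n)
    Returns the max-pooled filter response (a single float).
    """
    n = len(filt)
    responses = []
    for i in range(len(seq) - n + 1):
        window = seq[i:i + n]
        # Dot product of the filter with the n-word window.
        responses.append(sum(w * x
                             for wvec, xvec in zip(filt, window)
                             for w, x in zip(wvec, xvec)))
    return max(responses)

# Toy 2-dim embeddings for a 4-word sentence (illustration only).
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
filt = [[1.0, 0.0], [0.0, 1.0]]  # one width-2 filter

print(conv1d_max(sentence, filt))  # 2.0
```

Sentence-classification CNNs such as those analyzed by Zhang and Wallace use many such filters of several widths, followed by a nonlinearity and a classifier over the pooled features.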

02/27 | Multitask learning for NLP

Reading:

Collobert et al. (2011) Natural Language Processing (Almost) from Scratch (JMLR) PDF

Kaiser et al. (2017) One Model to Learn Them All (ArXiv) PDF

Bingel et al. (2017) Identifying beneficial task relations for multi-task learning in deep neural networks (EACL) PDF