Deep Learning

CS 547 / IE 534, Fall 2019

Instructor: Justin Sirignano
Teaching Assistants: Yuanyi Zhong, Xiaobo Dong, Lei Fan, Rachneet Kaur, Jyoti Aneja, Peijun Xiao
 

What is Deep Learning?

Deep learning has revolutionized image recognition, speech recognition, and natural language processing. There's also growing interest in applying deep learning to science, engineering, medicine, and finance.

At a high level, deep neural networks are stacks of nonlinear operations, typically with millions of parameters. This produces a highly flexible and powerful model that has proved effective in many applications. The design of network architectures and optimization methods has been the focus of intense research.
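The "stack of nonlinear operations" view can be made concrete with a short sketch. The following NumPy snippet (illustrative only; the sizes and initialization scale are arbitrary choices, not from the course) builds a network as a repeated pattern of affine map followed by nonlinearity:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Elementwise nonlinearity applied between layers.
    return np.maximum(x, 0.0)

def forward(x, params):
    # A deep network is a repeated pattern: affine map, then nonlinearity.
    h = x
    for W, b in params[:-1]:
        h = relu(h @ W + b)
    W, b = params[-1]
    return h @ W + b  # final layer left linear (e.g., class logits)

# Three stacked layers: 4 -> 8 -> 8 -> 3
sizes = [4, 8, 8, 3]
params = [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

logits = forward(rng.standard_normal((2, 4)), params)
print(logits.shape)  # (2, 3): one row of logits per input
```

Adding layers to `sizes` deepens the stack; the parameters of every layer are trained jointly, which is what makes the composed model flexible.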

Course overview

Topics include convolutional neural networks, recurrent neural networks, and deep reinforcement learning. Homework assignments cover image classification, video recognition, and deep reinforcement learning, with models trained using TensorFlow and PyTorch. A large amount of GPU time is provided to the class. See the Syllabus for more details.

Mathematical analysis of neural networks, reinforcement learning, and stochastic gradient descent algorithms will also be covered in lectures.

For the times and locations of the Teaching Assistants' office hours, see the Piazza website.

IE 534 Deep Learning is cross-listed with CS 547.

This course is part of the Deep Learning sequence:

  • IE 398 Deep Learning (undergraduate version)
  • IE 534 Deep Learning
  • IE 598 Deep Learning II
Computational resources

A large amount of GPU time is provided to the class: 100,000 hours. Graphics processing units (GPUs) can massively parallelize the training of deep learning models. This is a unique opportunity for students to develop sophisticated deep learning models at large scale.

Code

Extensive TensorFlow and PyTorch code is provided to students.

Datasets, Code, and Notes

MNIST Dataset

CIFAR10 Dataset

Introduction to running jobs on Blue Waters

Blue Waters notes for Fall 2019

Lecture notes for Blue Waters and PyTorch

Lecture notes for ResNet and Distributed Training

Lecture notes for NLP

Blue Waters Help Document for the Class

Recommended articles on deep learning

PyTorch Class Tutorial

PyTorch Website

Course Notes for Weeks 1 & 2

List of Final Projects

Lecture Slides: Lecture 1, Lecture 2-3, Lecture 4-5, Lecture 6, Lecture 8, Lecture 10, GAN Lecture Slides, Lecture 11, Code for Distributed Training, Lecture 12, Deep Learning Image Ranking Lecture, Action Recognition Lecture, Additional Lecture Slides on Q-learning Algorithm

Homeworks

  • HW1: Implement and train a neural network from scratch in Python for the MNIST dataset (no PyTorch). The neural network should be trained on the Training Set using stochastic gradient descent. It should achieve 97-98% accuracy on the Test Set. For full credit, submit via Compass (1) the code and (2) a paragraph (in a PDF document) which states the Test Accuracy and briefly describes the implementation. Due September 6 at 5:00 PM.
  • HW2: Implement and train a convolutional neural network from scratch in Python for the MNIST dataset (no PyTorch). You should write your own code for the convolutions (e.g., do not use SciPy's convolution function). The convolutional network should have a single hidden layer with multiple channels. It should achieve at least 94% accuracy on the Test Set. For full credit, submit via Compass (1) the code and (2) a paragraph (in a PDF document) which states the Test Accuracy and briefly describes the implementation. Due September 18 at 5:00 PM.
  • HW3: Train a deep convolutional network on a GPU with PyTorch for the CIFAR10 dataset. The network should (A) use dropout, (B) be trained with RMSprop or Adam, and (C) use data augmentation. For 10% extra credit, compare dropout test accuracy (i) using the heuristic prediction rule and (ii) using Monte Carlo simulation. For full credit, the model should achieve 80-90% Test Accuracy. Submit via Compass (1) the code and (2) a paragraph (in a PDF document) which reports the results and briefly describes the model architecture. Due Tuesday, October 1 at 5:00 PM. Homework #3 Solutions.
  • HW4: Implement a deep residual neural network for CIFAR100. Homework #4 Details. Due October 15 at 5:00 PM. Homework #4 Solutions.
  • HW5: Natural Language Processing A. Parts I and II of the NLP assignment. Due October 23 at 5:00 PM. Homework #5 and #6 Solutions.
  • HW6: Natural Language Processing B. Part III of the NLP assignment. Due October 28 at 5:00 PM. Homework #5 and #6 Solutions. Language Model in HW6.
  • HW7: Generative adversarial networks (GANs). Homework Link Due November 11 at 5:00 PM. Homework #7 Solutions.
  • HW8: Deep reinforcement learning on Atari games. Homework Link. Due December 2. Homework #8 Solutions.
  • HW9: Video recognition I. Homework Link. Due TBA. Homework #9 Solutions.
  • HW10 (not assigned this year): Implement a deep learning model for image ranking. Homework #5 Details. Due October 18 at 5:00 PM.
  • HW11 (not assigned this year): Deep reinforcement learning on Atari games I using TensorFlow. 2017 version of this homework.
  • HW12 (not assigned this year): Deep reinforcement learning on Atari games II using TensorFlow. 2017 version of this homework.
  • Final Project: See Syllabus for a list of possible final projects. Due December 12. Examples of Final Projects: Image Captioning I, Faster RCNN, Image Captioning II, Deep Reinforcement Learning.
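To give a flavor of the from-scratch training HW1 asks for, here is a minimal sketch of a one-hidden-layer network trained with plain minibatch stochastic gradient descent. It is illustrative only: it runs on synthetic random data rather than MNIST, and the dimensions, learning rate, and step count are arbitrary choices, not the homework's specification.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(p, y):
    return -np.log(p[np.arange(len(y)), y] + 1e-12).mean()

# One-hidden-layer network: x -> ReLU(x W1 + b1) -> softmax(h W2 + b2)
d, H, K = 20, 32, 10                     # toy sizes; MNIST would use d = 784, K = 10
W1 = rng.standard_normal((d, H)) * 0.1; b1 = np.zeros(H)
W2 = rng.standard_normal((H, K)) * 0.1; b2 = np.zeros(K)

X = rng.standard_normal((256, d))        # stand-in for flattened images
y = rng.integers(0, K, size=256)         # stand-in for digit labels
lr = 0.1

def forward(xb):
    a = xb @ W1 + b1
    h = np.maximum(a, 0.0)               # ReLU hidden layer
    return a, h, softmax(h @ W2 + b2)

loss_before = cross_entropy(forward(X)[2], y)

for step in range(500):                  # plain minibatch SGD
    idx = rng.integers(0, len(X), size=32)
    xb, yb = X[idx], y[idx]
    a, h, p = forward(xb)
    # Backprop: gradient of softmax + cross-entropy is (p - onehot(y)) / batch size
    g = p.copy(); g[np.arange(len(yb)), yb] -= 1.0; g /= len(yb)
    dW2 = h.T @ g;   db2 = g.sum(0)
    dh = g @ W2.T;   dh[a <= 0] = 0.0    # gradient gated by the ReLU
    dW1 = xb.T @ dh; db1 = dh.sum(0)
    W1 -= lr * dW1; b1 -= lr * db1       # SGD parameter update
    W2 -= lr * dW2; b2 -= lr * db2

loss_after = cross_entropy(forward(X)[2], y)
print(loss_before, loss_after)           # training loss should drop as the net fits the data
```

The homework version adds the pieces omitted here: loading MNIST, iterating over proper epochs, and evaluating on the held-out Test Set.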
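HW2's requirement to write the convolutions by hand (rather than calling SciPy) amounts to an explicit sliding-window loop. A minimal sketch for a single channel, with an illustrative 4x4 input and 3x3 kernel chosen only for the example:

```python
import numpy as np

def conv2d(image, kernel):
    # Naive "valid" 2-D convolution with explicit loops, i.e. the kernel is
    # slid over every position where it fits entirely inside the image.
    # (Strictly this is cross-correlation, the convention deep learning uses.)
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

img = np.arange(16, dtype=float).reshape(4, 4)
k = np.ones((3, 3))
print(conv2d(img, k))   # [[45. 54.] [81. 90.]] -- each entry sums a 3x3 window
```

The homework's network extends this idea to multiple channels (one kernel per channel) and backpropagates through the same loops to compute kernel gradients.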
Examples of what will be implemented in the Homeworks

In HW9, a deep learning model is trained to predict the action occurring in a video solely from the raw pixels in the sequence of frames. The five most likely actions according to the model are reported (selected from a total of 400 possible actions).

In HW8, a deep learning model learns to play Atari video games using only the raw pixels in the sequence of frames (as a human would learn).