Homework 5: Available 11/06/2016, Due 11/19/2016.
NOTE: Only the software portion can be done in teams of up to three people; your report should be your own work.
Handing in the assignment on Compass:
  1. Report in PDF format. File name should be net_id_hw5_report (e.g. yeh17_hw5_report.pdf)
  2. Code for both code-from-scratch and TensorFlow in TGZ or ZIP format. File name should be net_id_hw5_code (e.g. yeh17_hw5_code.zip)
  3. Include your team members' names in the submission description and report.
  4. Make sure the uploaded file is correct. Missing or corrupted files will be considered late. (You can download the uploaded file to verify it.)

Data

MNIST handwritten digit database [Link]. TensorFlow also has an API for downloading and loading the data [Link].

Software Requirements

  1. Do NOT use ipython notebook.
  2. Modularize your code; a single script is difficult to read.
  3. Use a relative path when accessing data. Clearly write in the README file where to store the data relative to the base directory.

TensorFlow

In this portion of the assignment, you will use a vanilla RNN and an LSTM to perform digit classification on the MNIST dataset.
  1. Setting 1 (Sequence of Pixels): It is assumed that each $28 \times 28$ image, $x$, in the MNIST dataset is a sequence of single pixels, $x(1), x(2), \ldots, x(784)$, where $x(t)$ is a single scalar value. The network reads one pixel at a time from the top left corner of the image to the bottom right of the image. Note: If you are running out of memory, you can downsample the image to $14 \times 14$ or $7 \times 7$; state the image size you used in the report.
  2. Setting 2 (Sequence of Columns): It is assumed that each $28 \times 28$ image, $x$, in the MNIST dataset is a sequence of vectors, $x(1), x(2), \ldots, x(28)$, where $x(t)$ is a $28 \times 1$ vector representing one column in the image. The network reads one column at a time from left to right.
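As a rough illustration of the two settings, the NumPy sketch below turns one image into each kind of sequence. It assumes a row-by-row reading order for Setting 1 (the text above only says top left to bottom right) and uses average pooling over non-overlapping blocks for downsampling, which is one possible choice and not something the assignment prescribes. The function names are hypothetical.

```python
import numpy as np

def to_pixel_sequence(image):
    """Setting 1: one pixel per time step, read row by row (assumed order).

    Returns an array of shape (H*W, 1), so sequence[t] is the scalar x(t+1).
    """
    return image.reshape(-1, 1)

def to_column_sequence(image):
    """Setting 2: one column per time step, read left to right.

    Returns an array of shape (W, H), so sequence[t] is column t of the image.
    """
    return image.T.copy()

def downsample(image, factor):
    """Average-pool over non-overlapping factor x factor blocks (assumed method)."""
    h, w = image.shape
    blocks = image.reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

# Stand-in for one MNIST image; real data would come from the dataset loader.
image = np.arange(28 * 28, dtype=np.float32).reshape(28, 28)

pixels = to_pixel_sequence(image)      # shape (784, 1)
columns = to_column_sequence(image)    # shape (28, 28)
small = downsample(image, 2)           # shape (14, 14)
small_pixels = to_pixel_sequence(small)  # shape (196, 1)
```

With downsampling by a factor of 2, the Setting 1 sequence shrinks from 784 to 196 steps, which is the main reason it helps with memory.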
Train a basic (vanilla) RNN and an LSTM for each of the two settings, using a single-layer RNN and a single-layer LSTM with 100 hidden nodes each. Perform classification on the last frame using cross-entropy loss.
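To make "classification on the last frame" concrete, here is a minimal NumPy sketch of a vanilla RNN forward pass that feeds only the final hidden state to a softmax classifier and computes the cross-entropy loss. It is an illustration under assumptions, not the required implementation: the tanh nonlinearity, the zero initial state, the weight initialization, and all function and parameter names are choices made for this example. It covers the forward pass only (no training) and does not cover the LSTM.

```python
import numpy as np

def init_params(input_dim, hidden_dim=100, num_classes=10, seed=0):
    """Small random weights; the scale 0.1 is an arbitrary assumption."""
    rng = np.random.default_rng(seed)
    scale = 0.1
    return (
        rng.normal(0.0, scale, (input_dim, hidden_dim)),    # W_xh
        rng.normal(0.0, scale, (hidden_dim, hidden_dim)),   # W_hh
        np.zeros(hidden_dim),                               # b_h
        rng.normal(0.0, scale, (hidden_dim, num_classes)),  # W_hy
        np.zeros(num_classes),                              # b_y
    )

def rnn_last_frame_loss(x_seq, label, params):
    """Run the RNN over x_seq (shape (T, input_dim)) and classify from h(T).

    Returns the cross-entropy loss for the integer label and the
    log-probabilities over the classes.
    """
    W_xh, W_hh, b_h, W_hy, b_y = params
    h = np.zeros(W_hh.shape[0])  # zero initial hidden state (assumed)
    for x_t in x_seq:
        h = np.tanh(x_t @ W_xh + h @ W_hh + b_h)
    # Only the last hidden state reaches the output layer.
    logits = h @ W_hy + b_y
    logits = logits - logits.max()  # shift for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[label], log_probs

# Setting 2 shapes: 28 time steps, each a 28-dimensional column.
params = init_params(input_dim=28)
x_seq = np.random.default_rng(1).random((28, 28))
loss, log_probs = rnn_last_frame_loss(x_seq, label=3, params=params)
```

For Setting 1 the same function applies with `input_dim=1` and a sequence of shape `(784, 1)`; only the sequence length and input size change.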

Relevant TensorFlow Docs:
TensorFlow provides an API for recurrent neural networks; please use the following:
  1. tf.nn.rnn
  2. tf.nn.rnn_cell
What to turn in: