Due Date

Homework 1 was released 8/28/2016 and is due 9/16/2016.

Data

The dataset is here. The data include audio waveforms (and cepstral coefficients) of spoken letters of the alphabet. The task is to separate the ee-set (b,c,d,e,g,p,t,v,z) from the eh-set (f,l,m,n,s,x). The dataset is divided into train, dev (development test), and eval (evaluation test) sets. Read the file README.txt for more information. Note: the dataset contains some corrupted examples; skip over them.
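As a rough illustration of the labeling and filtering described above, here is a minimal Python sketch. The two letter sets come from the task description; everything else is an assumption made for the example: the `(letter, feature_list)` input layout, the helper names, and the definition of "corrupted" as a wrong-length vector or one containing NaN/inf values. README.txt describes the real file format and should take precedence.

```python
import math

# Letter groups taken from the task description.
EE_SET = set("bcdegptvz")
EH_SET = set("flmnsx")


def target_for(letter, positive=1, negative=-1):
    """Return the target for a letter: ee-set -> positive, eh-set -> negative.

    Returns None for letters in neither set, so callers can skip them.
    """
    letter = letter.lower()
    if letter in EE_SET:
        return positive
    if letter in EH_SET:
        return negative
    return None


def is_usable(features, expected_dim):
    """Assumed corruption test: wrong length or any NaN/inf value."""
    return (len(features) == expected_dim
            and all(math.isfinite(v) for v in features))


def build_examples(raw, expected_dim, positive=1, negative=-1):
    """Keep only (features, target) pairs that pass both checks.

    `raw` is a hypothetical list of (letter, feature_list) pairs.
    """
    examples = []
    for letter, features in raw:
        target = target_for(letter, positive, negative)
        if target is not None and is_usable(features, expected_dim):
            examples.append((features, target))
    return examples
```

Passing `negative=0` gives {+1,0} targets instead of {+1,-1}, which is convenient when the same data feeds classifiers with different target conventions.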

Pencil-and-paper

Suppose that you have a one-layer neural network, of the form $y_i=g(w'x_i+b)$, where $g()$ is some nonlinearity, $b$ is a trainable scalar bias parameter, and $w'x_i$ means the dot product between the trainable weight vector, $w$, and the $i^{th}$ training vector, $x_i$. Suppose you have a training corpus of the form $D=\{(x_1,t_1),...,(x_n,t_n)\}$. Turn in your derivations of the following 5 things.

  1. Find the derivatives $\frac{dE}{dw_j}$ and $\frac{dE}{db}$, where $E=\sum_i((t_i-y_i)^2)$. Your answer should include the derivative of the nonlinearity, $g'(w'x_i+b)$.
  2. Suppose $g(a)=a$. Write $\frac{dE}{dw_j}$ without $g'()$.
  3. Suppose $g(a)=\frac{1}{1+\exp(-a)}$. In this case, $g'(w'x_i+b)$ can be written as a simple function of $y_i$. Write it that way.
  4. Repeat part 1 using the perceptron error instead: $E=\sum_i\max(0,-(w'x_i+b)\cdot t_i)$.
  5. Repeat part 1 using the SVM error instead: $E=\|w\|_2^2+C\cdot\sum_i h(x_i,t_i)$, where $h(x_i,t_i)=\max(0,1-t_i(w'x_i+b))$ is the hinge loss, $C$ is an arbitrary constant, and you can assume that $t_i$ is either $+1$ or $-1$.
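One way to sanity-check a hand derivation is to compare it against a finite-difference estimate of the same derivatives. The sketch below is only an illustration: the function names, the step size `eps`, and the toy data are all made up for the example, and it estimates the gradients numerically rather than stating the analytic answers.

```python
import math


def sum_squared_error(w, b, xs, ts, g):
    """E = sum_i (t_i - g(w'x_i + b))^2 for a one-layer network."""
    total = 0.0
    for x, t in zip(xs, ts):
        a = sum(wj * xj for wj, xj in zip(w, x)) + b
        total += (t - g(a)) ** 2
    return total


def numerical_gradient(error_fn, w, b, eps=1e-6):
    """Central-difference estimates of dE/dw_j (list) and dE/db (float)."""
    grad_w = []
    for j in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[j] += eps
        w_minus[j] -= eps
        grad_w.append((error_fn(w_plus, b) - error_fn(w_minus, b)) / (2 * eps))
    grad_b = (error_fn(w, b + eps) - error_fn(w, b - eps)) / (2 * eps)
    return grad_w, grad_b


def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))


# Toy data, made up for illustration only.
xs = [[1.0, 2.0], [0.5, -1.0], [-1.5, 0.3]]
ts = [1.0, 0.0, 1.0]
w, b = [0.2, -0.4], 0.1

grad_w, grad_b = numerical_gradient(
    lambda w_, b_: sum_squared_error(w_, b_, xs, ts, logistic), w, b)
print(grad_w, grad_b)
```

If your analytic expressions are correct, evaluating them at the same `w` and `b` should agree with these estimates to several decimal places. The perceptron and hinge errors have kinks where the max switches branches, so finite differences are only trustworthy away from those points.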

Code-from-scratch

This part of the homework can be written in any programming language you like; we recommend Matlab or Python.

Write a one-layer neural net that takes a single cepstrum as input, and classifies it as ee-set versus eh-set. Write general code for training (using gradient descent), and for testing (using both MSE and classification accuracy). Use your code to train four classifiers: linear, logistic, perceptron, and linear SVM. Note that you'll have to define the targets $t_i$ to be either {+1,-1} or {+1,0}, depending on the classifier.
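To make the overall shape of the training and testing code concrete, here is a minimal sketch for just one of the four classifiers, the linear one ($g(a)=a$) with {+1,-1} targets. It is not a reference solution: the function names, learning rate, epoch count, and toy data are assumptions for illustration, and it uses plain Python lists to stay dependency-free, where NumPy arrays would be the more natural choice on real cepstral data.

```python
def predict(w, b, x):
    """Linear output y = w'x + b (the g(a) = a case)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear(xs, ts, learning_rate=0.01, epochs=200):
    """Batch gradient descent on E = sum_i (t_i - y_i)^2 with y_i = w'x_i + b."""
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, t in zip(xs, ts):
            residual = t - predict(w, b, x)
            # Accumulate dE/dw_j = -2 * sum_i (t_i - y_i) * x_ij
            # and dE/db = -2 * sum_i (t_i - y_i).
            for j, xj in enumerate(x):
                grad_w[j] -= 2.0 * residual * xj
            grad_b -= 2.0 * residual
        # Step against the gradient.
        w = [wj - learning_rate * gj for wj, gj in zip(w, grad_w)]
        b -= learning_rate * grad_b
    return w, b


def evaluate(w, b, xs, ts):
    """Return (mean squared error, classification accuracy) for +/-1 targets."""
    outputs = [predict(w, b, x) for x in xs]
    mse = sum((t - y) ** 2 for t, y in zip(ts, outputs)) / len(ts)
    # An output >= 0 is read as the +1 class, otherwise the -1 class.
    correct = sum((y >= 0) == (t > 0) for t, y in zip(ts, outputs))
    return mse, correct / len(ts)


# Toy, linearly separable data made up for illustration.
xs = [[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]]
ts = [1, 1, -1, -1]
w, b = train_linear(xs, ts)
mse, accuracy = evaluate(w, b, xs, ts)
print(mse, accuracy)
```

The other three classifiers can reuse the same loop by swapping in a different output function and gradient; with {+1,0} targets the decision threshold in `evaluate` would need to change accordingly (for example, 0.5 on a logistic output).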

What to turn in:

TensorFlow

Train a one-layer neural net, with logistic output nodes, using TensorFlow. Hint: this will be very similar to MNIST for beginners, but with different data. This part of the assignment should be written in Python.

Note that a logistic node is equivalent to a two-class softmax node. Cross-entropy loss is not the same as MSE loss; you can use whichever one you prefer.
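Both remarks can be checked numerically. The sketch below (helper names are illustrative) compares a logistic output with the first output of a two-class softmax whose second score is fixed at 0, then evaluates cross-entropy and squared error on the same prediction to show that they are different quantities.

```python
import math


def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(t, y):
    """Binary cross-entropy for a target t in {0, 1} and output y in (0, 1)."""
    return -(t * math.log(y) + (1 - t) * math.log(1 - y))


def squared_error(t, y):
    return (t - y) ** 2


for a in (-3.0, 0.0, 0.5, 4.0):
    # Softmax over the scores [a, 0] gives the same first output as logistic(a).
    assert abs(softmax([a, 0.0])[0] - logistic(a)) < 1e-12

y = logistic(0.5)
print(cross_entropy(1, y), squared_error(1, y))
```

More generally, a two-class softmax over scores $a_1$ and $a_2$ depends only on their difference: its first output equals the logistic function applied to $a_1-a_2$.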

What to turn in: