ECE398BD: Fundamentals of Machine Learning (Labs)

Schedule

      | Topic                            | Lab    | Quiz   | Assigned | Due              | Feedback for the Week
Lab 1 | Introduction to Python           | [link] | None   | Jan 20   | Not Graded       | No Feedback
Lab 2 | Classification, Part 1           | [link] | [link] | Jan 27   | Feb 4, 12:00 AM  | [link]
Lab 3 | Classification, Part 2           | [link] | [link] | Feb 3    | Feb 11, 12:00 AM | [link]
Lab 4 | Linear Regression and Clustering | [link] | [link] | Feb 10   | Feb 18, 12:00 AM | [link]
Lab 5 | Principal Component Analysis     | [link] | [link] | Feb 17   | Feb 25, 12:00 AM | [link]


A sample quiz is available [here].

Hints, Errata and Feedback

If any changes, hints or commentary are needed for the labs, they will be provided here.

Lab 5

There will be office hours next Monday at the usual time.

You are highly encouraged to attend the talk by Dr. Andy Feng (Yahoo!) between 4:30 PM and 5:30 PM in the NCSA Auditorium. It is related to topics in the course such as neural networks (which Prof. Minh Do will cover) and large-scale learning. The CSL Student Conference also has other talks related to machine learning.

The quiz will start at 4:00 PM sharp. Bring a ruler.

Submit this lab by email to ligo2 [at] illinois [dot] edu.

This is the last lab of this section of the course. The lab session on February 24, 2016 will be dedicated to the first lab of the Social Network Analysis portion of the course.

Your new TA will be Peter Kairouz.

We will be using Canopy 1.6.1 for this lab. You can load it by running

module load canopy/1.6.1

on the EWS machines or FastX.

Some notes:

  • scikit-learn's PCA implementation automatically handles non-zero-mean features by internally subtracting out the mean. The mean is subtracted manually in problem 2 so that the eigenfaces look a bit prettier. You can assume the data in problems 2 and 3 has been appropriately pre-processed, so you can feed it directly into scikit-learn's PCA implementation.

  • In order to compress the image X_t using i principal components, use the following process (or an equivalent one); a short sketch using scikit-learn is given after these notes:

  1. Determine the PCA transformation (i.e. fit) on X.

  2. Transform X_t to the PCA features determined by X.

  3. Retain the first i PCA features of the transformed X_t (set the others to zero).

  4. Transform the result of step 3 back to the original feature space.

  • The rows of the PCA transformation matrix W are the principal components (represented as row vectors, i.e. the transpose of the principal components). Remember that principal components are ordered in decreasing amount of variance explained.

  • In problem 3, use scikit-learn's PCA implementation.

  • numpy.fliplr, numpy.flipud, or a review of slicing may be useful for your PCA-via-eigendecomposition code (for example, to reorder eigenvalues and eigenvectors from ascending to descending order).
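
For concreteness, here is a minimal sketch of the compression process described in the notes above, using scikit-learn's PCA. It assumes X and X_t are 2-D arrays with one sample per row and that i is the number of principal components to keep; adapt the names to whatever your notebook uses.

from sklearn.decomposition import PCA

pca = PCA()
pca.fit(X)                                   # step 1: fit the PCA transformation on X

Z = pca.transform(X_t)                       # step 2: transform X_t to the PCA features determined by X
Z[:, i:] = 0                                 # step 3: keep the first i PCA features, zero out the rest
X_t_compressed = pca.inverse_transform(Z)    # step 4: map back to the original feature space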

Lab 4

Bring a ruler to the quiz (optional). Submit this lab by email to ligo2 [at] illinois [dot] edu.

Some hand-written notes on vector quantization are given here, in case you need additional clarification beyond the slides from class.

Be sure to complete this lab early, so you are not behind next week. You will be changing topics to Social Network Analysis the week after (Feb. 24, 2016).

Some of you are close to (or at) your late policy quota. Be sure to keep track of how much time you have left. If you are not sure, ask.

Some notes:

  • In problem 1, your convergence criterion for K-means is simply to iterate between assigning points to clusters and updating cluster centers niter times (a rough sketch is given after these notes).

  • Your whole notebook should run in well under 5 minutes on a modern PC.

  • If you want to display multiple images in one cell (e.g. for your Vector Quantizer), you can do

figure()
imshow(image, cmap=cm.Greys_r)

for each image you want to display.
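
Regarding the K-means note above, here is a rough sketch of the "iterate niter times" loop. This is not the lab's required interface; the names data, K, and niter are placeholders, and initializing the centers with random data points is just one reasonable choice.

import numpy as np

def kmeans(data, K, niter):
    # Initialize centers with K randomly chosen data points.
    centers = data[np.random.choice(data.shape[0], K, replace=False), :].astype(float)
    for _ in range(niter):
        # Assignment step: label each point with its nearest center.
        dists = np.linalg.norm(data[:, np.newaxis, :] - centers[np.newaxis, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # Update step: move each center to the mean of the points assigned to it.
        for k in range(K):
            if np.any(labels == k):
                centers[k, :] = data[labels == k, :].mean(axis=0)
    return centers, labels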

Lab 3

Bring a ruler to the quiz (required). Submit this lab by email to ligo2 [at] illinois [dot] edu.

Make sure to read the instructions for each problem, name your ipynb file with your netid, and fill in the corresponding field in your ipynb. You will submit this lab via email.

Note that your answers for the best parameters in Problem 2 may differ each time you run it; this is due to the algorithm that computes the SVM parameters. However, your error estimates should be approximately constant.

In problem 2, do not display all the cross-validated error estimates for the linear SVM and the SVM with Gaussian RBF kernel. Your code still needs to calculate them in order to determine which parameter is best and the corresponding error; do this via code, not by hand.

Some hints:

  1. Problem 1

    1. You can use numpy.setdiff1d to get the indices corresponding to the data with a fold removed. Example: numpy.setdiff1d(numpy.r_[:N],numpy.r_[S:T]) gives you the numbers 0,1,…,N-1 with S,S+1,…,T-1 removed. You can use this to get the indices of data not in a fold (by appropriately specifying N,S,T). Then, trainData[indices,:] gives you the data corresponding to indices (which you can build with setdiff1d) and trainLabels[indices] gives you the corresponding labels. Similarly, you can extract the data in a fold. A short sketch is given after these hints. See also this link from Lab 1.

    2. If you're using .score() on your classifier, note that it returns accuracy (i.e. 1 - error), not the error. If not, you may want to re-use your classifier error function from Lab 2.

  2. Problem 2

    1. Read the problem statement: you're supposed to use cross-validation from sklearn.

    2. mean is a synonym for average.

    3. sklearn.lda.LDA implements LDA (as you saw in Lab 2).

    4. In the last part of the problem, pick the best classifier among the ones you have tried on this dataset (Linear SVM, SVM with Gaussian RBF, LDA), train it on all the training data, and then run it on the test data to estimate its real-world performance. If you have used the test data at any point prior, you have done something very wrong.
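
To make hint 1.1 above concrete, here is a small sketch of splitting off one fold. The names trainData, trainLabels, nfolds, and fold are assumed, and the folds are taken to be equal-sized consecutive blocks of rows.

import numpy as np

N = trainData.shape[0]
foldsize = N // nfolds                           # assumes N is divisible by nfolds
S, T = fold * foldsize, (fold + 1) * foldsize    # the current fold occupies rows S,...,T-1

heldidx  = np.r_[S:T]                            # indices in the fold
trainidx = np.setdiff1d(np.r_[:N], np.r_[S:T])   # indices with the fold removed

heldData, heldLabels   = trainData[heldidx, :],  trainLabels[heldidx]
otherData, otherLabels = trainData[trainidx, :], trainLabels[trainidx]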

Lab 2

Please upload your completed labs to Compass and email them to ligo2 [at] illinois [dot] edu (this is a just-in-case measure).

Errata:

  • In the third hint for problem 1, replace m with N (or N with m) so that the notation is consistent.

  • In problem 2, the code for the scatter plot should have vallabels!=estimatedvallabels, rather than what is written.

  • In problem 4, you need to calculate the validation error for LDA and kNN in order to justify your choice of classifier. Do not base it solely on run time!

Some hints:

  1. Problem 1

    1. In the lecture notes, the Bayes classifier is written so that you take in one feature vector x and output its estimated label. If you are trying to vectorize the code, it is not as simple as dropping in the means matrix and the data matrix where mu and x occur.

    2. As stated at the beginning of lab, you are not required to vectorize your code. It is nice if you do, since vectorization is a useful skill, but otherwise use a double for loop (one over the rows of the data matrix, i.e. the input vectors, and one over the rows of the means matrix, i.e. the class means) and directly implement the equation from the notes for the multivariate Gaussian Bayes classifier. Do not get hung up on vectorizing; if you don't get it, use for loops. A small sketch of the looped version is given after these hints.

      1. data[i,:] is the feature vector in the i-th row of the data matrix. means[y,:] is the mean vector for class y. Using this information, you should be able to make a (V,M) numpy array such that the (i,y)-th entry consists of the quantity the Bayes classifier calculates for the input vector in row i in the data matrix corresponding to class y. Then, you just need to do the argmax.

    3. When you vectorize, you have to write things in terms of matrix operations. This means that when you multiply two matrices, you have to know what the entries are, and how to combine them in order to get what you want.

    4. Your error for the Bayes classifier should be quite small. If you are still not sure, you can generate two Gaussians with different means as your classes and see what happens (from ECE313, you should be able to calculate the probability of error).
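
For hint 1.2, here is a minimal sketch of the looped (non-vectorized) approach. It assumes equal class priors and a single shared covariance matrix cov, so only the quadratic term matters in the argmax; adjust the quantity inside the loop to whatever the notes specify. Here data is (V, d) and means is (M, d).

import numpy as np

covinv = np.linalg.inv(cov)
V, M = data.shape[0], means.shape[0]

scores = np.zeros((V, M))
for i in range(V):                 # loop over rows of the data matrix (input vectors)
    for y in range(M):             # loop over rows of the means matrix (class means)
        diff = data[i, :] - means[y, :]
        scores[i, y] = -0.5 * diff.dot(covinv).dot(diff)

est_labels = np.argmax(scores, axis=1)   # pick, for each row, the class maximizing the quantity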

Lab 1

Please follow the Python instructions to get started with IPython notebooks.

The following other Python tutorials may be helpful:

And a few links to write code concisely: