Textbooks and Useful Links
Textbook = Articles
In place of textbooks this semester, we will refer primarily to articles published in academic journals. Each week, you will be expected to read and understand one of these articles. Exams will primarily cover the material in the articles.
- Week 1: A.V. Oppenheim, R.W. Schafer and T.G. Stockham, Jr., Nonlinear filtering of multiplied and convolved signals, IEEE Trans. Audio and Electroacoustics 16(3):437-466, 1968, Section IV. Exam will cover pp. 437-442, p. 444, and pp. 460-464.
- Week 2: Steven Davis and Paul Mermelstein, Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences, IEEE Transactions on Acoustics, Speech and Signal Processing 28(4):357-366, 1980. Exam covers only Fig. 1 and Eq. (1).
- Week 3: Matthew Turk and Alex Pentland, Eigenfaces for Recognition, Journal of Cognitive Neuroscience 3(1):71-86, 1991
- Week 5 (Bayesian Classifiers): J. Neyman and E.S. Pearson, On the Problem of the Most Efficient Tests for Statistical Hypotheses, Philosophical Transactions of the Royal Society of London, Series A, 231:289-337, 1933 (read pages 289-303)
- Week 6 (Gaussian Mixture Models): Biing-Hwang Juang, Stephen E. Levinson and Man Mohan Sondhi, Maximum Likelihood Estimation for Multivariate Mixture Observations of Markov Chains, IEEE Trans. Information Theory 32(2):307-309, 1986 (you are responsible only for understanding Equation 1; the rest of the article describes what is going on in the code that we provided for you).
- Week 7 (Hidden Markov Models): Lawrence R. Rabiner, A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition, Proc. IEEE 77(2):257-286, 1989. You should read at least pages 257-264, 266-267, 272-273, and 278-279.
- Week 9: Barycentric Coordinates and Bilinear Interpolation
- Week 12 (Boosting and Object Detection): Paul Viola and Michael Jones, Rapid Object Detection using a Boosted Cascade of Simple Features, CVPR 2001. Optional: Freund and Schapire, A decision-theoretic generalization of on-line learning and an application to boosting, 1995.
Lecture Notes
Professor Huang's lecture notes are here.
Recommended Texts
- This book covers more of the course content than any other single text, and may even be adequate for the whole course: Theodoridis and Koutroumbas, Pattern Recognition
- This book is relevant to MP1 and MP4: L. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition, Prentice-Hall, 1993.
- This book is relevant to MP2 and MP6: Al Bovik (Ed.), The Essential Guide to Image Processing, Elsevier, 2nd Ed., 2009.
- Al Bovik (Ed.), The Essential Guide to Video Processing, Elsevier, 2nd Ed., 2009.
- This book is relevant to MP3 through MP7: R.O. Duda, P.E. Hart, and D.G. Stork, Pattern Classification, 2nd Ed. Wiley, 2001.
Useful Links
- Some useful Matlab tips compiled by Kevin Murphy: http://www.cs.ubc.ca/~murphyk/Software/matlab_tips.html
- Mark Hasegawa-Johnson, Lecture Notes in Speech Production, Speech Coding, and Speech Recognition, 1999
- Mike Brooks, Voicebox Toolbox for Matlab
- MIT Open Courseware Introduction to Matlab