Course Goals and Instructional Objectives

Course Goals

The goal of the course is to prepare the students for research and industrial positions in the processing of multimedia signals: signals that change over time, including audio and video. Through a set of carefully designed machine problems, the student learns important tools in audio-visual signal processing, analysis, and synthesis, and their applications to biometrics, human-computer interaction, and multimedia indexing and search.

Instructional Objectives

After Machine Problem 1 (MP1), Week 3 of the semester, the students should be able to: Understand speech features (esp. cepstrum coefficients) and nearest-neighbor pattern classifiers and their applications to speech recognition and speaker identification.
After MP2, Week 5 of the semester, the students should be able to: Understand principal component analysis and linear discriminant analysis, and their applications to face recognition.
After MP3, Week 7 of the semester, the students should be able to: Understand maximum likelihood (ML) classifiers, Bayesian networks, and multimodal fusion, and their applications to audio-visual person identification.
After MP4, Week 9 of the semester, the students should be able to: Understand hidden Markov models (HMMs), including algorithms for learning, inference, and decoding, and their application to audio-visual speech recognition.
After MP5, Week 11 of the semester, the students should be able to: Understand 3D face modeling and animation and applications to speech-driven lip movement in an audio-visual avatar (synthetic talking head).
MP6 for spring 2016 is currently being revised.
MP7 for spring 2016 is currently being revised.

Useful Matlab Tutorial

MP1 (due February 4)
MP2 (due February 18)
Exam 1 (February 25)
MP3 (due March 3)
MP4 (due March 17)
Exam 2 (March 31)
MP5 (due April 7)
MP6 (due April 21)
MP7 (due May 5)
Exam 3 (Finals Week)