Course Goals and Instructional Objectives


Course Goals

The goal of the course is to prepare students for research and industrial positions in the processing of multimedia signals: signals that change over time, including audio and video. Through a set of carefully designed machine problems, students learn important tools in audio-visual signal processing, analysis, and synthesis, and their applications to biometrics, human-computer interaction, and multimedia indexing and search.

Instructional Objectives

  1. After Machine Problem 1 (MP1), Week 3 of the semester, the students should be able to: Understand speech features (especially cepstral coefficients) and nearest-neighbor pattern classifiers, and their applications to speech recognition and speaker identification.
  2. After MP2, Week 5 of the semester, the students should be able to: Understand principal component analysis and linear discriminant analysis, and their applications to face recognition.
  3. After MP3, Week 7 of the semester, the students should be able to: Understand maximum likelihood (ML) classifiers, Bayesian networks, and multimodal fusion, and their applications to audio-visual person identification.
  4. After MP4, Week 9 of the semester, the students should be able to: Understand hidden Markov models (HMMs), including algorithms for learning, inference, and decoding, and their application to audio-visual speech recognition.
  5. After MP5, Week 11 of the semester, the students should be able to: Understand 3D face modeling and animation, and their applications to speech-driven lip movement in an audio-visual avatar (synthetic talking head).
  6. MP6 for spring 2016 is currently being revised.
  7. MP7 for spring 2016 is currently being revised.