Course Goals and Instructional Objectives
Course Goals
The goal of the course is to prepare students for research and industrial positions in the processing of multimedia signals, that is, signals that change over time, including audio and video. Through a set of carefully designed machine problems, students learn important tools in audio-visual signal processing, analysis, and synthesis, and their applications to biometrics, human-computer interaction, and multimedia indexing and search.
Instructional Objectives
- After Machine Problem 1 (MP1), Week 3 of the semester, the students should be able to: Understand speech features (especially cepstral coefficients) and nearest-neighbor pattern classifiers, and their applications to speech recognition and speaker identification.
- After MP2, Week 5 of the semester, the students should be able to: Understand principal component analysis and linear discriminant analysis, and their applications to face recognition.
- After MP3, Week 7 of the semester, the students should be able to: Understand maximum likelihood (ML) classifiers, Bayesian networks, and multimodal fusion, and their applications to audio-visual person identification.
- After MP4, Week 9 of the semester, the students should be able to: Understand hidden Markov models (HMMs), including algorithms for learning, inference, and decoding, and their application to audio-visual speech recognition.
- After MP5, Week 11 of the semester, the students should be able to: Understand 3D face modeling and animation, and their applications to speech-driven lip movement in an audio-visual avatar (synthetic talking head).
- MP6 for spring 2016 is currently being revised.
- MP7 for spring 2016 is currently being revised.