## ECE 598 (Representation Learning: Algorithms and Models)

Department of Electrical and Computer Engineering

## Outline

Learning to represent real-world data, from images to text to proteins, is a basic scientific endeavor of great topical interest. Many of these data elements are naturally discrete (for example, words) and are readily modeled as discrete atomic units; however, this view fails to capture the relations between the discrete units. Distributional, real vector-valued representations, on the other hand, provide a geometric backdrop for describing relations between units. Such representations have been key to the success of several machine learning algorithms, especially those applied to natural language data. Successful examples include LSA (latent semantic analysis) for (coarse) text/document representation and, very recently, Word2Vec for (fine-grained) representation of words. Word2Vec has been extremely successful in a variety of natural language processing applications. The success comes from the geometry of the representations, which efficiently captures linguistic regularities: the semantic similarity of words is well captured by the similarity of the corresponding vector representations, and the latent relation between two words is well captured by the difference vector between the corresponding two vector representations.

In this course, we take a first-principles view of the key mathematical ideas behind the major representation learning algorithms of machine learning. The topics covered range from classical ones (viewing the classical representation learning algorithms PCA and CCA as maximum likelihood rules under suitable probabilistic models) to very recent ones (Word2Vec and its variants for learning representations of words).
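The linguistic regularities mentioned above can be illustrated with a small sketch. The toy vectors and the `analogy` helper below are illustrative assumptions, not trained embeddings or any standard API: semantic similarity is measured by cosine similarity, and a word analogy "a is to b as c is to ?" is resolved by finding the vocabulary word nearest to the difference-based target vector b - a + c.

```python
import numpy as np

# Toy 3-dimensional word vectors (hand-picked illustrative values,
# NOT trained Word2Vec embeddings).
vectors = {
    "king":  np.array([0.8, 0.9, 0.1]),
    "queen": np.array([0.8, 0.9, 0.9]),
    "man":   np.array([0.2, 0.1, 0.1]),
    "woman": np.array([0.2, 0.1, 0.9]),
}

def cosine(u, v):
    """Cosine similarity: the usual geometric proxy for semantic similarity."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(a, b, c, vocab):
    """Solve 'a : b :: c : ?' by the nearest neighbor to (b - a + c),
    excluding the three query words themselves."""
    target = vocab[b] - vocab[a] + vocab[c]
    candidates = {w: cosine(target, v)
                  for w, v in vocab.items() if w not in (a, b, c)}
    return max(candidates, key=candidates.get)

print(analogy("man", "king", "woman", vectors))  # -> queen
```

With these toy vectors, the third coordinate plays the role of a latent "gender" direction, so the difference vector king - man added to woman lands on queen, mirroring the behavior observed in trained Word2Vec embeddings.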