
ECE 563 - Information Theory

Fall 2016

Summary of Lectures

Lecture 1: Outline of the course. Three main topics: (a) Information representation and compression. (b) Information communication. (c) Information principles in machine learning and statistics. This lecture covers (a) what information is and (b) how to quantify it. Information is revealed when one of a number of (discrete) possible items is chosen. A measure of information is then related to how likely we expect each item to be chosen. Axiomatic definition of entropy -- as motivated by Shannon in 1948. Properties of entropy -- concavity and monotonicity with respect to increased randomness. For a fixed alphabet size, entropy is maximized by the uniform distribution. Entropy is monotonic with respect to the partial order of majorization. References: Section 2.1 of the textbook and "A Mathematical Theory of Communication" by C. E. Shannon, Bell System Technical Journal, October 1948.
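
A minimal Python sketch (not part of the lecture notes) of the entropy H(X) = - sum_x p(x) log2 p(x) and of the fact that, for a fixed alphabet size, the uniform distribution maximizes it:

    import math

    def entropy(p):
        # Shannon entropy in bits; terms with p(x) = 0 contribute nothing.
        return -sum(px * math.log2(px) for px in p if px > 0)

    print(entropy([0.5, 0.5]))     # 1.0 bit: a fair coin
    print(entropy([0.9, 0.1]))     # about 0.469 bits: a biased coin
    print(entropy([0.25] * 4))     # 2.0 bits: uniform over four items, the maximum for |X| = 4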

Lecture 2: Codes: nonsingular, uniquely decodable (nonsingular under concatenation), and instantaneously decodable (prefix-free) codes. The Kraft inequality constrains the lengths of the codewords.
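
A quick sanity check of the Kraft inequality sum_i D^(-l_i) <= 1 on candidate codeword lengths (a sketch, assuming a binary code alphabet unless D is given):

    def kraft_sum(lengths, D=2):
        # Sum of D^(-l_i); it is at most 1 for any uniquely decodable code, and any
        # set of lengths satisfying this admits a prefix-free code.
        return sum(D ** (-l) for l in lengths)

    print(kraft_sum([1, 2, 3, 3]))   # 1.0  -> realizable by a prefix-free code (e.g. 0, 10, 110, 111)
    print(kraft_sum([1, 1, 2]))      # 1.25 -> violates the Kraft inequality; no such code exists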

Lecture 3: Proof of the Kraft inequality: it is necessary for all uniquely decodable codes and guarantees the existence of an instantaneously decodable code with the given lengths. The minimum average codeword length as a measure of efficiency; it lies between the entropy and the entropy plus one. Huffman code.
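
A sketch of binary Huffman coding that returns only the codeword lengths (the helper huffman_lengths and its tie-breaking counter are illustrative choices, not from the lectures); for the dyadic source below the average length equals the entropy exactly:

    import heapq

    def huffman_lengths(probs):
        # Codeword lengths of a binary Huffman code; each heap entry carries the total
        # probability, a tie-breaking counter, and the list of symbols under that subtree.
        heap = [(p, i, [i]) for i, p in enumerate(probs)]
        heapq.heapify(heap)
        lengths = [0] * len(probs)
        counter = len(probs)
        while len(heap) > 1:
            p1, _, s1 = heapq.heappop(heap)
            p2, _, s2 = heapq.heappop(heap)
            for sym in s1 + s2:          # every symbol under the merged node gets one bit deeper
                lengths[sym] += 1
            heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
            counter += 1
        return lengths

    probs = [0.5, 0.25, 0.125, 0.125]
    print(huffman_lengths(probs))                                      # [1, 2, 3, 3]
    print(sum(p * l for p, l in zip(probs, huffman_lengths(probs))))   # 1.75 bits = H(X) for this dyadic source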

Lecture 4: Concatenation of codes and near optimality for i.i.d. sources. This leads to the broader question of how to handle memory; as a first step, look at Markov sources. Introduction to the Shannon-Fano-Elias code.
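
A rough sketch of the Shannon-Fano-Elias construction for an i.i.d. source: the codeword for symbol x is the first ceil(log2(1/p(x))) + 1 bits of the midpoint F(x-1) + p(x)/2 of its interval (floating-point arithmetic is fine for a toy example, not for a real implementation):

    import math

    def sfe_code(probs):
        # Shannon-Fano-Elias code: the codeword for symbol x is the first
        # ceil(log2(1/p(x))) + 1 bits of Fbar(x) = F(x-1) + p(x)/2.
        codes, F = [], 0.0
        for p in probs:
            Fbar = F + p / 2
            l = math.ceil(math.log2(1 / p)) + 1
            bits, frac = '', Fbar
            for _ in range(l):           # binary expansion of Fbar, truncated to l bits
                frac *= 2
                bit = int(frac)
                bits += str(bit)
                frac -= bit
            codes.append(bits)
            F += p
        return codes

    print(sfe_code([0.25, 0.5, 0.125, 0.125]))   # ['001', '10', '1101', '1111'], a prefix-free code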

Lecture 5: Arithmetic coding as Shannon-Fano-Elias coding for Markov sources. Entropy rate of Markov chains.
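
For a stationary Markov chain the entropy rate is H = sum_i pi_i H(row i); a small sketch for the two-state case (the function name and the parameters a = P(0->1), b = P(1->0) are just for illustration):

    import math

    def entropy_rate_two_state(a, b):
        # Entropy rate of a stationary two-state Markov chain with P(0->1) = a and
        # P(1->0) = b: H = pi_0 * H(a) + pi_1 * H(b), with pi the stationary distribution.
        h = lambda p: 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)
        pi0, pi1 = b / (a + b), a / (a + b)
        return pi0 * h(a) + pi1 * h(b)

    print(entropy_rate_two_state(0.5, 0.5))   # 1.0 bit per symbol: the chain emits i.i.d. fair bits
    print(entropy_rate_two_state(0.1, 0.1))   # about 0.469 bits per symbol: a "sticky" chain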

Lecture 6: Information measures. Joint entropy, conditional entropy, KL divergence, mutual information.
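
A sketch computing I(X;Y) directly from a joint pmf via I(X;Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) ); the example joint distribution (uniform input through a binary symmetric channel with crossover 0.1) is illustrative:

    import math

    def mutual_information(pxy):
        # I(X;Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) ), with pxy a 2-D list.
        px = [sum(row) for row in pxy]
        py = [sum(col) for col in zip(*pxy)]
        return sum(p * math.log2(p / (px[i] * py[j]))
                   for i, row in enumerate(pxy)
                   for j, p in enumerate(row) if p > 0)

    pxy = [[0.45, 0.05],
           [0.05, 0.45]]
    print(mutual_information(pxy))   # about 0.531 bits = 1 - H(0.1)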

Lecture 7: Fano's inequality. Lower bound on the compression rate with vanishingly small error probability and worst-case codeword lengths. Lempel-Ziv compression.
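
A sketch of the LZ78-style incremental parsing step behind Lempel-Ziv compression: the string is split into the shortest phrases not seen before (dictionary indices and the final bit encoding are omitted):

    def lz78_parse(s):
        # Split s into the shortest phrases that have not appeared as phrases before
        # (any unfinished tail at the end of the string is dropped in this sketch).
        phrases, seen, current = [], set(), ''
        for c in s:
            current += c
            if current not in seen:
                seen.add(current)
                phrases.append(current)
                current = ''
        return phrases

    print(lz78_parse('ABBABBABBBAABABAA'))
    # ['A', 'B', 'BA', 'BB', 'AB', 'BBA', 'ABA', 'BAA']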

Lecture 8: Reliable communication -- the block diagram setting of Shannon. Rate and reliability of communication. Discrete stationary memoryless channels. Definition of capacity. Examples: the binary erasure channel and the binary symmetric channel. Simple upper bounds on the capacity of the channel.
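
A sketch of the definition C = max_{P_X} I(X;Y), evaluated by brute force over input distributions for two binary-input examples (the grid search is purely illustrative; the closed forms are derived in later lectures):

    import math

    def channel_mutual_information(px, pygx):
        # I(X;Y) for input distribution px and channel transition matrix pygx[x][y].
        py = [sum(px[x] * pygx[x][y] for x in range(len(px))) for y in range(len(pygx[0]))]
        return sum(px[x] * pygx[x][y] * math.log2(pygx[x][y] / py[y])
                   for x in range(len(px)) for y in range(len(pygx[0]))
                   if px[x] > 0 and pygx[x][y] > 0)

    def capacity_binary_input(pygx, steps=1000):
        # Brute-force the definition C = max_{P_X} I(X;Y) over a grid of input distributions.
        return max(channel_mutual_information([p, 1 - p], pygx)
                   for p in (i / steps for i in range(1, steps)))

    bsc = [[0.9, 0.1], [0.1, 0.9]]              # binary symmetric channel, crossover 0.1
    bec = [[0.7, 0.3, 0.0], [0.0, 0.3, 0.7]]    # binary erasure channel, erasure probability 0.3
    print(capacity_binary_input(bsc))           # about 0.531 = 1 - H(0.1)
    print(capacity_binary_input(bec))           # about 0.7   = 1 - 0.3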

Lecture 9: Upper bound on the rate of zero-error communication via the maximum mutual information over the channel. Fano's inequality and its use to extend the same bound to vanishing error probability.
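
A compressed version of the standard chain behind this converse (assuming a message W uniform over 2^{nR} values, channel input X^n, output Y^n, and a decoder W_hat = g(Y^n)):

    nR  = H(W) = I(W; Y^n) + H(W | Y^n)
       <= I(X^n; Y^n) + 1 + P_e * nR              (data processing and Fano's inequality)
       <= n * max_{P_X} I(X; Y) + 1 + P_e * nR    (the channel is memoryless)

so that R <= max_{P_X} I(X; Y) once P_e -> 0 and n -> infinity.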

Lecture 10: Example: the binary erasure channel. Calculate the upper bound explicitly (and also for the binary symmetric channel). Argue that a simple random linear code achieves capacity. Fountain codes and iterative decoding.
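
A sketch of the random-linear-code argument for the binary erasure channel: encode with a random k x n generator matrix over GF(2); the message is recoverable exactly when the columns surviving the erasures have full rank k, which happens with high probability whenever the rate k/n is below 1 - eps (the parameters below are arbitrary illustrative choices):

    import random

    def gf2_rank(vectors, width):
        # Rank over GF(2); each vector is an integer bitmask of the given width.
        basis = [0] * width
        rank = 0
        for v in vectors:
            for i in reversed(range(width)):
                if not (v >> i) & 1:
                    continue
                if basis[i] == 0:
                    basis[i] = v
                    rank += 1
                    break
                v ^= basis[i]
        return rank

    def bec_random_code_success_rate(n=200, eps=0.3, rate=0.6, trials=100):
        # Encode k = rate*n bits with a random k x n generator matrix over GF(2) and send
        # the codeword through a BEC(eps).  The message is recoverable exactly when the
        # surviving (unerased) columns of the generator matrix have full rank k.
        k = int(rate * n)
        successes = 0
        for _ in range(trials):
            columns = [random.getrandbits(k) for _ in range(n)]        # column j as a k-bit mask
            survivors = [c for c in columns if random.random() > eps]  # columns that were not erased
            successes += (gf2_rank(survivors, k) == k)
        return successes / trials

    random.seed(0)
    print(bec_random_code_success_rate(rate=0.6))   # close to 1: rate below capacity 1 - eps = 0.7
    print(bec_random_code_success_rate(rate=0.8))   # close to 0: rate above capacity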

Lecture 11: Properties of mutual information: concavity in P_X and convexity in P_Y|X. Calculate the capacity of example channels, especially the symmetric case.
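
For a symmetric channel (all rows permutations of one another and all columns permutations of one another) the uniform input is optimal and C = log2|Y| - H(row); a small sketch:

    import math

    def symmetric_channel_capacity(row):
        # Capacity of a symmetric DMC whose rows are permutations of 'row':
        # C = log2|Y| - H(row), achieved by the uniform input distribution.
        H_row = -sum(p * math.log2(p) for p in row if p > 0)
        return math.log2(len(row)) - H_row

    print(symmetric_channel_capacity([0.8, 0.1, 0.1]))   # about 0.663 bits per channel use
    print(symmetric_channel_capacity([0.9, 0.1]))        # BSC(0.1): about 0.531 bits per channel use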

Lecture 12: Jointly typical sequences. Achievability via random coding and typical-set decoding. Statement of Shannon's theorem on the capacity of a channel -- the fundamental limit of reliable communication.
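
A sketch of the membership test for the jointly typical set A_eps^(n): the empirical per-symbol log-probabilities of x^n, y^n and (x^n, y^n) must each be within eps of H(X), H(Y) and H(X,Y). The BSC(0.1) example below also shows a standard subtlety: the error-free pair is too probable to be jointly typical.

    import math

    def entropy(p):
        return -sum(v * math.log2(v) for v in p if v > 0)

    def is_jointly_typical(xs, ys, pxy, eps):
        # Jointly typical set test: the per-symbol log-probabilities of x^n, y^n and
        # (x^n, y^n) must each be within eps of H(X), H(Y) and H(X,Y).  Symbols are
        # integer indices into the joint pmf pxy (a 2-D list).
        n = len(xs)
        px = [sum(row) for row in pxy]
        py = [sum(col) for col in zip(*pxy)]
        HX, HY = entropy(px), entropy(py)
        HXY = entropy([p for row in pxy for p in row])
        if any(pxy[x][y] == 0 for x, y in zip(xs, ys)):
            return False                      # probability zero, certainly not typical
        lx = -sum(math.log2(px[x]) for x in xs) / n
        ly = -sum(math.log2(py[y]) for y in ys) / n
        lxy = -sum(math.log2(pxy[x][y]) for x, y in zip(xs, ys)) / n
        return abs(lx - HX) < eps and abs(ly - HY) < eps and abs(lxy - HXY) < eps

    # X uniform, Y the output of a BSC with crossover 0.1.
    pxy = [[0.45, 0.05],
           [0.05, 0.45]]
    xs = [0, 1, 0, 1, 0, 0, 1, 1, 0, 1]
    print(is_jointly_typical(xs, list(xs), pxy, eps=0.2))                # False: zero crossovers is "too probable"
    print(is_jointly_typical(xs, xs[:-1] + [1 - xs[-1]], pxy, eps=0.2))  # True: one crossover in ten symbols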