Statistics 311/Electrical Engineering 377 (Winter 2016)

Instructor: John Duchi (jduchi stanford edu)

Lectures: Tuesday/Thursday, 13:30--14:50, building 200 (History corner) room 205

TA: Hongseok Namkoong (hnamk stanford edu)

Office hours: John: Tuesday/Thursday, 15:00-16:00, 126 Sequoia Hall. Hong: Wednesday, 13:15-15:00, Packard 106.

Description: Information theory was developed to solve fundamental problems in the theory of communications, but its connections to statistical estimation and inference date nearly to the birth of the field. With their focus on fundamental limits, information-theoretic techniques have provided deep insights into optimal procedures for a variety of inferential tasks. In addition, the basic quantities of information theory--entropy, relative entropy, and their generalizations to other divergence measures such as f-divergences--are central in many areas of mathematical statistics and probability.
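For concreteness, here are the standard definitions of these quantities for distributions P and Q with probability mass functions p and q (the discrete case, for simplicity); the notation is standard but may differ slightly from that of the lecture notes:

\[
H(P) = -\sum_x p(x) \log p(x), \qquad
D_{\mathrm{kl}}(P \,\|\, Q) = \sum_x p(x) \log \frac{p(x)}{q(x)}, \qquad
D_f(P \,\|\, Q) = \sum_x q(x) \, f\!\left(\frac{p(x)}{q(x)}\right),
\]

where f is any convex function with f(1) = 0. Taking f(t) = t log t recovers the relative entropy (KL divergence), while f(t) = |t - 1|/2 gives the total variation distance.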

The applications of these tools are numerous. In mathematical statistics, for example, they allow characterization of optimal error probabilities in hypothesis testing, determination of minimax rates of convergence for estimation problems, and demonstration of equivalence between (ostensibly) different estimation problems, and they lead to penalized estimators and the minimum description length principle. In probability, they provide insights into the central limit theorem and large deviations theory (via Sanov's theorem and other results), and they appear in empirical process theory and concentration of measure. Information-theoretic techniques also arise in game playing, gambling, and stochastic optimization and approximation, among other areas.
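One concrete object that ties several of these threads together is the minimax risk. In standard notation (which may differ from the lecture notes), for estimating a parameter θ(P) over a family of distributions \(\mathcal{P}\) from an i.i.d. sample \(X_1, \ldots, X_n \sim P\), it is

\[
\mathfrak{M}_n(\theta(\mathcal{P}), \Phi \circ \rho)
:= \inf_{\widehat{\theta}} \sup_{P \in \mathcal{P}}
\mathbb{E}_P\Big[\Phi\big(\rho\big(\widehat{\theta}(X_1, \ldots, X_n), \theta(P)\big)\big)\Big],
\]

where ρ is a (semi)metric on the parameter space, Φ is a nondecreasing loss, and the infimum is over all estimators. Methods such as Le Cam's and Fano's (appearing late in the schedule below) lower bound this quantity by reducing estimation to hypothesis testing and controlling the testing error via divergence measures.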

In this course, we will study information-theoretic quantities and their connections to estimation and statistics in some depth, with applications to many of the areas above. Except to provide background, we will not cover standard information-theoretic topics such as source coding or channel coding, focusing instead on the probabilistic and statistical consequences of information theory.

Here is a more precise syllabus, including evaluation criteria and topics; we also have last year's course webpage.

The readings listed below are supplementary to the lectures and not required, but they should give additional perspective on the material.

Lectures

Full lecture notes: [pdf] I will post a fully linked and updated version of the lecture notes here as soon as they are prepared; the PDFs associated with each lecture are chapters within this document.

In the readings below, "LN" stands for Lecture Notes, while CT, T, V, BLM, etc., stand for the supplementary books and notes listed above.
January 5: Course overview and introduction. Reading: LN Chapters 1 and 2; CT, Chapter 2.
January 7: Review of information theory; divergences. Reading: LN Chapters 1 and 2; CT, Chapter 2. Assignment: Problem Set 1 out [pdf].
January 12: Concentration inequalities. Reading: LN Chapter 3. Assignment: project information out [pdf].
January 14: Concentration inequalities. Reading: LN Chapter 3.
January 19: Concentration inequalities. Reading: LN Chapter 3.
January 21: Concentration inequalities and generalization bounds. Reading: LN Chapter 4. Assignment: Problem Set 1 due (solutions: [pdf]).
January 26: Generalization bounds; coding and Kraft inequalities. Reading: LN Chapters 4 and 5. Assignment: Problem Set 2 out [pdf].
January 28: Exponential families; Fisher information. Reading: LN Chapters 6, 7, and 8.
February 2: Universal coding. Reading: LN Chapter 9.
February 4: Universal coding and sequential prediction. Reading: LN Chapters 9 and 10. Assignment: Problem Set 2 due (solutions: [pdf]).
February 9: Online convex optimization; bandit problems. Reading: LN Chapter 11. Assignment: Problem Set 3 out [pdf].
February 11: Online convex optimization; bandit problems. Reading: LN Chapters 11 and 12.
February 16: Exploration/exploitation and bandit problems. Reading: LN Chapters 11 and 12.
February 18: Exploration/exploitation and bandit problems. Reading: LN Chapters 11 and 12. Assignment: Problem Set 3 due (solutions: [pdf]).
February 23: Minimax lower bounds; Le Cam's method. Reading: LN Chapter 13; Tsybakov, Chapters 2.1-2.4.
February 25: Minimax lower bounds; Fano's method and packing. Reading: LN Chapter 13; Tsybakov, Chapters 2.5-2.7. Assignment: Problem Set 4 out [pdf].