Instructor: John Duchi (jduchi stanford edu)
Lectures: Tuesday/Thursday, 13:30--14:50, building 200 (History corner) room 205
TA: Hongseok Namkoong (hnamk stanford edu)
Office hours: John: Tuesday/Thurdsay 15:00 - 16:00 126 Sequoia Hall. Hong: Wednesday 13:15-15:00 Packard 106.
Description: Information theory was developed to solve
fundamental problems in the theory of communications, but its
connections to statistical estimation and inference date nearly
to the birth of the field. With their focus on fundamental
limits, information theoretic techniques have provided deep
insights into optimal procedures for a variety of inferential
tasks. In addition, the basic quantities of information
theory--entropy and relative entropy and their generalizations
to other divergence measures such as f-divergences--are
central in many areas of mathematical statistics and
probability.
The application of these tools are numerous. In mathematical
statistics, for example, they allow characterization of optimal error
probabilities in hypothesis testing, determination of minimax rates of
convergence for estimation problems, demonstration of equivalence
between (ostensibly) different estimation problems, and lead to
penalized estimators and the minimum description length principle. In
probability, they provide insights into the central limit theorem,
large deviations theory (via Sanov's theorem and other results), and
appear in empirical process theory and concentration of
measure. Information theoretic techniques also arise in game playing,
gambling, stochastic optimization and approximation, among other
areas.
In this course, we will study information theoretic quantities,
and their connections to estimation and statistics, in some
depth, showing applications to many of the areas above. Except
to provide background, we will not cover standard
information-theoretic topics such as source-coding or
channel-coding, focusing on the probabilistic and statistical
consequences of information theory.
Here is a more precise
syllabus
including evaluation criteria and topics, and we also
have
last year's course webpage.
Readings listed below are supplementary to the lectures
and not necessary, but should give additional
perspective on the material.
Cover and Thomas: Elements of Information Theory, Second Edition, Wiley 2006.
Tsybakov: Introduction to Nonparametric Estimation, Springer 2009.
Vershynin: Introduction to the non-asymptotic analysis of random matrices, 2012.
Boucheron, Lugosi, Massart: Concentration Inequalities: a Nonasymptotic Theory of Independence, Oxford University Press, 2013.
Date | Topics | Reading | Assignments | |
January 5 |
Course overview and introduction | LN Chapters 1 and 2
CT, Chapter 2 |
||
January 7 | Review of information theory
Divergences |
LN Chapters 1 and 2 CT, Chapter 2 |
Problem Set 1 Out [pdf] |
|
January 12 | Concentration inequalities | LN Chapter 3 | Project information out
[pdf] |
|
January 14 | Concentration inequalities | LN Chapter 3 | ||
January 19 | Concentration inequalities | LN Chapter 3 | ||
January 21 | Concentration inequalities and generalization bounds |
LN Chapter 4 | Problem set 1 due
Solutions: [pdf] |
|
January 26 | Generalization bounds Coding and Kraft inequalities |
LN Chapters 4 and 5 | Problem set 2 out
[pdf] |
|
January 28 | Exponential families Fisher information |
LN Chapters 6, 7, 8 | ||
February 2 | Universal coding | LN Chapter 9 | ||
February 4 | Universal coding and sequential prediction | LN Chapters 9, 10 |
Problem set 2 due Solutions: [pdf] |
|
February 9 | Online convex optimization Bandit problems |
LN Chapter 11 | Problem set 3 out [pdf] |
|
February 11 | Online convex optimization Bandit problems |
LN Chapters 11, 12 | ||
February 16 | Exploration/exploitation and bandit problems | LN Chapters 11, 12 | ||
February 18 | Exploration/exploitation and bandit problems | LN Chapters 11, 12 |
Problem set 3 due Solutions: [pdf] |
|
February 22 | Minimax lower bounds
Le Cam's method |
LN Chapter 13 Tsybakov Chapters 2.1-2.4 |
||
February 25 | Minimax lower bounds
Fano's method and packing |
LN Chapter 13 Tsybakov Chapters 2.5-2.7 |
Problem set 4 out [pdf] |