Lecture Materials

Lecture materials for this course are given below. Note that the associated "refresh your understanding" and "check your understanding" polls will be posted weekly.

Topic	Videos (on Canvas/Panopto)	Course Materials
Introduction to Reinforcement Learning	Lecture 1	Lecture 1 Draft Slides [Post class version] Additional Materials: High-level introduction: SB (Sutton and Barto) Chp 1; Linear Algebra Review; Probability Review; Python Tutorial
Tabular MDP planning	Lecture 2	Lecture 2 Slides (pre-class) [Post class, annotated] Additional Materials: SB (Sutton and Barto) Chp 3, 4.1-4.4
Tabular RL policy evaluation	Lecture 3	Lecture 3 Slides (pre-class) [Post class, with annotations] Additional Materials: SB (Sutton and Barto) Chp 5.1, 5.5, 6.1-6.3; David Silver's Lecture 4 [link]
Q-learning	Lecture 4	Lecture 4 Slides (pre-class) [Post class, with annotations] Additional Materials: SB (Sutton and Barto) Chp 5.2, 5.4, 6.4-6.5, 6.7
Policy Gradient	Lecture 5, 6	Lecture 5 Slides [Post lecture with annotations] Lecture 6 Slides [Post class annotations] Lecture 7 Slides [Post class annotations] Additional Materials: SB (Sutton and Barto) Chp 13
Imitation Learning and Learning from Human Input and Batch RL	Lecture Videos	Lecture 7 Slides [Post class annotations] Lecture 8 Slides (preclass) [Post class with annotations] Lecture 9 Slides [Post class] Lecture 9 Guest lecture part slides Lecture 10 Slides [Post class] Additional Materials:
Data Efficient RL	Lecture Videos	Lecture 11 Slides [Post class annotations] Lecture 12 Slides (Tuesday Feb 18) [Post class annotations] Lecture 13 Slides [Post class annotations] Lecture 14 Slides [Post class annotations] Additional Materials: Bandit Algorithms Book Section 7.1
Ethics and Society Guest Lecture	Lecture Videos	Lecture 9 Part Guest Slides 1 Lecture 15 Guest Slides 2
Monte Carlo Tree Search and Conquering Go	Lecture Videos	Lecture 15 Draft slides [Post class with annotations]
LLMs and RL Guest Lecture and Wrap Up	Lecture Videos	Lecture 16 (pre class draft) [post class, annotated] Additional Materials: "Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning" [NeurIPS 2024].