Topic |
Videos (on Canvas/Panopto) |
Course Materials |
|
Introduction to Reinforcement Learning |
Lecture 1
|
- Lecture 1 Draft Slides [Post class version]
- Additional Materials:
|
Tabular MDP planning |
Lecture 2
|
- Lecture 2 Slides (pre-class) [Post class, annotated]
- Additional Materials:
- SB (Sutton and Barto) Chp 3, 4.1-4.4
|
Tabular RL policy evaluation |
Lecture 3
|
- Lecture 3 Slides (pre-class) [Post class, with annotations]
- Additional Materials:
- SB (Sutton and Barto) Chp 5.1, 5.5, 6.1-6.3
- David Silver's Lecture 4 [link]
|
Q-learning |
Lecture 4
|
- Lecture 4 Slides (pre-class) [Post class, with annotations]
- Additional Materials:
- SB (Sutton and Barto) Chp 5.2, 5.4, 6.4-6.5, 6.7
|
Policy Gradient |
Lectures 5, 6
|
- Lecture 5 Slides [Post lecture with annotations]
- Lecture 6 Slides [Post class annotations]
- Lecture 7 Slides [Post class annotations]
- Additional Materials:
- SB (Sutton and Barto) Chp 13
|
Imitation Learning and Learning from Human Input |
Lecture Videos
|
- Lecture 7 Slides [Post class annotations]
- Lecture 8 Slides (pre-class) [Post class, with annotations]
- Lecture 9 Slides [Post class]
- Lecture 9 DPO Slides
- Lecture 10 Slides [Post class]
- Additional Materials:
|