Caltech

Control Meets Learning Seminar

Wednesday, January 13, 2021
9:00am to 10:00am
Online Event
The Provable Effectiveness of Policy Gradient Methods in Reinforcement Learning and Controls
Sham Kakade, Professor, Department of Computer Science and Department of Statistics, University of Washington

Reinforcement learning is the dominant paradigm for how an agent learns to interact with the world in order to achieve long-term objectives. Policy gradient methods are among the most effective methods for challenging reinforcement learning problems because they: are applicable to any differentiable policy parameterization; admit easy extensions to function approximation; easily incorporate structured state and action spaces; and are easy to implement in a simulation-based, model-free manner.

However, little is known about even their most basic theoretical convergence properties, including:

- do they converge to a globally optimal solution, say with a sufficiently rich policy class?

- how well do they cope with approximation error, say due to using a class of neural policies?

- what is their finite sample complexity?

This talk will cover a number of recent results on these basic questions and will also provide the first approximation results that do not have worst-case dependencies on the size of the state space. We will highlight the interplay of theory, algorithm design, and practice.

Joint work with: Alekh Agarwal, Jason Lee, Gaurav Mahajan

For more information, please contact Jolene Brink by email at [email protected] or visit the Control Meets Learning website.