DOLCIT Seminar

Thursday, June 20, 2019
4:00pm to 5:00pm
Annenberg 213
Optimization Aspects of Temporal Abstraction in Reinforcement Learning
Pierre-Luc Bacon, Postdoctoral Scholar, Stanford Artificial Intelligence Lab, Stanford University

Temporal abstraction refers to the idea that complicated sequential decision-making problems can sometimes be simplified by considering the "big picture" first. In this talk, I will give an overview of some of my work on learning such temporal abstractions end-to-end within the "option-critic" architecture (Bacon et al., 2017). I will then explain how other related hierarchical RL frameworks, such as Feudal RL by Dayan and Hinton (1993), can also be approached under the same option-critic architecture. However, we will see that this formulation leads to a so-called "bilevel" optimization problem. While this is a more difficult problem, the good news is that the literature on bilevel optimization is rich, and many of its tools have yet to be rediscovered by our community. Finally, I will show how "iterative differentiation" techniques (Griewank and Walther, 2008) can be applied to our problem while providing a new interpretation of the "inverse RL" approach of Rust (1988).
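To give a flavor of the "iterative differentiation" idea mentioned above: in a bilevel problem, the outer objective depends on the solution of an inner optimization, and one can differentiate through the unrolled inner iterations rather than through an implicit solution. The sketch below is not from the talk; it is a minimal forward-mode example on a toy quadratic bilevel problem (inner problem min_y (y - x)^2 solved by gradient descent, outer loss (y*(x) - 1)^2), with all names and constants chosen for illustration.

```python
def bilevel_grad(x, alpha=0.25, num_steps=50):
    """Toy iterative differentiation: unroll inner gradient descent on
    g(y, x) = (y - x)^2 and propagate dy/dx forward through the unroll,
    then chain-rule into the outer loss F(x) = (y*(x) - 1)^2."""
    y, dy_dx = 0.0, 0.0  # inner iterate and its sensitivity to x
    for _ in range(num_steps):
        # inner update: y <- y - alpha * dg/dy = y - alpha * 2 * (y - x)
        y = y - alpha * 2.0 * (y - x)
        # differentiate the update rule itself with respect to x
        dy_dx = (1.0 - 2.0 * alpha) * dy_dx + 2.0 * alpha
    # outer gradient via the chain rule: dF/dx = 2 * (y - 1) * dy/dx
    return y, 2.0 * (y - 1.0) * dy_dx

y_star, grad = bilevel_grad(3.0)
# Inner solution converges to y* = x = 3, so dF/dx approaches 2*(3-1)*1 = 4.
```

The same pattern scales to hierarchical RL when the "inner" procedure is a policy-optimization loop and the unrolled sensitivities are computed by automatic differentiation instead of by hand.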

For more information, please contact Pamela Albertson by email at [email protected].