skip to main content
Caltech

Control Meets Learning Seminar

Wednesday, November 11, 2020
9:00am to 10:00am
Add to Cal
Online Event
A new approach to contextual bandits
Sasha Rakhlin, Associate Professor, Center for Statistics, IDSS / Department of Brain & Cognitive Sciences / Laboratory for Information & Decision Systems Center for Brains, Minds, and Machines, Massachusetts Institute of Technology,

A fundamental challenge in contextual bandits is to develop flexible, general-purpose algorithms with computational requirements no worse than classical supervised learning tasks. In this talk, we will describe a universal and optimal reduction from contextual bandits to online regression. We characterize the minimax rates for contextual bandits with general, potentially nonparametric function classes, and show that our algorithm is minimax optimal whenever the online regression method is optimal. We then turn to the case of iid data and present an adaptive method that attains fast instance-dependent rates, whenever certain disagreement-based notions of problem complexity are bounded. Time permitting, we extend these ideas to the setting of block-MDPs and provide oracle-efficient algorithms for reinforcement learning with rich observations that obtain optimal gap-dependent sample complexity.

Joint work with Dylan Foster, David Simchi-Levi, and Yunzong Xu

For more information, please contact Jolene Brink by phone at 6263952813 or by email at [email protected] or visit Control Meets Learning Website.