IST LUNCH BUNCH
In this talk, I will present the Dueling Bandits Problem, an online learning framework tailored towards real-time learning from subjective human feedback. In particular, the Dueling Bandits Problem requires only pairwise comparisons, which have been shown to be reliably inferred in a variety of subjective feedback settings, such as information retrieval and recommender systems. I will provide an overview of the Dueling Bandits Problem along with basic algorithmic results. I will then conclude by discussing some ongoing research directions with applications to personalized medicine. This is joint work with Josef Broder, Bobby Kleinberg, Thorsten Joachims, Yanan Sui, Vincent Zhuang, and Joel Burdick.