RSRG/DOLCIT Seminar

Thursday, May 3, 2018

12:00pm to 1:00pm

Annenberg 213

Learning from logged bandit feedback

Adith Swaminathan, Researcher, Microsoft Research AI

Many of the most impactful applications of machine learning are not just about prediction, but are about putting learning systems in control of selecting the right action at the right time (e.g., search engines, recommender systems or automated trading platforms). These systems are both producers and users of data -- the logs of the selected actions and their outcomes (e.g., derived from clicks, ratings or revenue) can provide valuable training data for learning the next generation of the system, giving rise to some of the biggest datasets we have collected. Machine learning in these settings is challenging since the system in operation biases the log data through the actions it selects and outcomes remain unknown for the actions not taken. Learning methods must, hence, reason about how changes to the system will affect future outcomes. We will summarize recent advances in these counterfactual learning techniques, and demonstrate how deep neural networks can be trained in these settings (ICLR'18).

Joint work with Thorsten Joachims and Maarten de Rijke.

For more information, please contact Yanan Sui by email or visit Seminar & Events.

Event Series

Rigorous Systems Research Group (RSRG) Seminar Series

Event Sponsors

Computing and Mathematical Sciences (CMS)