H.B. Keller Colloquium

Monday, April 29, 2019

4:00pm to 5:00pm

Annenberg 105

Data Perturbation for Data Science

Richard Samworth, Professor of Statistical Science and Director of the Statistical Laboratory, University of Cambridge & Teaching Fellow at Saint John's College

Abstract: When faced with a dataset and a statistical problem of interest, should we propose a statistical model and use that to inform an appropriate algorithm, or dream up a potential algorithm and then seek to justify it? The former is the more traditional statistical approach, but the latter appears to be becoming more popular. I will present an example of a 20th-century analysis that falls into the first category, and explain why it may not be as suitable for modern statistical challenges. I'll then discuss a class of algorithms that belong in the second category, namely those that involve data perturbation (e.g. subsampling, random projections, artificial noise, knockoffs, ...).
As an illustration, I will consider Complementary Pairs Stability Selection for variable selection.

Event Series

H. B. Keller Colloquium Series

Event Sponsors

Computing and Mathematical Sciences (CMS)