Friday, February 20, 2015
12:00 pm

Theory of Computing Seminar

Learning Arbitrary Statistical Mixtures of Discrete Distributions
Yuval Rabani, The Hebrew University of Jerusalem

Abstract:

We study the problem of learning from unlabeled samples very general
statistical mixture models on large finite sets. Specifically, we want to learn a model which is a
mixture of distributions $p$ on $\{1,2,\dots,n\}$. When we sample from the model, we do not observe a
specific $p$ from the mixture directly, but rather indirectly and in a very noisy fashion—we
observe $K$ independent samples of $\{1,2,\dots,n\}$ according to a distribution $p$. We give
efficient algorithms for learning this mixture model without making any restricting assumptions on the
structure of the mixture. We bound the quality of the solution as a function of the size of the
samples $K$ and the number of samples used. Our model and results have applications to a variety of
unsupervised learning scenarios, including learning topic models and collaborative filtering.

Most of the talk is based on a joint paper with Jian Li, Leonard J. Schulman, and Chaitanya Swamy.
We will also mention some earlier work with some of the above coauthors. 
Contact Thomas Vidick vidick@cms.caltech.edu
Add this event to my calendar