Caltech

IST LUNCH BUNCH

Tuesday, April 17, 2018
12:00pm to 1:00pm
Annenberg 105
On Optimization in Deep Learning: Implicit Acceleration by Overparameterization
Nadav Cohen, Research Scholar at the School of Mathematics in the Institute for Advanced Study of Princeton, Princeton University

Deep learning refers to a class of statistical learning models that has enjoyed unprecedented success in recent years, powering state-of-the-art technologies in numerous application domains. However, despite the vast scientific and industrial interest it is drawing, our theoretical understanding of deep learning is partial at best. In this talk I will outline my perspective on the fundamental questions in deep learning theory, highlighting the differences from classical machine learning. I will then focus on the question of optimization, where depth (and the resulting non-convexity) is believed to be an impediment. In stark contrast with conventional wisdom, I will show that increasing depth can sometimes speed up optimization. This result was recently derived with Sanjeev Arora and Elad Hazan (arXiv preprint: https://arxiv.org/pdf/1802.06509.pdf).

For more information, please contact Diane Goodfellow by phone at (626) 797-2398 or by email at [email protected].