Tuesday, April 17, 2018
12:00 pm

IST LUNCH BUNCH

On Optimization in Deep Learning: Implicit Acceleration by Overparameterization
Nadav Cohen, Research Scholar at the School of Mathematics in the Institute for Advanced Study of Princeton, Princeton University

Deep learning refers to a class of statistical learning models that has enjoyed unprecedented success in recent years, powering state-of-the-art technologies in numerous application domains. However, despite the vast scientific and industrial interest it is drawing, our theoretical understanding of deep learning is partial at best. In this talk I will outline my perspective on the fundamental questions in deep learning theory, highlighting how they differ from those of classical machine learning. I will then focus on the question of optimization, where depth (non-convexity) is believed to be an impediment. In stark contrast with conventional wisdom, I will show that, sometimes, increasing depth can speed up optimization. This result was recently derived with Sanjeev Arora and Elad Hazan (arXiv preprint: https://arxiv.org/pdf/1802.06509.pdf).

Contact: Diane Goodfellow, diane@cms.caltech.edu, 626-797-2398