IST LUNCH BUNCH
Deep learning refers to a class of statistical learning models that has enjoyed unprecedented success in recent years, powering state-of-the-art technologies in numerous application domains. However, despite the vast scientific and industrial interest it is drawing, our theoretical understanding of deep learning is partial at best. In this talk I will outline my perspective on the fundamental questions in deep learning theory, highlighting how they differ from those of classical machine learning. I will then focus on the question of optimization, where depth, and the non-convexity it induces, is believed to be an impediment. In stark contrast with this conventional wisdom, I will show that increasing depth can sometimes speed up optimization. This result was recently derived with Sanjeev Arora and Elad Hazan (arXiv preprint: https://arxiv.org/pdf/1802.06509.pdf).