Computation and Neural Systems Seminar
Theoretical and empirical evidence suggest that the perceptual world is best represented by a multi-stage hierarchy in which features in successive stages are increasingly global, invariant, and abstract. An important challenge is to devise "deep learning" methods for multi-stage architectures that can automatically learn invariant feature hierarchies from labeled and unlabeled data.
We will describe a number of unsupervised methods for learning invariant features that are based on sparse coding and sparse auto-encoders: convolutional sparse auto-encoders, invariance through group sparsity, invariance through lateral inhibition, and invariance through temporal constancy. The methods are used to pre-train convolutional networks (ConvNets). ConvNets are biologically inspired architectures consisting of multiple stages of filter banks interspersed with non-linear operations and spatial pooling operations.
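As a rough illustration of the stage structure described above (filter bank, then non-linearity, then spatial pooling), the following is a minimal pure-Python sketch. It is not the speaker's implementation. The function names, the choice of half-wave rectification as the non-linearity, and the use of non-overlapping max pooling are all assumptions made for illustration, and the filters here are hand-written rather than learned.

```python
def correlate2d_valid(image, kernel):
    """Slide one filter over the image (no padding) and return its response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def rectify(feature_map):
    """Point-wise non-linearity (assumed here: half-wave rectification)."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size):
    """Non-overlapping spatial max pooling over size x size windows."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def convnet_stage(image, filter_bank, pool_size=2):
    """One stage: apply every filter, rectify, then pool. Returns one map per filter."""
    return [
        max_pool(rectify(correlate2d_valid(image, kernel)), pool_size)
        for kernel in filter_bank
    ]


# Hypothetical example: a 5x5 image with a vertical edge and two opposite edge filters.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
filter_bank = [
    [[-1, 1], [-1, 1]],   # responds to dark-to-bright transitions
    [[1, -1], [1, -1]],   # responds to bright-to-dark transitions
]
feature_maps = convnet_stage(image, filter_bank)
```

Stacking several such stages, each taking the previous stage's feature maps as input, gives the multi-stage hierarchy; in the setting of the talk the filters would be learned, for example by unsupervised pre-training, rather than fixed by hand.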
A number of applications will be shown through videos and live demos, including a pedestrian detector, a category-level object recognition system that can be trained on the fly, and a system that can label every pixel in an image with the category of the object it belongs to (scene parsing). Specialized hardware architectures that run these systems in real time will also be described.