EE Special Seminar
Abstract: Recent advances in machine learning algorithms, hardware, and datasets have led to the successful deployment of deep neural networks (DNNs) in various cloud-based services. Today, new applications are emerging where sensor bandwidth is higher than network bandwidth, and where network latency is not tolerable. By pushing DNN processing closer to the sensor, we can avoid throwing data away and improve the user experience. In the long term, it is foreseeable that such DNN processors will run on harvested energy, eliminating the cost and overhead of wired power connections and battery replacement. A significant challenge arises from the fact that DNNs are both memory- and compute-intensive, requiring millions of parameters and billions of arithmetic operations to perform a single inference. In this talk, I will present circuit and architecture techniques that leverage the noise tolerance and parallel structure of DNNs to bring inference systems closer to the energy-efficiency limits of CMOS technology.
In the low-SNR regime where DNNs operate, thermally limited analog signal processing circuits are more energy-efficient than digital ones. However, the massive scale of DNNs favors circuits compatible with dense digital memory. Mixed-signal processing allows us to integrate analog efficiency with digital scalability, but close attention must be paid to energy consumed at the analog-digital interface and in memory access. Binarized neural networks minimize this overhead, and hence operate closer to the analog energy limit. I will present a mixed-signal binary convolutional neural network processor implemented in 28 nm CMOS, featuring a weight-stationary, parallel-processing architecture that amortizes memory access across many computations, and a switched-capacitor neuron array that consumes an order of magnitude less energy than synthesized digital arithmetic at the same application-level accuracy. I will provide an apples-to-apples comparison of the mixed-signal, hand-designed digital, and synthesized digital implementations of this architecture. I will conclude with future research directions centered around the idea of training neural networks where the transfer function of a neuron models the behavior of an energy-efficient, physically realizable circuit.
Bio: Daniel Bankman is a PhD candidate in the Department of Electrical Engineering at Stanford University, advised by Prof. Boris Murmann. His research focuses on mixed-signal processing circuits, hardware architectures, and neural architectures capable of bringing machine learning closer to the energy limits of scaled semiconductor technology. During his PhD, he demonstrated that switched-capacitor circuits can significantly lower the energy consumption of binarized neural networks while preserving the same application-level accuracy as digital static CMOS arithmetic. Daniel received the S.B. degree in electrical engineering from MIT in 2012 and the M.S. degree from Stanford in 2015. He has held internship positions at Analog Devices Lyric Labs and Intel AI Research, and served as the instructor for EE315 Analog-Digital Interface Circuits at Stanford in 2018.