Prof. Kurt Keutzer’s presentation at the 2018 CVPR Workshop on Efficient Deep Learning for Computer Vision
1. Co-design of DNNs and accelerators requires detailed layer-by-layer analysis of contemporary DNNs
2. Data reuse is the key to efficient DNN computation
3. Different CNN/DNN layers have different patterns of data reuse
4. Different NN accelerator architectures favor different types of reuse (output reuse vs. weight reuse)
5. We present an NN accelerator design, the Squeezelerator, that supports both Weight-Stationary (WS) and Output-Stationary (OS) dataflows and matches the data reuse in the accelerator to the characteristics of each layer
6. Careful attention to the layer-by-layer speed and energy of mobile DNNs yielded up to a 6x speedup over other accelerator architectures
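
To make points 2 through 4 concrete, here is a minimal Python sketch that counts, for a few layer types, how many times each weight and each output element takes part in a multiply-accumulate (MAC). It is an illustration only and is not taken from the talk: the `ConvLayer` class, the layer shapes, and the batch size of 1 are all assumptions. It also does not model the Squeezelerator or any real accelerator, so it shows only that reuse patterns differ between layers, not which dataflow a given design should choose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvLayer:
    """Hypothetical layer description (batch size 1 assumed)."""
    name: str
    c_in: int       # input channels
    c_out: int      # output channels
    kernel: int     # square kernel size (1 for pointwise / fully connected)
    h_out: int      # output height
    w_out: int      # output width
    groups: int = 1  # groups == c_in gives a depthwise convolution

    @property
    def macs(self) -> int:
        # Each output element needs (c_in / groups) * kernel^2 MACs.
        per_output = (self.c_in // self.groups) * self.kernel ** 2
        return self.h_out * self.w_out * self.c_out * per_output

    @property
    def num_weights(self) -> int:
        return self.c_out * (self.c_in // self.groups) * self.kernel ** 2

    @property
    def num_outputs(self) -> int:
        return self.h_out * self.w_out * self.c_out

    @property
    def weight_reuse(self) -> int:
        # MACs per weight: equals the number of output positions.
        return self.macs // self.num_weights

    @property
    def output_reuse(self) -> int:
        # MACs accumulated into each output element.
        return self.macs // self.num_outputs


# Illustrative shapes only; they do not come from any specific network.
LAYERS = [
    ConvLayer("pointwise 1x1", c_in=64, c_out=128, kernel=1, h_out=28, w_out=28),
    ConvLayer("depthwise 3x3", c_in=64, c_out=64, kernel=3, h_out=28, w_out=28, groups=64),
    ConvLayer("fully connected", c_in=1024, c_out=1000, kernel=1, h_out=1, w_out=1),
]

if __name__ == "__main__":
    for layer in LAYERS:
        print(f"{layer.name:16s} MACs={layer.macs:>9d} "
              f"weight reuse={layer.weight_reuse:>4d} "
              f"output reuse={layer.output_reuse:>4d}")
```

Under these assumed shapes, the fully connected layer uses each weight once but accumulates 1024 products into every output, while the depthwise layer uses each weight 784 times but accumulates only 9 products per output. How such differences map onto a WS or OS dataflow depends on the processing-element array and memory hierarchy, which this sketch deliberately leaves out.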
