Date: January 26, 2017
University of Wisconsin-Madison
Title: Dynamically Synthesized Accelerators: Achieving Hardware Efficiency and Software Programmability
Abstract: The waning benefits of device scaling have caused a push towards domain-specific accelerators (DSAs), which sacrifice programmability for efficiency. While providing huge benefits, DSAs are prone to obsolescence due to domain volatility, incur recurring design and verification costs, and have large area footprints when multiple DSAs are required in a single device. Because of the benefits of generality, this work explores how far a programmable architecture can be pushed, and whether it can come close to the performance, energy, and area efficiency of a DSA-based approach. It has been taken for granted that workload specialization is necessary, leaving software developers with no clean abstractions to target. In this talk we dispel this myth and show that efficiency and programmability can indeed be achieved together. First, we demonstrate that DSAs are more similar than dissimilar: all DSAs employ common specialization principles for concurrency, computation, communication, data reuse, and coordination, and these same principles can be exploited in a programmable architecture using a composition of known microarchitectural mechanisms. Second, we propose a universal accelerator fabric that can match DSAs. Our architecture, called LSSD, is composed of many tiny, low-power cores, each with a configurable spatial architecture, scratchpads, and DMA. Our results from modeling and a hardware/software implementation show that a programmable, specialized architecture can indeed be competitive with a domain-specific approach. Across four diverse domains spanning databases, deep learning, image processing, and neural networks, LSSD matches DSAs in performance and is typically only 2X worse in area and power. This shows that the benefit of domain specialization is only 2X in area and power. Given DSAs' lack of programmability and flexibility, we argue that LSSD and LSSD-like hardware is likely to dominate future chips.
Bio: Karu Sankaralingam is an associate professor in the Computer Sciences department at the University of Wisconsin-Madison. His research interests include architecture, open-source hardware, and software issues for massively parallel computation systems. His group has built the MIAOW open-source GPGPU and the DySER extension to OpenSPARC. He is a recipient of the IEEE TCCA Young Computer Architect Award in 2012, an NSF CAREER award in 2009, the Emil H. Steiger Distinguished Teaching Award in 2014, and the Letters and Science Philip R. Certain - Gary Sandefur Distinguished Faculty Award in 2013. He has authored multiple IEEE Micro Top Picks papers, best-paper-award winners, and multiple CACM Research Highlights papers and nominations, and holds 17 patents and patent applications. He earned a PhD from The University of Texas at Austin in December 2006.
These seminars are supported by the Ming Hsieh Institute.