
Deputy NoC: A Case of Low Cost Network-on-Chip for Neural Network Accelerations on GPUs
Description
With the rapid advance of Deep Neural Networks (DNNs), the GPU's role as a hardware accelerator has become increasingly important. Because of the GPU's significant power consumption, building high-performance, power-efficient GPU systems is a critical challenge. DNN applications move large amounts of data between memory and the processing cores, which consumes a great deal of power in the on-chip network. Prior data compression techniques have been proposed for networks-on-chip to reduce the size of the data being moved and thus save power. However, these techniques are usually lossless because they target general-purpose applications that are not resilient to errors. DNN applications, on the contrary, are well known to be error-resilient, which makes them good candidates for lossy compression.
In this work, we propose an NoC architecture that reduces power consumption without compromising performance or accuracy. Our technique exploits the error resilience of DNNs as well as the data locality in the exponent fields of DNNs' floating-point data. Each data packet is reorganized by grouping data with similar exponents, and redundant exponents are sent only once. We further compress the mantissa fields by appropriately selecting deputy values for data sharing the same exponent. Our evaluation results show that the proposed technique effectively reduces data transmission and achieves a better performance-power tradeoff without losing accuracy.
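The exponent-grouping idea above can be illustrated with a minimal sketch. The code below is not the paper's encoding; it simply shows the general principle: split IEEE-754 float32 values into sign, exponent, and mantissa fields, group values by exponent so each distinct exponent need only be transmitted once, and (as a stand-in for the paper's deputy-value scheme) lossily truncate the low mantissa bits within each group. The function names and the truncation width are illustrative assumptions.

```python
import struct
from collections import defaultdict

def group_by_exponent(values, keep_mantissa_bits=12):
    """Illustrative sketch (not the paper's exact packet format):
    group float32 values by their shared 8-bit exponent and lossily
    truncate each 23-bit mantissa to its top `keep_mantissa_bits` bits."""
    groups = defaultdict(list)
    for v in values:
        # Reinterpret the float32 as its raw 32-bit pattern.
        bits = struct.unpack('<I', struct.pack('<f', v))[0]
        sign = bits >> 31
        exponent = (bits >> 23) & 0xFF          # sent once per group
        mantissa = bits & 0x7FFFFF
        truncated = mantissa >> (23 - keep_mantissa_bits)  # lossy step
        groups[exponent].append((sign, truncated))
    return dict(groups)

# Example: 1.5 and 1.75 share exponent 127; 3.0 has exponent 128,
# so only two exponent fields would need to be transmitted.
packet = group_by_exponent([1.5, 1.75, 3.0])
```

Within each group, one exponent plus a list of shortened mantissas replaces full 32-bit words, which is where the transmission savings come from; DNNs' tolerance to small numerical error is what makes the lossy truncation acceptable.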
Event Type
Work-in-Progress Poster
Time
Tuesday, June 25, 6:00pm - 7:00pm PDT
Location
Level 2 Lobby
Topics
AI
Autonomous Systems
Cloud
Design
EDA
Embedded Systems
IP
Security