Presentation

Compression with Attention: Learning in Lower Dimensions
Description
We propose a novel operation called Personal Self-Attention (PSA), designed specifically to learn non-linear 1-D functions faster than existing architectures. We show that by stacking these non-linear functions and combining them with linear transformations, we can match the accuracy of a larger model with a hidden dimension that is 2-6x smaller. Further, by quantizing the non-linear function, the PSA can be mapped to a simple lookup table, allowing a very efficient translation to FPGA hardware that attains 86% accuracy on CIFAR-10 at a throughput of 29k FPS.
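The abstract does not spell out the internals of the PSA operation, but the surrounding idea it describes (a learnable elementwise 1-D non-linearity interleaved with linear layers, later quantized into a lookup table for hardware) can be sketched as follows. This is a minimal, hypothetical illustration only: the piecewise-linear parameterization, class and function names, and all dimensions below are assumptions, not the authors' implementation.

```python
# Hypothetical sketch (not the authors' code): a learnable 1-D non-linearity
# interleaved with linear layers, then replaced by a quantized lookup table.
# The piecewise-linear parameterization is an assumption for illustration.
import numpy as np

class Learnable1DFunction:
    """A 1-D non-linear function parameterized by values at fixed knots.

    In a real model the knot values would be trained; here they are simply
    initialized to a smooth curve so the example runs without training.
    """
    def __init__(self, num_knots=64, x_min=-4.0, x_max=4.0):
        self.knots_x = np.linspace(x_min, x_max, num_knots)
        self.knots_y = np.tanh(self.knots_x)  # placeholder "learned" values

    def __call__(self, x):
        # Piecewise-linear interpolation between the knot values.
        return np.interp(x, self.knots_x, self.knots_y)

    def to_lookup_table(self, num_bins=256):
        """Quantize the input range into bins and tabulate one output per bin,
        so the non-linearity becomes a simple table read (FPGA-friendly)."""
        centers = np.linspace(self.knots_x[0], self.knots_x[-1], num_bins)
        return centers, self(centers)

def lut_apply(x, centers, table):
    """Evaluate the quantized non-linearity by mapping each input to a bin
    and reading the tabulated value (no interpolation)."""
    idx = np.clip(np.searchsorted(centers, x), 0, len(centers) - 1)
    return table[idx]

# Stacking: linear transform -> learnable elementwise non-linearity -> linear.
rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 32, 8, 10   # illustrative small hidden dimension
W1 = rng.standard_normal((d_in, d_hidden)) * 0.1
W2 = rng.standard_normal((d_hidden, d_out)) * 0.1
f = Learnable1DFunction()

x = rng.standard_normal((4, d_in))       # a batch of 4 inputs
y_exact = f(x @ W1) @ W2                 # exact non-linearity

centers, table = f.to_lookup_table(num_bins=256)
y_quant = lut_apply(x @ W1, centers, table) @ W2   # table-based non-linearity

print("max abs error from quantization:", np.abs(y_exact - y_quant).max())
```

The quantization error reported at the end shows the trade-off the abstract alludes to: with enough table bins, the lookup-table version tracks the continuous function closely while reducing the non-linearity to a single memory read per element.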
Event Type
Work-in-Progress Poster
Time
Wednesday, June 26, 5:00pm - 6:00pm PDT
Location
Level 2 Lobby
Topics
AI
Autonomous Systems
Cloud
Design
EDA
Embedded Systems
IP
Security