Research Manuscript: It's Not 8b Retro-Gaming, It's State-Of-The-Art Architectures Using Quantization, Sparsity, and Compression!
Description
This session presents state-of-the-art work in DNN accelerator architecture design, focusing on optimization techniques such as quantization, sparsity, pruning, and compression. The session begins with a series of presentations on quantization, an increasingly popular and energy-efficient technique for deep neural networks (DNNs). It then covers other hot-topic techniques such as utilizing and optimizing sparsity and pruning, with a focus on the ever-popular transformer attention architecture.
Event Type
Research Manuscript
Time
Thursday, June 27, 10:30am - 12:00pm PDT
Location
3003, 3rd Floor
Topics
AI
Design
Keywords
AI/ML Architecture Design
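
The quantization theme of this session can be illustrated with a minimal sketch. The example below is not from any of the session's papers; it is a generic symmetric per-tensor int8 quantizer (the kind of "8b" scheme the session title alludes to), with all function names chosen here for illustration.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: map floats into [-127, 127]."""
    scale = np.max(np.abs(w)) / 127.0  # one scale factor for the whole tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover a float approximation of the original weights."""
    return q.astype(np.float32) * scale

# Quantize a small weight matrix and bound the reconstruction error.
w = np.array([[0.5, -1.2], [0.03, 0.9]], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = np.max(np.abs(w - w_hat))  # round-to-nearest error is at most scale/2
```

Storing `q` (int8) instead of `w` (float32) cuts weight memory 4x, which is one source of the energy savings the session description highlights; per-channel scales and quantization-aware training are common refinements.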