Presentation
Towards Efficient SRAM-PIM Architecture Design by Exploiting Unstructured Bit-Level Sparsity
Description
Bit-level sparsity in neural network models harbors immense untapped potential.
Eliminating redundant calculations of randomly distributed zero-bits significantly boosts computational efficiency.
Yet traditional digital SRAM-PIM architectures, constrained by their rigid crossbar structure, struggle to exploit this unstructured sparsity effectively.
To address this challenge, we propose Dyadic Block PIM (DB-PIM), a novel algorithm-architecture co-design framework.
It preserves the random distribution of non-zero bits to maintain accuracy while restricting the number of non-zero bits in each weight of the filter to improve regularity.
DB-PIM improves both performance and energy efficiency, achieving a speedup of up to 6.53x and energy savings of 77.50%.
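As a rough illustration of capping the number of non-zero bits per weight, the sketch below keeps only the most significant set bits of an integer weight. This is an assumption made for illustration, not the DB-PIM algorithm itself: the function names `limit_nonzero_bits` and `count_nonzero_bits`, the sign-magnitude treatment, and the "keep the top bits" rule are all hypothetical, and the actual framework may select or encode bits differently.

```python
def count_nonzero_bits(weight: int) -> int:
    """Number of set bits in the magnitude of an integer weight."""
    return bin(abs(weight)).count("1")


def limit_nonzero_bits(weight: int, max_bits: int) -> int:
    """Return a weight with at most `max_bits` non-zero bits.

    Hypothetical rule: scan the magnitude from the most significant bit
    down and keep the first `max_bits` set bits, wherever they fall.
    The kept bits are not forced into fixed positions, so their
    distribution stays irregular while their count is bounded.
    """
    sign = -1 if weight < 0 else 1
    magnitude = abs(weight)
    kept = 0
    remaining = max_bits
    for pos in range(magnitude.bit_length() - 1, -1, -1):
        if remaining == 0:
            break
        if magnitude & (1 << pos):
            kept |= 1 << pos
            remaining -= 1
    return sign * kept


if __name__ == "__main__":
    weights = [109, -109, 5, 0]
    for w in weights:
        q = limit_nonzero_bits(w, 2)
        print(w, "->", q, "non-zero bits:", count_nonzero_bits(q))
```

With a fixed cap on non-zero bits, every weight needs the same bounded number of bit-level operations, which is the kind of regularity a crossbar-style array can exploit.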
Event Type
Research Manuscript
Time
Wednesday, June 26, 4:55pm - 5:12pm PDT
Location
3004, 3rd Floor
Design
In-memory and Near-memory Computing Architectures, Applications and Systems