
ReS-CIM: ReRAM-cached SRAM Compute-in-Memory Architecture with a Differential Sensing Scheme Enabling Intra-Macro Weight Loading
Description
SRAM-based compute-in-memory (SRAM-CIM) is a promising architecture for efficient and accurate AI computing. However, the low memory density of SRAM makes it impractical to store all weights of large neural networks, leading to on-chip and off-chip weight-loading overheads. Previous attempts to improve SRAM-CIM's memory density integrated multiple resistive RAM (ReRAM) cells into an SRAM cell as local weight storage. However, the current-based sensing scheme used in these approaches does not guarantee accurate weight loading, since it relies on the limited gain of the SRAM cell to latch data from the ReRAM. Weight-loading correctness deteriorates as the number of embedded ReRAM cells increases, impeding high-density SRAM-CIM. To address these issues, we propose ReS-CIM, a ReRAM-cached SRAM-CIM architecture employing a differential sensing scheme that provides highly scalable local ReRAM storage and robust weight loading. By amplifying the ReRAM resistance difference before the SRAM cell latches the data, the proposed sensing scheme guarantees accurate weight loading across varying ReRAM capacities, on/off ratios, and device variations. Additionally, the voltage-based differential sensing mechanism eliminates static current flow, achieving ultra-low energy consumption and short latency. To fully leverage ReS-CIM's high-bandwidth weight loading and energy efficiency, we introduce a CIM acceleration data flow. System-level simulations show that ReS-CIM achieves 91.7% energy savings and 97.7% latency reduction on AlexNet compared to state-of-the-art all-weights-on-chip AI accelerator architectures.
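To make the differential-sensing claim concrete, here is a minimal behavioral sketch (not the paper's circuit): it assumes a bit is stored as a complementary ReRAM pair (one low-resistance, one high-resistance cell), models device variation as a log-normal spread, and decides the bit by comparing the two read voltages, as a differential sense amplifier would before the SRAM latches. All parameter values (LRS resistance, on/off ratios, variation sigma) and function names are illustrative assumptions, not figures from the work.

```python
import random

def sense_bit(bit, r_lrs=10e3, on_off_ratio=10.0, sigma=0.15):
    """Behavioral model of differential weight loading (illustrative only).

    A stored bit is encoded as a complementary ReRAM pair: one cell in the
    low-resistance state (LRS) and the other in the high-resistance state
    (HRS = LRS * on_off_ratio). Log-normal factors model device variation.
    The differential comparison resolves whichever side reads the higher
    voltage (lower resistance) before the SRAM cell latches the bit.
    """
    r_hrs = r_lrs * on_off_ratio
    # Complementary encoding: bit=1 -> (LRS, HRS), bit=0 -> (HRS, LRS)
    r_true, r_comp = (r_lrs, r_hrs) if bit else (r_hrs, r_lrs)
    # Independent multiplicative device variation on each cell
    r_true *= random.lognormvariate(0.0, sigma)
    r_comp *= random.lognormvariate(0.0, sigma)
    # Voltage-divider read against a fixed reference resistor; the bit is
    # decided by the sign of the differential read voltage
    r_ref = (r_lrs * r_hrs) ** 0.5
    v_true = r_ref / (r_ref + r_true)
    v_comp = r_ref / (r_ref + r_comp)
    return 1 if v_true > v_comp else 0

def loading_accuracy(trials=100_000, **kwargs):
    """Monte Carlo estimate of weight-loading correctness."""
    correct = 0
    for _ in range(trials):
        bit = random.getrandbits(1)
        correct += (sense_bit(bit, **kwargs) == bit)
    return correct / trials

if __name__ == "__main__":
    # Sweep the on/off ratio to see how the decision margin grows
    for ratio in (2.0, 5.0, 10.0, 50.0):
        acc = loading_accuracy(on_off_ratio=ratio)
        print(f"on/off ratio {ratio:5.1f}: accuracy = {acc:.4f}")
```

In this toy model, errors occur only when device variation flips the ordering of the two read voltages, so accuracy stays near 1 even for modest on/off ratios; this mirrors, at a behavioral level, the abstract's point that amplifying the resistance difference before latching avoids relying on the SRAM cell's limited gain.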
Event Type
Work-in-Progress Poster
Time
Wednesday, June 26, 5:00pm - 6:00pm PDT
Location
Level 2 Lobby
Topics
AI
Autonomous Systems
Cloud
Design
EDA
Embedded Systems
IP
Security