Presentation

A High-Throughput Private Inference Engine Based on 3D Stacked Memory
Description
Fully Homomorphic Encryption (FHE) supports unlimited computation depth, enabling privacy-enhanced neural network inference directly on ciphertexts. However, existing FHE architectures suffer from a memory access bottleneck caused by the large volume of data they consume. This work proposes H3, a high-throughput FHE engine for private inference (PI) based on 3D stacked memory. H3 adopts a software-hardware co-design that dynamically adjusts the polynomial decomposition during PI to minimize computation and storage overhead at a fine granularity. Using 3D hybrid bonding, H3 integrates a logic die with multi-layer embedded DRAM and routes data to the processing unit array through an efficient broadcast mechanism. Implemented in a 28nm logic process, H3 occupies 192mm$^2$ of area. H3 achieves a throughput of 1.36 million LeNet-5 or 920 ResNet-20 private inferences per minute, surpassing existing 7nm accelerators by 52%. This demonstrates that 3D memory is a promising technology for improving FHE performance.
Event Type
Research Manuscript
Time
Wednesday, June 26, 5:00pm - 5:15pm PDT
Location
3012, 3rd Floor
Topics
Security
Keywords
Hardware Security: Primitives, Architecture, Design & Test