A DRAM-based PIM Architecture for Accelerated and Energy-Efficient Execution of Transformers
Description
Transformer-based language models have demonstrated tremendous accuracy in multiple natural language processing (NLP) tasks. Transformers use self-attention, in which matrix multiplication is the dominant computation. Moreover, their large size makes data movement a latency and energy-efficiency bottleneck in conventional von Neumann systems. Processing-in-memory (PIM) architectures, which place compute elements in the memory, have been proposed to address this bottleneck. This paper presents PACT-3D, a PIM architecture with novel computing units interfaced with DRAM banks that perform the required computations, achieving a 1.7X reduction in latency and an 18.7X improvement in energy efficiency over the state-of-the-art PIM architecture.
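To illustrate why matrix multiplication dominates, here is a minimal NumPy sketch of scaled dot-product self-attention (a generic textbook formulation, not the paper's PACT-3D implementation). The three projection matmuls plus the Q·Kᵀ and weights·V products account for essentially all of the arithmetic, and for large models the operands must stream from DRAM, which is the data-movement bottleneck PIM targets.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """x: (seq_len, d_model); wq/wk/wv: (d_model, d_k) projection weights."""
    q, k, v = x @ wq, x @ wk, x @ wv            # three projection matmuls
    scores = q @ k.T / np.sqrt(k.shape[-1])     # matmul: (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v                          # matmul: (seq_len, d_k)

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 8, 16, 4                # toy sizes for illustration
x = rng.standard_normal((seq_len, d_model))
wq, wk, wv = (rng.standard_normal((d_model, d_k)) for _ in range(3))
out = self_attention(x, wq, wk, wv)
print(out.shape)  # (8, 4)
```

In a real model, seq_len and d_model are orders of magnitude larger, so the weight and activation matrices overflow on-chip caches and each matmul becomes bound by DRAM traffic rather than compute.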
Event Type
Work-in-Progress Poster
Time
Wednesday, June 26, 5:00pm - 6:00pm PDT
Location
Level 2 Lobby
Topics
AI
Autonomous Systems
Cloud
Design
EDA
Embedded Systems
IP
Security