A DRAM-based PIM Architecture for Accelerated and Energy-Efficient Execution of Transformers
Description
Transformer-based language models have demonstrated tremendous accuracy in multiple natural language processing (NLP) tasks. Transformers use self-attention, in which matrix multiplication is the dominant computation. Moreover, their large size makes data movement a latency and energy-efficiency bottleneck in conventional von Neumann systems. Processing-in-memory (PIM) architectures, which place compute elements in the memory, have been proposed to address this bottleneck. This paper presents PACT-3D, a PIM architecture with novel computing units interfaced with DRAM banks that perform the required computations, achieving a 1.7X reduction in latency and an 18.7X improvement in energy efficiency over the state-of-the-art PIM architecture.
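To illustrate why matrix multiplication dominates, here is a minimal NumPy sketch of scaled dot-product self-attention (a generic textbook formulation, not the paper's PACT-3D implementation). The three projection matmuls plus the Q·Kᵀ and weights·V products account for essentially all of the arithmetic, and for large models the operands must stream from DRAM, which is the data-movement bottleneck PIM targets.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """x: (seq_len, d_model); wq/wk/wv: (d_model, d_k) projection weights."""
    q, k, v = x @ wq, x @ wk, x @ wv            # three projection matmuls
    scores = q @ k.T / np.sqrt(k.shape[-1])     # matmul: (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v                          # matmul: (seq_len, d_k)

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 8, 16, 4                # toy sizes for illustration
x = rng.standard_normal((seq_len, d_model))
wq, wk, wv = (rng.standard_normal((d_model, d_k)) for _ in range(3))
out = self_attention(x, wq, wk, wv)
print(out.shape)  # (8, 4)
```

In a real model, seq_len and d_model are orders of magnitude larger, so the weight and activation matrices overflow on-chip caches and each matmul becomes bound by DRAM traffic rather than compute.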
Event Type
Work-in-Progress Poster
Time
Wednesday, June 26, 5:00pm - 6:00pm PDT
Location
Level 2 Lobby
Topics
AI
Autonomous Systems
Cloud
Design
EDA
Embedded Systems
IP
Security