Presentation

Heterogeneous Vector Accelerator for Matrix Multiplications on FPGA

Session

Wednesday Work-in-Progress Posters

Description

Large matrix multiplications are crucial in transformers, especially in self-attention. We propose a heterogeneous vector systolic accelerator in which processing elements (PEs) have different vector lane widths, rather than a single lane width shared by all PEs. We partition the input matrices into sub-matrices that map efficiently onto the PEs, optimizing resource utilization and minimizing latency. We implement the design on an AMD-Xilinx ZCU104 FPGA. The heterogeneous architecture achieves 1.68x better throughput and latency than a homogeneous architecture, with 23% better resource utilization. When using heterogeneous vector tiles, we prefer tiles with larger lane widths for optimal throughput.
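
The partitioning idea can be illustrated with a small software sketch. The abstract does not specify the mapping scheme, so everything here is an assumption made for illustration: the lane widths (8, 4, 2), the greedy "widest lane first" policy, and the function names `partition_columns` and `matmul_tiled` are all hypothetical and are not the authors' implementation.

```python
def partition_columns(n_cols, lane_widths):
    """Split column indices 0..n_cols-1 into chunks, one chunk per tile pass.

    Hypothetical greedy policy: pick the widest lane that fits in the
    remaining columns; if none fits, use the narrowest lane and leave it
    partially filled. Returns a list of (lane_width, start, end) tuples.
    """
    if not lane_widths:
        raise ValueError("at least one lane width is required")
    widths = sorted(lane_widths, reverse=True)
    chunks = []
    start = 0
    while start < n_cols:
        remaining = n_cols - start
        # Widest lane that does not exceed the remaining columns,
        # otherwise fall back to the narrowest lane.
        width = next((w for w in widths if w <= remaining), widths[-1])
        end = min(start + width, n_cols)
        chunks.append((width, start, end))
        start = end
    return chunks


def matmul_tiled(a, b, lane_widths):
    """Compute a @ b one column chunk at a time (software model only)."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for _width, start, end in partition_columns(m, lane_widths):
        # Each chunk stands in for the sub-matrix handled by one tile.
        for i in range(n):
            for j in range(start, end):
                c[i][j] = sum(a[i][p] * b[p][j] for p in range(k))
    return c
```

For example, `partition_columns(11, [2, 4, 8])` assigns the first eight columns to the 8-wide lane and covers the remaining three columns with two passes on the 2-wide lane, the second of which is only half filled. This models only the column assignment, not the timing or resource usage of a systolic array on an FPGA.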

Authors

Jay Shah

International Institute of Information Technology, Bangalore

Nanditha Rao

International Institute of Information Technology, Bangalore

Event Type

Work-in-Progress Poster

Time

Wednesday, June 26, 5:00pm - 6:00pm PDT

Location

Level 2 Lobby

DAC 2024