Presentation

CSTrans-OPU: An FPGA-based Overlay Processor with Full Compilation for Transformer Networks via Sparsity Exploration
Description
In this work, we propose CSTrans-OPU, an FPGA-based overlay processor with full compilation for transformer networks via sparsity exploration. Specifically, we customize a multi-precision processing element (PE) array with DSP-packing to provide a unified computation format with full resource utilization. Additionally, the introduced sorting and computation mode selection modules make it possible to exploit token sparsity. Moreover, equipped with a user-friendly compiler, CSTrans-OPU performs model parsing, operation fusion, model quantization, and instruction generation and reordering directly from model files. To the best of our knowledge, CSTrans-OPU is the first overlay processor for transformer networks that considers sparsity.
Event Type
Research Manuscript
Time
Tuesday, June 25, 1:30pm - 1:45pm PDT
Location
3003, 3rd Floor
Topics
AI
Design
Keywords
AI/ML Architecture Design