
PIVOT: Input-aware Path Selection for Energy-efficient ViT Inference
Description
Self-attention-based spatial correlation incurs high inference latency in vision transformers (ViTs). To this end, we propose PIVOT, a hardware-algorithm co-optimization framework that performs input-difficulty-aware attention skipping to alleviate the attention bottleneck. The attention-skipping configurations are obtained via an iterative hardware-in-the-loop co-search method. On the ZCU102 MPSoC FPGA, PIVOT achieves 2.7× (1.73×) lower EDP with only 0.2% (0.4%) accuracy reduction compared to the standard LVViT-S (DeiT-S) ViTs. Unlike prior works that require specialized hardware support, PIVOT is compatible with traditional GPU and CPU platforms, delivering 1.8× higher throughput at 0.4-1.3% higher accuracy than prior works.
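To illustrate the core idea of input-difficulty-aware attention skipping, here is a minimal, self-contained sketch. The difficulty proxy (token-feature variance), the threshold value, and all function names are illustrative assumptions, not PIVOT's actual method; the paper derives its skipping configurations via hardware-in-the-loop co-search.

```python
import math
from statistics import pvariance

def difficulty_score(tokens):
    # Hypothetical proxy for input difficulty: variance of token features.
    # (PIVOT's real difficulty metric and co-searched thresholds differ.)
    return pvariance([v for tok in tokens for v in tok])

def self_attention(tokens):
    # Minimal single-head self-attention with identity Q/K/V projections.
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        out.append([sum(w * tok[i] for w, tok in zip(weights, tokens))
                    for i in range(d)])
    return out

def block(tokens, skip_threshold=0.5):
    # Easy (low-difficulty) inputs bypass the attention sub-layer entirely,
    # saving its compute; hard inputs take the full residual attention path.
    if difficulty_score(tokens) < skip_threshold:
        return tokens  # attention skipped: cheap path
    attn = self_attention(tokens)
    return [[x + a for x, a in zip(tok, att)]
            for tok, att in zip(tokens, attn)]
```

Because the skip decision is made per input, average energy and latency drop on easy inputs while hard inputs retain full attention accuracy, which is the trade-off the EDP numbers above quantify.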
Event Type
Research Manuscript
Time
Wednesday, June 26, 1:30pm - 1:45pm PDT
Location
3003, 3rd Floor
Topics
AI
Design
Keywords
AI/ML System and Platform Design