
PIVOT: Input-aware Path Selection for Energy-efficient ViT Inference
Description
Self-attention-based spatial correlation incurs high inference latency in vision transformers (ViTs). To this end, we propose PIVOT, a hardware-algorithm co-optimization framework that performs input-difficulty-aware attention skipping to alleviate the attention bottleneck. The attention-skipping configurations are obtained via an iterative hardware-in-the-loop co-search method. On the ZCU102 MPSoC FPGA, PIVOT achieves 2.7× (1.73×) lower EDP with only 0.2% (0.4%) accuracy reduction compared to the standard LVViT-S (DeiT-S) ViTs. Unlike prior works that require specialized hardware support, PIVOT is compatible with traditional GPU and CPU platforms, delivering 1.8× higher throughput at 0.4-1.3% higher accuracy than prior works.
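To illustrate the core idea of input-difficulty-aware attention skipping, here is a minimal, self-contained sketch. The difficulty proxy (token-feature variance), the threshold value, and all function names are illustrative assumptions, not PIVOT's actual method; the paper derives its skipping configurations via hardware-in-the-loop co-search.

```python
import math
from statistics import pvariance

def difficulty_score(tokens):
    # Hypothetical proxy for input difficulty: variance of token features.
    # (PIVOT's real difficulty metric and co-searched thresholds differ.)
    return pvariance([v for tok in tokens for v in tok])

def self_attention(tokens):
    # Minimal single-head self-attention with identity Q/K/V projections.
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        out.append([sum(w * tok[i] for w, tok in zip(weights, tokens))
                    for i in range(d)])
    return out

def block(tokens, skip_threshold=0.5):
    # Easy (low-difficulty) inputs bypass the attention sub-layer entirely,
    # saving its compute; hard inputs take the full residual attention path.
    if difficulty_score(tokens) < skip_threshold:
        return tokens  # attention skipped: cheap path
    attn = self_attention(tokens)
    return [[x + a for x, a in zip(tok, att)]
            for tok, att in zip(tokens, attn)]
```

Because the skip decision is made per input, average energy and latency drop on easy inputs while hard inputs retain full attention accuracy, which is the trade-off the EDP numbers above quantify.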
Event Type
Research Manuscript
Time
Wednesday, June 26, 1:30pm - 1:45pm PDT
Location
3003, 3rd Floor
Topics
AI
Design
Keywords
AI/ML System and Platform Design