
InterArch: Video Transformer Acceleration via Inter-Feature Deduplication with Cube-based Dataflow
Description
In the realm of video-oriented tasks, Video Transformer models (VidT), an evolution of vision Transformers (ViT), have demonstrated considerable success. However, their widespread application is constrained by substantial computational demands and high energy consumption. Addressing these limitations to improve VidT efficiency has therefore become a pressing research topic. Current methodologies tackle this challenge by dividing a video into several features and applying intra-feature sparsity. However, they neglect the crucial point of inter-feature redundancy and often entail prolonged fine-tuning latency. In response, this paper introduces InterArch, a tailored framework designed to significantly enhance VidT efficiency. We first design a novel inter-feature sparsity algorithm consisting of hierarchical deduplication and recovery. The deduplication phase capitalizes on temporal similarities at both the block and element levels, eliminating redundant computations across features in both coarse-grained and fine-grained manners. To avoid long-latency fine-tuning, we employ a lightweight recovery mechanism that constructs approximate features for the sparsified data. Furthermore, InterArch incorporates a regular dataflow strategy that consolidates sparse features and effectively translates sparse computations into dense ones. Complementing this, we develop a spatial array architecture equipped with augmented processing elements (PEs), specifically optimized for our proposed dataflow. Extensive experimental results demonstrate that InterArch achieves substantial speedups and energy savings.
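The coarse-grained, block-level side of the deduplication idea can be illustrated with a small sketch. This is not the paper's algorithm, only a hypothetical reconstruction of the general technique: per-frame features are split into token blocks, each block is compared against the corresponding block of the previous frame, and blocks that are sufficiently similar reuse the cached output instead of recomputing the expensive transform. The names (`dedup_forward`, `layer_fn`) and the cosine-similarity criterion are illustrative assumptions.

```python
import numpy as np

def dedup_forward(frames, layer_fn, threshold=0.95, block=4):
    """Illustrative block-level inter-feature deduplication (sketch).

    frames:   list of (tokens, dim) feature arrays, one per video frame.
    layer_fn: the expensive transform applied to each token block.
    A block whose cosine similarity to the same block in the previous
    frame exceeds `threshold` reuses the cached output (a stand-in for
    the paper's coarse-grained deduplication across features).
    """
    outputs, prev_in, prev_out = [], None, None
    for x in frames:
        out = np.empty_like(x)
        for s in range(0, x.shape[0], block):
            blk = x[s:s + block]
            if prev_in is not None:
                ref = prev_in[s:s + block]
                # Cosine similarity between this block and its
                # temporal predecessor.
                sim = np.dot(blk.ravel(), ref.ravel()) / (
                    np.linalg.norm(blk) * np.linalg.norm(ref) + 1e-8)
                if sim > threshold:
                    # Temporally redundant: reuse the cached result.
                    out[s:s + block] = prev_out[s:s + block]
                    continue
            # Novel block: pay for the computation.
            out[s:s + block] = layer_fn(blk)
        outputs.append(out)
        prev_in, prev_out = x, out
    return outputs
```

In this sketch only blocks that change between frames reach `layer_fn`, mirroring how inter-feature redundancy lets the accelerator skip work; the paper additionally applies a finer element-level pass and a recovery step for the skipped data, which are omitted here.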
Event Type
Research Manuscript
Time
Tuesday, June 25, 10:30am - 10:45am PDT
Location
3002, 3rd Floor
Topics
Design
Keywords
AI/ML System and Platform Design