BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
X-LIC-LOCATION:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240626T180033Z
LOCATION:3002\, 3rd Floor
DTSTART;TZID=America/Los_Angeles:20240625T103000
DTEND;TZID=America/Los_Angeles:20240625T104500
UID:dac_DAC 2024_sess124_RESEARCH558@linklings.com
SUMMARY:InterArch: Video Transformer Acceleration via Inter-Feature Dedupl
 ication with Cube-based Dataflow
DESCRIPTION:Research Manuscript\n\nXuhang Wang, Zhuoran Song, and Xiaoyao 
 Liang (Shanghai Jiao Tong University)\n\nIn the realm of video-oriented ta
 sks, Video Transformer models (VidT), an evolution from vision Transformer
 s (ViT), have demonstrated considerable success. However, their widespread
  application is constrained by substantial computational demands and high 
 energy consumption. Addressing these limitations and thus improving VidT e
 fficiency has become a hot topic. Current methodologies solve this challen
 ge by dividing a video into several features and applying intra-feature sp
 arsity. However, they neglect the crucial point of inter-feature redundanc
 y and often entail prolonged latency in fine-tuning phases. In response, t
 his paper introduces InterArch, a tailored framework designed to significa
 ntly enhance VidT efficiency. We first design a novel inter-feature sparsi
 ty algorithm consisting of hierarchical deduplication and recovery. The de
 duplication phase capitalizes on temporal similarities at both block and e
 lement levels, enabling the elimination of redundant computations across f
 eatures in both coarse-grained and fine-grained manners. To prevent long-l
 atency fine-tuning, we employ a lightweight recovery mechanism that constr
 ucts approximate features for the sparsified data. Furthermore, InterArch 
 incorporates a regular dataflow strategy, which consolidates sparse featur
 es and effectively translates sparse computations into dense ones. Complem
 enting this, we develop a spatial array architecture equipped with augment
 ed processing elements (PEs), specifically optimized for our proposed data
 flow. Extensive experimental results demonstrate that InterArch can achi
 eve satisfactory performance speedups and energy savings.\n\nTopic: Desi
 gn\n\nKeyword: AI/ML System and Platform Design\n\nSession Chairs: Liu K
 e (Tenstorrent) and Ramtin Zand (University of South Carolina)
END:VEVENT
END:VCALENDAR
