BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
X-LIC-LOCATION:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240626T180034Z
LOCATION:3001\, 3rd Floor
DTSTART;TZID=America/Los_Angeles:20240625T134800
DTEND;TZID=America/Los_Angeles:20240625T140600
UID:dac_DAC 2024_sess101_RESEARCH1249@linklings.com
SUMMARY:Genetic Quantization-Aware Approximation for Non-Linear Operations
  in Transformers
DESCRIPTION:Research Manuscript\n\nPingcheng Dong, Yonghao Tan, and Dong Z
 hang (Hong Kong University of Science and Technology (HKUST)); Tianwei Ni 
 (Zhejiang University); Xuejiao Liu, Yu Liu, Peng Luo, and Luhong Liang (AI
  Chip Center for Emerging Smart System (ACCESS)); Shi-Yang Liu and Xijie H
 uang (Hong Kong University of Science and Technology (HKUST)); Huaiyu Zhu 
 and Yun Pan (Zhejiang University); Fengwei An (Southern University of Scie
 nce and Technology); and Kwang-Ting Cheng (Hong Kong University of Science
  and Technology (HKUST))\n\nNon-linear functions are prevalent in Transfor
 mers and their lightweight variants, incurring substantial and frequently 
 underestimated hardware costs. Previous state-of-the-art works optimize th
 ese operations by piece-wise linear approximation and store the parameters
  in look-up tables (LUT), but most of them require hardware-unfriendly h
 igh-precision arithmetic such as FP/INT 32 and lack consideration of in
 teger-only INT quantization. This paper proposes a genetic LUT-Approxim
 ation algorithm, namely GQA-LUT, that can automatically determine the p
 arameters with quantization awareness. The results demonstrate that GQA
 -LUT achieves negligible degradation on the challenging semantic segmen
 tation task for both vanilla and linear Transformer models. Besides, th
 e proposed GQA-LUT enables the employment of INT8-based LUT-Approximati
 on, achieving area savings of 81.3~81.7% and a power reduction of 79.3~
 80.2% compared to the high-precision FP/INT 32 alternatives.\n\nTopic: A
 I\n\nKeyword: AI/ML Algorithms\n\nSession Chairs: Sarada Krithivasan (IB
 M) and Igor Markov (Synopsys)
END:VEVENT
END:VCALENDAR
