BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
X-LIC-LOCATION:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240626T180033Z
LOCATION:3001\, 3rd Floor
DTSTART;TZID=America/Los_Angeles:20240625T111500
DTEND;TZID=America/Los_Angeles:20240625T113000
UID:dac_DAC 2024_sess103_RESEARCH082@linklings.com
SUMMARY:EPIM: Efficient Processing-In-Memory Accelerators based on Epitome
DESCRIPTION:Research Manuscript\n\nChenyu Wang (Princeton University); Zhe
 n Dong (University of California, Berkeley); Daquan Zhou (Bytedance Inc.);
  Zhenhua Zhu and Yu Wang (Tsinghua University); Jiashi Feng (Bytedance Inc
 .); and Kurt Keutzer (University of California, Berkeley)\n\nThe explorati
 on of Processing-In-Memory (PIM) accelerators has garnered significant att
 ention within the research community. However, deploying large-scale neu
 ral networks on PIM accelerators is challenging due to constrained on-ch
 ip memory capacity. To tackle this issu
 e, current works explore model compression algorithms to reduce the size o
 f Convolutional Neural Networks (CNNs). Most of these algorithms either ai
 m to represent neural operators with reduced-size parameters (e.g., quanti
 zation) or search for the best combinations of neural operators (e.g., neu
 ral architecture search). Designing neural operators to align with PIM acc
 elerators' specifications is an area that warrants further study. In this 
 paper, we introduce the Epitome, a lightweight neural operator offering co
 nvolution-like functionality, to craft memory-efficient CNN operators for 
 PIM accelerators (EPIM). On the software side, we evaluate epitomes' laten
 cy and energy on PIM accelerators and introduce a PIM-aware layer-wise des
 ign method to enhance their hardware efficiency. We apply epitome-aware q
 uantization to further reduce the size of epitomes. On the hardware side, 
 we modify the datapath of current PIM accelerators to accommodate epitomes
  and implement a feature map reuse technique to reduce computation cost. E
 xperimental results reveal that our 3-bit quantized EPIM-ResNet50 attains 
 71.59% top-1 accuracy on ImageNet, reducing crossbar areas by 30.65×. EPIM
  surpasses the state-of-the-art pruning methods on PIM.\n\nTopic: AI\n\nKey
 word: AI/ML Algorithms\n\nSession Chairs: Hongyang Jia (Tsinghua Universit
 y) and Grace Li Zhang (Technische Universität Darmstadt)
END:VEVENT
END:VCALENDAR
