BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
X-LIC-LOCATION:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240626T180034Z
LOCATION:3001\, 3rd Floor
DTSTART;TZID=America/Los_Angeles:20240626T164500
DTEND;TZID=America/Los_Angeles:20240626T170000
UID:dac_DAC 2024_sess161_RESEARCH1403@linklings.com
SUMMARY:SARIS: Accelerating Stencil Computations on Energy-Efficient RISC-
 V Compute Clusters with Indirect Stream Registers
DESCRIPTION:Research Manuscript\n\nPaul Scheffler and Luca Colagrande (ETH
  Zürich) and Luca Benini (Università di Bologna)\n\nStencil codes are perf
 ormance-critical in many compute-intensive applications, but suffer from s
 ignificant address calculation and irregular memory access overheads. This
  work presents SARIS, a general and highly flexible methodology for stenci
 l acceleration using register-mapped indirect streams. We demonstrate SARI
 S for various stencil codes on an eight-core RISC-V compute cluster with i
 ndirect stream registers, achieving significant speedups of 2.72x, near-id
 eal FPU utilizations of 81%, and energy efficiency improvements of 1.58x o
 ver an RV32G baseline on average. Scaling out to a 256-core manycore syste
 m, we estimate an average FPU utilization of 64%, an average speedup of 2.
 14x, and up to 15% higher fractions of peak compute than a leading GPU cod
 e generator.\n\nTopic: Embedded Systems\n\nKeyword: Embedded Software
END:VEVENT
END:VCALENDAR
