
Adaptive Neurosurgeon: DNN Computing Latency Minimization for Mobile Edge Intelligence
Description
The pervasive integration of deep neural networks (DNNs) within smart devices has significantly increased computational workloads, consequently intensifying pressure on real-time performance and device power consumption. Offloading segments of DNNs to the edge has emerged as an effective strategy for reducing latency and device power usage. Nonetheless, determining the workload to offload presents a complex challenge, particularly in the face of fluctuating device workloads and varying wireless signal strengths. This paper introduces a streamlined approach aimed at swiftly and accurately forecasting the computing latency of a DNN. Building upon this, an adaptive neurosurgeon framework is proposed to dynamically select the optimal partition point of a DNN during runtime, effectively minimizing computing latency. Through experimental validation, our proposed adaptive neurosurgeon demonstrates superior performance in reducing computing latency amidst changing DNN workloads across devices and varying wireless communication capabilities, outperforming existing state-of-the-art approaches, such as the autodidactic neurosurgeon.
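The partition-point idea in the abstract can be illustrated with a minimal sketch: given per-layer latency predictions for the device and the edge, plus the size of each layer's output activation, pick the cut that minimizes on-device compute + uplink transfer + edge compute. All function names, inputs, and numbers below are hypothetical; this is not the paper's actual predictor or framework.

```python
def best_partition(device_ms, edge_ms, out_mb, input_mb, bw_mbps):
    """Pick the DNN partition point minimizing estimated end-to-end latency.

    Layers 0..k-1 run on the device; layers k..n-1 run on the edge.
    device_ms[i]: predicted on-device latency of layer i (ms)
    edge_ms[i]:   predicted edge latency of layer i (ms)
    out_mb[i]:    size of layer i's output activation (megabits)
    input_mb:     size of the raw model input (megabits)
    bw_mbps:      current uplink bandwidth (megabits per second)
    Returns (k, total_latency_ms).
    """
    n = len(device_ms)
    best_k, best_t = 0, float("inf")
    for k in range(n + 1):          # k = 0: fully offload; k = n: fully local
        local = sum(device_ms[:k])
        # transfer the activation at the cut (the raw input if k == 0)
        size = input_mb if k == 0 else out_mb[k - 1]
        transfer = 0.0 if k == n else size / bw_mbps * 1000.0
        remote = sum(edge_ms[k:])
        total = local + transfer + remote
        if total < best_t:
            best_k, best_t = k, total
    return best_k, best_t

# With a strong uplink, cutting early wins; under a weak link, the same
# model is best kept fully local:
print(best_partition([10, 10, 10], [1, 1, 1], [8, 1, 0.5], 100, 1000))
print(best_partition([10, 10, 10], [1, 1, 1], [8, 1, 0.5], 100, 10))
```

Re-running this search whenever the latency predictions or measured bandwidth change is what makes such a scheme adaptive at runtime, which is the behavior the abstract describes.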
Event Type
Work-in-Progress Poster
Time
Tuesday, June 25, 6:00pm - 7:00pm PDT
Location
Level 2 Lobby
Topics
AI
Autonomous Systems
Cloud
Design
EDA
Embedded Systems
IP
Security