

MoteNN: Memory Optimization via Fine-grained Scheduling for Deep Neural Networks on Tiny Devices
Description
There has been a growing trend in deploying deep neural networks (DNNs) on tiny devices.
However, deploying DNNs on such devices poses significant challenges due to the contradiction between DNNs' substantial memory requirements and the stringent memory constraints of tiny devices.
Some prior works incur large latency overhead to save memory and target only simple CNNs, while others employ coarse-grained scheduling for complex networks, leading to limited memory footprint reduction. This paper proposes MoteNN, which performs fine-grained scheduling via operator partitioning on arbitrary DNNs to dramatically reduce peak memory usage with little latency overhead.
MoteNN presents a graph representation named Axis Connecting Graph (ACG) to perform operator partitioning efficiently at the graph level. MoteNN further proposes an algorithm that finds a partition and schedule guided by memory bottlenecks.
We evaluate MoteNN on a variety of popular networks and show that it achieves up to 80% peak memory usage reduction compared to state-of-the-art works, with nearly no latency overhead on tiny devices.
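The intuition behind operator partitioning can be illustrated with a toy peak-memory model (a hypothetical sketch for illustration only, not MoteNN's actual algorithm or cost model): for an elementwise operator, slice i of the output depends only on slice i of the input, so slices can be scheduled one at a time along an axis and consumed input slices freed early, instead of keeping the whole input and whole output buffers live at once.

```python
def peak_memory_unpartitioned(in_size, out_size):
    """Coarse-grained execution: the whole input buffer and the whole
    output buffer are live simultaneously while the operator runs."""
    return in_size + out_size


def peak_memory_partitioned(in_size, out_size, k):
    """Fine-grained execution of a hypothetical elementwise operator,
    partitioned into k equal slices along one axis. At step t, input
    slices t..k-1 are still allocated (earlier ones have been freed)
    and output slices 0..t have been produced."""
    assert in_size % k == 0 and out_size % k == 0
    peak = 0
    for t in range(k):
        live = (k - t) * (in_size // k) + (t + 1) * (out_size // k)
        peak = max(peak, live)
    return peak


# Toy 1 MiB input/output tensors, split into 8 slices:
n = 1 << 20
print(peak_memory_unpartitioned(n, n))      # 2 MiB live at peak
print(peak_memory_partitioned(n, n, 8))     # (k+1)/k * 1 MiB live at peak
```

In this simplified model the partitioned schedule's peak is (k+1)/k times one buffer instead of two full buffers, approaching a 2x reduction as k grows; real operators (e.g., convolutions with halo regions) and whole-graph scheduling are more involved, which is the gap the paper's ACG representation and bottleneck-guided search address.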
Event Type
Research Manuscript
Time
Wednesday, June 26, 10:30am - 10:45am PDT
Location
3008, 3rd Floor
Topics
AI
Design
Keywords
AI/ML System and Platform Design