

MoteNN: Memory Optimization via Fine-grained Scheduling for Deep Neural Networks on Tiny Devices
Description
There has been a growing trend in deploying deep neural networks (DNNs) on tiny devices.
However, deploying DNNs on such devices poses significant challenges due to the contradiction between DNNs' substantial memory requirements and the stringent memory constraints of tiny devices.
Some prior works incur large latency overhead to save memory and target only simple CNNs, while others employ coarse-grained scheduling for complex networks, leading to limited memory footprint reduction. This paper proposes MoteNN, which performs fine-grained scheduling via operator partitioning on arbitrary DNNs to dramatically reduce peak memory usage with little latency overhead.
MoteNN presents a graph representation named Axis Connecting Graph (ACG) to perform operator partitioning efficiently at the graph level. MoteNN further proposes an algorithm that finds a partition and schedule guided by memory bottlenecks.
We evaluate MoteNN on a variety of popular networks and show that it achieves up to 80% peak memory usage reduction compared to state-of-the-art works, with nearly no latency overhead on tiny devices.
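The intuition behind operator partitioning can be illustrated with a toy peak-memory model (a hypothetical sketch for illustration only, not MoteNN's actual algorithm or cost model): for an elementwise operator, slice i of the output depends only on slice i of the input, so slices can be scheduled one at a time along an axis and consumed input slices freed early, instead of keeping the whole input and whole output buffers live at once.

```python
def peak_memory_unpartitioned(in_size, out_size):
    """Coarse-grained execution: the whole input buffer and the whole
    output buffer are live simultaneously while the operator runs."""
    return in_size + out_size


def peak_memory_partitioned(in_size, out_size, k):
    """Fine-grained execution of a hypothetical elementwise operator,
    partitioned into k equal slices along one axis. At step t, input
    slices t..k-1 are still allocated (earlier ones have been freed)
    and output slices 0..t have been produced."""
    assert in_size % k == 0 and out_size % k == 0
    peak = 0
    for t in range(k):
        live = (k - t) * (in_size // k) + (t + 1) * (out_size // k)
        peak = max(peak, live)
    return peak


# Toy 1 MiB input/output tensors, split into 8 slices:
n = 1 << 20
print(peak_memory_unpartitioned(n, n))      # 2 MiB live at peak
print(peak_memory_partitioned(n, n, 8))     # (k+1)/k * 1 MiB live at peak
```

In this simplified model the partitioned schedule's peak is (k+1)/k times one buffer instead of two full buffers, approaching a 2x reduction as k grows; real operators (e.g., convolutions with halo regions) and whole-graph scheduling are more involved, which is the gap the paper's ACG representation and bottleneck-guided search address.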
Event Type
Research Manuscript
Time
Wednesday, June 26, 10:30am - 10:45am PDT
Location
3008, 3rd Floor
Topics
AI
Design
Keywords
AI/ML System and Platform Design