Presentation

GSPO: A Graph Substitution and Parallelization Joint Optimization Framework for DNN Inference
Description
This work proposes GSPO, an automatic, unified framework that jointly applies graph substitution and parallelization to DNN inference. GSPO uses a joint optimization computation graph (JOCG) to represent both graph substitution and parallelization at the operator level. A novel cost model customized for joint optimization then quickly estimates the execution time of a candidate computation graph. Combined with a backtracking search algorithm, this lets GSPO find the optimal joint optimization solution within an acceptable search time. Compared with existing frameworks that apply equivalent graph substitution or parallelization, GSPO achieves up to 27.1% end-to-end performance improvement and reduces search time by up to 94.3%.
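The description pairs a cost model with backtracking search. The sketch below illustrates that general pattern on a toy problem and is not GSPO's implementation: the chain-of-operators graph, the rewrite rules (including the `@2gpu` parallelized variant), the per-operator costs, and the pruning factor `alpha` are all hypothetical stand-ins chosen for illustration.

```python
from typing import Dict, List, Set, Tuple

# Toy stand-in for a computation graph: a linear chain of operator names.
Graph = Tuple[str, ...]

# Hypothetical per-operator costs in milliseconds (made-up numbers).
OP_COST: Dict[str, float] = {
    "conv": 4.0, "bn": 1.0, "relu": 0.5,
    "conv_bn": 4.2, "conv_bn_relu": 4.3,
    "conv_bn_relu@2gpu": 2.6,  # assumed parallelized variant
}

# Hypothetical rewrite rules: (pattern of adjacent operators, replacement).
# The first two mimic graph substitution (fusion); the last mimics
# choosing a parallel execution of a fused operator.
RULES: List[Tuple[Graph, Graph]] = [
    (("conv", "bn"), ("conv_bn",)),
    (("conv_bn", "relu"), ("conv_bn_relu",)),
    (("conv_bn_relu",), ("conv_bn_relu@2gpu",)),
]


def cost(graph: Graph) -> float:
    """Toy cost model: sum of per-operator costs."""
    return sum(OP_COST[op] for op in graph)


def neighbors(graph: Graph) -> List[Graph]:
    """All graphs reachable by applying one rule at one position."""
    out: List[Graph] = []
    for pattern, replacement in RULES:
        n = len(pattern)
        for i in range(len(graph) - n + 1):
            if graph[i:i + n] == pattern:
                out.append(graph[:i] + replacement + graph[i + n:])
    return out


def backtracking_search(start: Graph, alpha: float = 1.05) -> Tuple[Graph, float]:
    """Depth-first search with backtracking over rewritten graphs.

    A candidate is skipped when its cost exceeds alpha times the best
    cost found so far, which bounds how far the search may wander.
    """
    best, best_cost = start, cost(start)
    seen: Set[Graph] = {start}

    def visit(graph: Graph) -> None:
        nonlocal best, best_cost
        c = cost(graph)
        if c < best_cost:
            best, best_cost = graph, c
        for nxt in neighbors(graph):
            if nxt in seen or cost(nxt) > alpha * best_cost:
                continue
            seen.add(nxt)
            visit(nxt)

    visit(start)
    return best, best_cost


if __name__ == "__main__":
    chain = ("conv", "bn", "relu", "conv", "bn", "relu")
    print(backtracking_search(chain))
```

In this toy setup every rule lowers the cost, so the pruning threshold never hides the best graph; with rules that temporarily raise the cost, a larger `alpha` would trade longer search time for a wider search.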
Event Type
Research Manuscript
Time
Wednesday, June 26, 2:00pm - 2:15pm PDT
Location
3003, 3rd Floor
Topics
AI
Design
Keywords
AI/ML System and Platform Design