Presentation

SMILE: LLC-based Shared Memory Expansion to Improve GPU Thread Level Parallelism

Session

Accelerators and Cache Memories Meet Heterogeneous Architectures

Description

As the de facto high-throughput accelerators for a wide spectrum of applications, graphics processing units (GPUs) keep adding computing and memory resources to meet increasing demands. However, although designed for massive parallelism, GPUs frequently suffer from low thread occupancy and limited data throughput, which are typically attributed to constrained on-chip resources such as shared memory (SMEM) and the register file. To alleviate this pressure, the last-level cache (LLC) is being substantially enlarged to support continuously growing computation and to reduce off-chip data traffic. Nevertheless, the LLC is frequently underutilized, which wastes space and prevents it from reaching its full potential. To address this issue, we propose SMILE, which manages part of the LLC in software to expand the scarce shared memory and thereby alleviate low occupancy. SMILE splits the monolithic LLC into a normal data cache and a new software-managed region, with the latter extending the limited SMEM. To adapt to diverse application characteristics, SMILE supports multiple splitting grades and determines the appropriate partition through online profiling among streaming multiprocessors. Experimental results show that SMILE achieves average performance improvements of 14.7% and 8.4% over the default baseline and the prior state of the art, respectively.
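To make the occupancy argument concrete, the following is a minimal sketch of how per-block shared memory usage can cap the number of resident thread blocks on a streaming multiprocessor, and how extra shared memory capacity would lift that cap. It is not the SMILE mechanism: the function names, the hardware limits (2048 threads and 32 blocks per SM, 64 KiB of SMEM), the example kernel, and the list of split grades are all hypothetical values chosen for illustration. The sketch also ignores the cost of shrinking the data cache and says nothing about how the partition is chosen by online profiling.

```python
KIB = 1024


def resident_blocks(smem_per_block, threads_per_block, smem_per_sm,
                    max_threads_per_sm=2048, max_blocks_per_sm=32):
    """Thread blocks that fit on one SM (all limits are assumed values)."""
    by_threads = max_threads_per_sm // threads_per_block
    if smem_per_block > 0:
        by_smem = smem_per_sm // smem_per_block
    else:
        by_smem = max_blocks_per_sm  # kernel uses no shared memory
    return min(by_threads, by_smem, max_blocks_per_sm)


def occupancy(smem_per_block, threads_per_block, smem_per_sm,
              max_threads_per_sm=2048):
    """Fraction of the SM's thread slots occupied by resident blocks."""
    blocks = resident_blocks(smem_per_block, threads_per_block,
                             smem_per_sm, max_threads_per_sm)
    return blocks * threads_per_block / max_threads_per_sm


# Hypothetical kernel: 256 threads and 24 KiB of shared memory per block.
THREADS = 256
SMEM_PER_BLOCK = 24 * KIB
BASE_SMEM = 64 * KIB  # assumed native shared memory per SM

# Hypothetical split grades: extra per-SM capacity taken from the LLC.
GRADES = [0, 16 * KIB, 32 * KIB, 64 * KIB]

results = {
    extra: occupancy(SMEM_PER_BLOCK, THREADS, BASE_SMEM + extra)
    for extra in GRADES
}

for extra, occ in results.items():
    print(f"extra SMEM {extra // KIB:3d} KiB -> occupancy {occ:.3f}")
```

In this toy setting the kernel is limited by shared memory rather than by thread slots, so each larger grade admits more resident blocks. A real design would have to weigh that gain against the cache capacity given up by the LLC.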

Authors

Tianyu Guo

Sun Yat-Sen University

Xuanteng Huang

Sun Yat-Sen University

Kan Wu

Sun Yat-Sen University

Xianwei Zhang

Sun Yat-Sen University

Nong Xiao

Sun Yat-Sen University

Event Type