Presentation

SMILE: LLC-based Shared Memory Expansion to Improve GPU Thread Level Parallelism
Description
As the de facto high-throughput accelerators for a wide spectrum of applications, graphics processing units (GPUs) keep adding computing and memory resources to meet increasing demands. However, although designed for massive parallelism, GPUs frequently suffer from low thread occupancy and limited data throughput, which are typically attributed to constrained on-chip resources such as shared memory (SMEM) and the register file. To alleviate this pressure, the last-level cache (LLC) is being substantially enlarged to support continuously growing computation and to reduce off-chip data traffic. Nevertheless, the LLC is frequently underutilized, which wastes space and keeps it from reaching its full potential. To address this issue, we propose SMILE, which manages part of the LLC in software to expand the scarce shared memory and thereby alleviate low occupancy. SMILE splits the monolithic LLC into a normal data cache and a new software-managed region, with the latter extending the limited SMEM. To adapt to diverse application characteristics, SMILE supports multiple splitting grades and determines the appropriate partition through online profiling among streaming multiprocessors. Experimental results show that SMILE achieves average performance improvements of 14.7% and 8.4% compared to the default baseline and the prior state of the art, respectively.
Event Type
Research Manuscript
Time
Tuesday, June 25, 4:45pm - 5:00pm PDT
Location
3012, 3rd Floor
Topics
Design
Keywords
SoC, Heterogeneous, and Reconfigurable Architectures