Presentation

Towards Redundancy-Free Recommendation Model Training via Reusable-aware Near-Memory Processing
Description
The memory-intensive embedding layer remains the performance bottleneck in recommendation models. Prior works have attempted to improve embedding layer performance by exploiting data locality to cache frequently accessed embedding vectors and their partial sums. However, these solutions rely on a static cache, which is invalidated in embedding training, where the embedding vectors are updated frequently. To this end, this paper proposes ReFree, a redundancy-free near-memory processing (NMP) solution for embedding training. Specifically, ReFree identifies reusable data in real time for both the forward pass and backpropagation of embedding layer training, and leverages a lightweight NMP architecture to enable redundancy-free near-memory acceleration of the entire embedding training process. Evaluation results on real-world datasets show that ReFree outperforms state-of-the-art solutions by 10.9x and reduces energy consumption by 5.3x on average.
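As a rough illustration of the problem the abstract describes (not ReFree's actual mechanism, which the listing does not detail), the toy sketch below shows two things: repeated embedding-row reads within a mini-batch, and a cached partial sum going stale once a training step updates one of its rows. The table contents, batch, learning rate, and gradient are all hypothetical values chosen for the example.

```python
from collections import Counter

def pooled_sum(table, indices):
    """Sum-pool the embedding rows selected by `indices` (one sample's lookup)."""
    dim = len(next(iter(table.values())))
    out = [0.0] * dim
    for i in indices:
        for d, v in enumerate(table[i]):
            out[d] += v
    return out

# Hypothetical toy embedding table: row id -> 2-dimensional vector.
table = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [2.0, 2.0]}

# Hypothetical mini-batch of two samples; both samples look up rows 0 and 1.
batch = [[0, 1, 2], [0, 1]]

# Redundancy within the batch: reads beyond the first access to each row.
reads = Counter(i for sample in batch for i in sample)
total_reads = sum(reads.values())
unique_reads = len(reads)
redundant_reads = total_reads - unique_reads

# A static cache stores the partial sum of rows 0 and 1 for reuse.
cached_partial = pooled_sum(table, [0, 1])

# One training step updates row 0 (assumed plain SGD with made-up values).
lr = 0.5
grad_row0 = [1.0, 1.0]
table[0] = [w - lr * g for w, g in zip(table[0], grad_row0)]

# The cached value no longer matches the table, so it must be invalidated.
fresh_partial = pooled_sum(table, [0, 1])
cache_is_stale = cached_partial != fresh_partial
```

In this toy case the batch performs five row reads for three distinct rows, and the cached partial sum differs from the recomputed one after a single update, which is the situation a static cache cannot handle during training.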
Event Type
Research Manuscript
Time
Wednesday, June 26, 10:30am - 10:45am PDT
Location
3003, 3rd Floor
Topics
Design
Keywords
In-memory and Near-memory Computing Architectures, Applications and Systems