Reducing tail latency of DNN-based recommender systems using in-storage processing

Minsub Kim, Sungjin Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

12 Scopus citations

Abstract

Most recommender systems are designed to comply with service level agreements (SLAs), because prompt response to users' requests is the most important factor in determining quality of service. Existing recommender systems, however, suffer from long tail latency when the embedding tables cannot be loaded entirely into main memory. In this paper, we propose a new SSD architecture called EMB-SSD, which mitigates the tail latency problem of recommender systems by leveraging in-storage processing. By offloading the data-intensive parts of the recommendation algorithm to the SSD, EMB-SSD not only reduces data traffic between the host and the SSD but also lowers the software overhead caused by deep I/O stacks. Results show that EMB-SSD achieves 47% shorter 99th-percentile latency and 25% shorter average latency than existing systems.
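
To make the traffic argument concrete, the following is a minimal sketch and not the EMB-SSD design itself. The abstract does not say which operations are offloaded, so the sketch assumes a sum-pooled embedding lookup, 4-byte elements, and 8-byte indices. All function names are hypothetical. It compares the bytes that cross the host-SSD link when the host gathers every row itself with the case where the device pools the rows and returns a single vector.

```python
# Illustrative only: the pooling operation, element size, and index size
# are assumptions, not details taken from the paper.

def host_side_bytes(num_lookups: int, dim: int, elem_bytes: int = 4) -> int:
    """Host gathers every embedding row, so every row crosses the link."""
    return num_lookups * dim * elem_bytes


def in_storage_bytes(num_lookups: int, dim: int, elem_bytes: int = 4,
                     index_bytes: int = 8) -> int:
    """Host sends only indices; the device returns one pooled vector."""
    return num_lookups * index_bytes + dim * elem_bytes


def sum_pool(table, indices):
    """Element-wise sum of the selected rows (the assumed offloaded step)."""
    dim = len(table[0])
    pooled = [0.0] * dim
    for i in indices:
        for j in range(dim):
            pooled[j] += table[i][j]
    return pooled


if __name__ == "__main__":
    # Hypothetical workload: 80 lookups into 64-dimensional embeddings.
    print(host_side_bytes(80, 64))   # bytes moved when the host pools
    print(in_storage_bytes(80, 64))  # bytes moved when the device pools
```

Under these assumed sizes, the in-storage variant moves indices plus one vector rather than one vector per lookup, which is the kind of traffic reduction the abstract describes.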

Original language: English
Title of host publication: APSys 2020 - Proceedings of the 2020 ACM SIGOPS Asia-Pacific Workshop on Systems
Publisher: Association for Computing Machinery
Pages: 90-97
Number of pages: 8
ISBN (Electronic): 9781450380690
State: Published - 24 Aug 2020
Event: 11th ACM SIGOPS Asia-Pacific Workshop on Systems, APSys 2020 - Tsukuba, Virtual, Japan
Duration: 24 Aug 2020 to 25 Aug 2020

Publication series

Name: APSys 2020 - Proceedings of the 2020 ACM SIGOPS Asia-Pacific Workshop on Systems

Conference

Conference: 11th ACM SIGOPS Asia-Pacific Workshop on Systems, APSys 2020
Country/Territory: Japan
City: Tsukuba, Virtual
Period: 24/08/20 to 25/08/20

Bibliographical note

Publisher Copyright:
© 2020 ACM.

Keywords

  • in-storage processing
  • machine learning
  • recommender system
