Abstract
In this paper, we analyze and optimize the I/O latency of a petabyte-scale, high-performance all-flash array (AFA) system based on NVMe SSDs. A flash-based SSD by itself shows relatively low and consistent latency, but in AFA systems, where tens or hundreds of SSDs are combined in a single host machine, applications often see higher and more variable I/O latency than with a standalone SSD. To identify the main sources of these latency fluctuations, we analyze the end-to-end I/O latency characteristics of a real-world AFA system. We find that suboptimal kernel policies, parameters, and configurations seriously degrade I/O response times and cause very long tail latency. Based on our observations, we manually reconfigure several kernel parameters and revise the storage firmware to achieve consistent I/O latency. Our experimental results show that, with a kernel finely tuned for AFA systems, the mean and standard deviation of the maximum latency can be reduced by factors of 8 and 400, respectively. The findings of this work offer useful guidance for designing system software and operating systems: CPU schedulers need to be revised to take into account the priority of I/O-bound jobs, CPU isolation, and CPU-SSD affinity, and storage housekeeping protocols such as SMART should be improved to avoid long tail latency.
Original language | English |
---|---|
Title of host publication | Proceedings - 2018 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2018 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 12-21 |
Number of pages | 10 |
ISBN (Electronic) | 9781538650103 |
State | Published - 25 May 2018 |
Event | 2018 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2018 - Belfast, Northern Ireland, United Kingdom; Duration: 2 Apr 2018 → 4 Apr 2018 |
Conference
Conference | 2018 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2018 |
---|---|
Country/Territory | United Kingdom |
City | Belfast, Northern Ireland |
Period | 2/04/18 → 4/04/18 |
Bibliographical note
Publisher Copyright: © 2018 IEEE.
Keywords
- All flash array
- Long tail latency
- NAND flash
- NVMe SSDs
- Performance analysis
- Storage systems