CLC number: TP302
On-line Access: 2022-10-26
Received: 2021-12-08
Revision Accepted: 2022-10-26
Crosschecked: 2022-04-01
Yun TENG, Zhiyue LI, Jing HUANG, Guangyan ZHANG. ShortTail: taming tail latency for erasure-code-based in-memory systems[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2100566
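ShortTail: reducing tail latency in erasure-code-based in-memory storage systems
1 College of Computer Science and Technology, Jilin University, Changchun 130012, China
2 Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
3 Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
4 Beijing National Research Center for Information Science and Technology (Tsinghua University), Beijing 100084, China
Abstract: Erasure-code-based in-memory storage systems are widely used because they offer both high performance and high data availability. However, as clusters keep growing, server-level performance degradation occurs more and more often, which leads to long tail latency. In erasure-code-based systems the effect is amplified further, because a single erasure-coding operation may depend on several sub-operations all completing. This paper proposes ShortTail, an erasure-code-based in-memory storage system that delivers stable performance and low read and write latency. First, ShortTail uses a lightweight request monitor to track the performance of each memory node, so that degraded nodes are detected promptly. Second, ShortTail selectively performs degraded reads and redirected writes to avoid accessing degraded nodes. Finally, ShortTail adopts an adaptive write strategy to reduce write amplification for small write requests. ShortTail is implemented on Memcached and compared with two other systems. Experimental results show that ShortTail reduces the 99th-percentile latency by up to 63.77% and significantly improves the median and average latency.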
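The interplay between node monitoring and degraded reads described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `NodeMonitor`, the function `plan_read`, the moving-average smoothing factor, and the "three times the median" slowness threshold are all hypothetical choices made for illustration. The sketch only assumes an MDS code such as Reed-Solomon, where any k of the k+m chunks in a stripe suffice to rebuild a chunk.

```python
from dataclasses import dataclass, field


@dataclass
class NodeMonitor:
    """Hypothetical per-node latency tracker (exponential moving average)."""
    alpha: float = 0.2        # smoothing factor (assumed value)
    slow_factor: float = 3.0  # degraded if EWMA > slow_factor * median (assumed rule)
    ewma: dict = field(default_factory=dict)

    def record(self, node: str, latency_us: float) -> None:
        """Fold one observed request latency into the node's moving average."""
        prev = self.ewma.get(node)
        if prev is None:
            self.ewma[node] = latency_us
        else:
            self.ewma[node] = (1 - self.alpha) * prev + self.alpha * latency_us

    def degraded(self) -> set:
        """Return the nodes whose average latency is far above the median."""
        if not self.ewma:
            return set()
        values = sorted(self.ewma.values())
        median = values[len(values) // 2]
        return {n for n, v in self.ewma.items() if v > self.slow_factor * median}


def plan_read(data_nodes, parity_nodes, wanted, monitor):
    """Decide which nodes to contact to read the chunk stored on `wanted`.

    Returns (nodes, needs_decode). A normal read contacts only `wanted`.
    A degraded read contacts k healthy nodes of the stripe and decodes.
    """
    slow = monitor.degraded()
    if wanted not in slow:
        return [wanted], False
    k = len(data_nodes)
    healthy = [n for n in data_nodes + parity_nodes
               if n != wanted and n not in slow]
    if len(healthy) < k:
        # Not enough healthy chunks to decode; fall back to the normal read.
        return [wanted], False
    # Prefer the nodes that currently look fastest.
    healthy.sort(key=lambda n: monitor.ewma.get(n, 0.0))
    return healthy[:k], True
```

With a (4, 2) stripe in which one data node is reported as ten times slower than the rest, `plan_read` for a chunk on that node returns four other nodes and `needs_decode=True`, while reads of chunks on healthy nodes are left unchanged. The trade-off is that a degraded read issues k requests and pays a decoding cost, so it only helps when the avoided node is slow enough to dominate the latency.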
Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou
310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2024 Journal of Zhejiang University-SCIENCE