CLC number: TP391.4
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2021-08-16
Cited: 0
Clicked: 6363
Xingjun ZHANG, Ningjing LIANG, Yunfei LIU, Changjiang ZHANG, Yang LI. SA-RSR: a read-optimal data recovery strategy for XOR-coded distributed storage systems[J]. Frontiers of Information Technology & Electronic Engineering, 2022, 23(6): 858-875.
@article{title="SA-RSR: a read-optimal data recovery strategy for XOR-coded distributed storage systems",
author="Xingjun ZHANG, Ningjing LIANG, Yunfei LIU, Changjiang ZHANG, Yang LI",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="23",
number="6",
pages="858-875",
year="2022",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2100242"
}
%0 Journal Article
%T SA-RSR: a read-optimal data recovery strategy for XOR-coded distributed storage systems
%A Xingjun ZHANG
%A Ningjing LIANG
%A Yunfei LIU
%A Changjiang ZHANG
%A Yang LI
%J Frontiers of Information Technology & Electronic Engineering
%V 23
%N 6
%P 858-875
%@ 2095-9184
%D 2022
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.2100242
TY - JOUR
T1 - SA-RSR: a read-optimal data recovery strategy for XOR-coded distributed storage systems
A1 - Xingjun ZHANG
A1 - Ningjing LIANG
A1 - Yunfei LIU
A1 - Changjiang ZHANG
A1 - Yang LI
J0 - Frontiers of Information Technology & Electronic Engineering
VL - 23
IS - 6
SP - 858
EP - 875
%@ 2095-9184
Y1 - 2022
PB - Zhejiang University Press & Springer
ER -
DOI - 10.1631/FITEE.2100242
Abstract: To ensure the reliability and availability of data, redundancy strategies are always required for distributed storage systems. Erasure coding, one of the representative redundancy strategies, has the advantage of low storage overhead, which facilitates its employment in distributed storage systems. Among the various erasure coding schemes, XOR-based erasure codes are becoming popular due to their high computing speed. When a single-node failure occurs in such coding schemes, a process called data recovery takes place to retrieve the failed node's lost data from surviving nodes. However, data transmission during the data recovery process usually requires a considerable amount of time. Current research has focused mainly on reducing the amount of data needed for data recovery to reduce the time required for data transmission, but it has encountered problems such as significant complexity and local optima. In this paper, we propose a random search recovery algorithm, named SA-RSR, to speed up single-node failure recovery of XOR-based erasure codes. SA-RSR uses a simulated annealing technique to search for an optimal recovery solution that reads and transmits a minimum amount of data. In addition, this search process can be done in polynomial time. We evaluate SA-RSR with a variety of XOR-based erasure codes in simulations and in a real storage system, Ceph. Experimental results in Ceph show that SA-RSR reduces the amount of data required for recovery by up to 30.0% and improves the performance of data recovery by up to 20.36% compared to the conventional recovery method.
[1]Arnold J, 2014. OpenStack Swift Using, Administering, and Developing for Swift Object Storage. O’Reilly Media, Sebastopol, USA.
[2]Blaum M, Roth RM, 1993. New array codes for multiple phased burst correction. IEEE Trans Inform Theory, 39(1):66-77.
[3]Blaum M, Brady J, Bruck J, et al., 1995. EVENODD: an efficient scheme for tolerating double disk failures in RAID architectures. IEEE Trans Comput, 44(2):192-202.
[4]Blaum M, Bruck J, Vardy A, 1996. MDS array codes with independent parity symbols. IEEE Trans Inform Theory, 42(2):529-542.
[5]Borthakur D, 2007. The Hadoop Distributed File System: Architecture and Design. http://hadoop.apache.org/core/docs/current/hdfs_design.html
[6]Calder B, Wang J, Ogus A, et al., 2011. Windows azure storage: a highly available cloud storage service with strong consistency. Proc 23rd ACM Symp on Operating Systems Principles, p.143-157.
[7]Corbett P, English B, Goel A, et al., 2004. Row-diagonal parity for double disk failure correction. Proc 3rd USENIX Conf on File and Storage Technologies, Article 1.
[8]Facebook, 2018. HDFS-RAID. http://wiki.apache.org/hadoop/HDFS-RAID
[9]Gad EE, Mateescu R, Blagojevic F, et al., 2013. Repair-optimal MDS array codes over GF(2). IEEE Int Symp on Information Theory, p.887-891.
[10]Ghemawat S, Gobioff H, Leung ST, 2003. The Google file system. Proc 19th ACM Symp on Operating Systems Principles, p.29-43.
[11]Goel A, Corbett P, 2012. RAID triple parity. ACM SIGOPS Oper Syst Rev, 46(3):41-49.
[12]Hou HX, Lee PPC, 2020. Binary MDS array codes with optimal repair. IEEE Trans Inform Theory, 66(3):1405-1422.
[13]Hou HX, Han YS, Lee PPC, et al., 2019a. A new design of binary MDS array codes with asymptotically weak-optimal repair. IEEE Trans Inform Theory, 65(11):7095-7113.
[14]Hou HX, Han YS, Lee PPC, et al., 2019b. New regenerating codes over binary cyclic codes. IEEE Int Symp on Information Theory, p.216-220.
[15]Hou HX, Lee PPC, Shum KW, et al., 2019c. Rack-aware regenerating codes for data centers. IEEE Trans Inform Theory, 65(8):4730-4745.
[16]Huang C, Xu LH, 2008. STAR: an efficient coding scheme for correcting triple storage node failures. IEEE Trans Comput, 57(7):889-901.
[17]Huang C, Simitci H, Xu YK, et al., 2012. Erasure coding in windows azure storage. Proc USENIX Conf on Annual Technical Conf, Article 2.
[18]Jiekak S, Kermarrec AM, Le Scouarnec N, et al., 2013. Regenerating codes: a system perspective. ACM SIGOPS Oper Syst Rev, 47(2):23-32.
[19]Jin C, Jiang H, Feng D, et al., 2009. P-Code: a new RAID-6 code with optimal properties. Proc 23rd Int Conf on Supercomputing, p.360-369.
[20]Khan O, Burns R, Plank J, et al., 2012. Rethinking erasure codes for cloud file systems: minimizing I/O for recovery and degraded reads. Proc 10th USENIX Conf on File and Storage Technologies, Article 20.
[21]Liang NJ, Zhang XJ, Yang HL, et al., 2020. An optimal recovery approach for liberation codes in distributed storage systems. IEEE Access, 8:137631-137645.
[22]Miyamae T, Nakao T, Shiozawa K, 2014. Erasure code with shingled local parity groups for efficient recovery from multiple disk failures. Proc 10th USENIX Conf on Hot Topics in System Dependability, Article 5.
[23]Pamies-Juarez L, Blagojevic F, Mateescu R, et al., 2016. Opening the chrysalis: on the real repair performance of MSR codes. Proc 14th USENIX Conf on File and Storage Technologies, p.81-94.
[24]Plank JS, 2008. The RAID-6 liberation codes. Proc 6th USENIX Conf on File and Storage Technologies, p.97-110.
[25]Plank JS, 2009. The RAID-6 Liber8Tion code. Int J High Perform Comput Appl, 23(3):242-251.
[26]Plank JS, Luo JQ, Schuman CD, et al., 2009. A performance evaluation and examination of open-source erasure coding libraries for storage. Proc 7th Conf on File and Storage Technologies, p.253-265.
[27]Plank JS, Buchsbaum AL, Zanden BTV, 2011. Minimum density RAID-6 codes. ACM Trans Stor, 6(4):16.
[28]RedHat, 2018. Ceph Erasure. http://docs.ceph.com/docs/master/architecture/erasurecodin
[29]Reed IS, Solomon G, 1960. Polynomial codes over certain finite fields. J Soc Ind Appl Math, 8(2):300-304.
[30]Roth RM, Lempel A, 1989. On MDS codes via Cauchy matrices. IEEE Trans Inform Theory, 35(6):1314-1319.
[31]Russell SJ, Norvig P, 2016. Artificial Intelligence: a Modern Approach. Prentice-Hall, Inc., USA.
[32]Sathiamoorthy M, Asteris M, Papailiopoulos D, et al., 2013. XORing elephants: novel erasure codes for big data. Proc VLDB Endow, 6(5):325-336.
[33]Schroeder B, Gibson GA, 2007. Disk failures in the real world: what does an MTTF of 1 000 000 hours mean to you? Proc 5th USENIX Conf on File and Storage Technologies, p.1-16.
[34]Shen ZR, Shu JW, 2014. HV Code: an all-around MDS code to improve efficiency and reliability of RAID-6 systems. Proc 44th Annual IEEE/IFIP Int Conf on Dependable Systems and Networks, p.550-561.
[35]Tamo I, Wang ZY, Bruck J, 2011. MDS array codes with optimal rebuilding. IEEE Int Symp on Information Theory, p.1240-1244.
[36]Tamo I, Wang ZY, Bruck J, 2013. Zigzag codes: MDS array codes with optimal rebuilding. IEEE Trans Inform Theory, 59(3):1597-1616.
[37]Vajha M, Ramkumar V, Puranik B, et al., 2018. Clay codes: moulding MDS codes to yield an MSR code. Proc 16th USENIX Conf on File and Storage Technologies, p.139-153.
[38]Wang ZY, Dimakis AG, Bruck J, 2010. Rebuilding for array codes in distributed storage systems. IEEE Globecom Workshops, p.1905-1909.
[39]Weil SA, Brandt SA, Miller EL, et al., 2006a. Ceph: a scalable, high-performance distributed file system. Proc 7th Symp on Operating Systems Design and Implementation, p.307-320.
[40]Weil SA, Brandt SA, Miller EL, et al., 2006b. CRUSH: controlled, scalable, decentralized placement of replicated data. Proc ACM/IEEE Conf on Supercomputing, Article 122-es.
[41]Weil SA, Leung AW, Brandt SA, et al., 2007. RADOS: a scalable, reliable storage service for petabyte-scale storage clusters. Proc 2nd Int Workshop on Petascale Data Storage: held in conjunction with Supercomputing, p.35-44.
[42]Wu CT, Wan SG, He XB, et al., 2011. H-Code: a hybrid MDS array code to optimize partial stripe writes in RAID-6. Proc IEEE Int Parallel & Distributed Processing Symp, p.782-793.
[43]Xiang LP, Xu YL, Lui JCS, et al., 2011. A hybrid approach to failed disk recovery using RAID-6 codes: algorithms and performance evaluation. ACM Trans Stor, 7(3):11.
[44]Xu LH, Bruck J, 1999. X-code: MDS array codes with optimal encoding. IEEE Trans Inform Theory, 45(1):272-276.
[45]Xu SL, Li RH, Lee PPC, et al., 2014. Single disk failure recovery for X-code-based parallel storage systems. IEEE Trans Comput, 63(4):995-1007.
[46]Ye FW, Liu SQ, Shum KW, et al., 2020. On secure exact-repair regenerating codes with a single Pareto optimal point. IEEE Trans Inform Theory, 66(1):176-201.
[47]Zhang YZ, Wu CT, Li J, et al., 2015. TIP-Code: a three independent parity code to tolerate triple disk failures with optimal update complextiy. Proc 45th Annual IEEE/IFIP Int Conf on Dependable Systems and Networks, p.136-147.
[48]Zhu YF, Lee PPC, Xu YL, et al., 2014. On the speedup of recovery in large-scale erasure-coded storage systems. IEEE Trans Parall Distrib Syst, 25(7):1830-1840.
Open peer comments: Debate/Discuss/Question/Opinion
<1>