Abstract
Distributed storage systems adopt fault-tolerance techniques such as replication to achieve reliability. With the rapid growth of data, these systems have been transitioning from replication to coding strategies such as Reed-Solomon codes to achieve higher storage efficiency. However, the repair cost of Reed-Solomon codes in terms of network bandwidth is high. To improve repair efficiency, a new class of codes called regenerating codes has been proposed and is becoming increasingly popular. Nevertheless, how to quantify and evaluate the repair cost of these coding strategies at the system level remains unexplored. In this paper, we propose a metric for repair cost at the level of the whole system and use it to compare the two main classes of codes, Reed-Solomon codes and regenerating codes. Our goal is to provide system designers with methods for evaluating system-level repair cost, so that they can choose the coding strategy best suited to their particular systems.
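To make the bandwidth gap mentioned above concrete, the following minimal sketch compares the traffic needed to repair a single failed node under conventional Reed-Solomon repair and under regenerating codes. It is not the system-level metric proposed in the paper; it only evaluates the standard per-repair bandwidth expressions from the regenerating-codes literature at the minimum-storage (MSR) and minimum-bandwidth (MBR) points. The function names and the parameters `file_size`, `k` (nodes needed to reconstruct the file), and `d` (helper nodes contacted during repair) are assumptions introduced for illustration.

```python
from fractions import Fraction


def _check(k, d):
    # Repair must contact at least k helper nodes (k <= d).
    if k < 1 or d < k:
        raise ValueError("require 1 <= k <= d")


def rs_repair_bandwidth(file_size, k):
    """Conventional Reed-Solomon repair: download k blocks of size
    file_size / k, i.e. an amount of data equal to the whole file."""
    _check(k, k)
    return Fraction(file_size)


def msr_repair_bandwidth(file_size, k, d):
    """Minimum-storage regenerating point: M * d / (k * (d - k + 1))."""
    _check(k, d)
    return Fraction(file_size) * d / (k * (d - k + 1))


def mbr_repair_bandwidth(file_size, k, d):
    """Minimum-bandwidth regenerating point: 2 * M * d / (k * (2d - k + 1))."""
    _check(k, d)
    return Fraction(file_size) * 2 * d / (k * (2 * d - k + 1))


if __name__ == "__main__":
    # Hypothetical parameters: file of size 1 unit, k = 10, d = 13 helpers.
    M, k, d = 1, 10, 13
    print("RS :", rs_repair_bandwidth(M, k))
    print("MSR:", msr_repair_bandwidth(M, k, d))
    print("MBR:", mbr_repair_bandwidth(M, k, d))
```

With these hypothetical parameters the MSR and MBR repair traffic comes to 13/40 and 13/85 of the file size, compared with the full file size for conventional Reed-Solomon repair. When `d` equals `k`, the MSR expression reduces to the Reed-Solomon cost, which is why contacting more helpers is what yields the saving.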
Acknowledgments
This research is supported in part by the Major State Basic Research Development Program of China (973 Program, 2012CB315803), the National Natural Science Foundation of China (61371078), and the Research Fund for the Doctoral Program of Higher Education of China (20130002110051).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Lu, Y., Hao, J., Liu, XJ., Xia, ST. (2015). Analysis of Repair Cost in Distributed Storage Systems with Fault-Tolerant Coding Strategies. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science, vol 9531. Springer, Cham. https://doi.org/10.1007/978-3-319-27140-8_25
DOI: https://doi.org/10.1007/978-3-319-27140-8_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27139-2
Online ISBN: 978-3-319-27140-8
eBook Packages: Computer Science, Computer Science (R0)