Cluster Computing, Volume 22, Supplement 1, pp 2485–2494

Mirrored and hybrid disk arrays and their reliability

  • Alexander Thomasian


Replication and erasure coding are two alternative methods for disk arrays to deal with disk failures. This work concentrates on mirrored disk arrays, classified as RAID1, and hybrid disk arrays, which implement redundancy by storing XORed data blocks instead of replicas. We evaluate the reliability of disk arrays with and without repair using traditional reliability modeling techniques. A shortcut method based on asymptotic expansions is also used to compare the reliability of RAID(4+k) arrays with that of mirrored and hybrid disk arrays. RAID1 with distributed redundancy attains more balanced disk loads and better performance than basic mirroring (BM) upon disk failure, but is less reliable than BM. Hybrid disk arrays with the same level of redundancy as RAID1 are more reliable than RAID1, but incur a higher cost for updates. Applying the asymptotic expansion method to hierarchical RAID shows that, at the same overall redundancy overhead, it is advantageous to associate higher redundancy with the lower levels. It is also shown that sharing disk space between RAID1 and RAID5 in heterogeneous disk arrays (HDAs) may lower reliability. In addition to the classical rebuild model, we present an extension with a limited number of spares. Recovery methods based on reconfiguration from higher to lower reliability RAID arrays are also presented.
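The comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's model: it assumes independent disks, each surviving with probability r = 1 - eps, and uses only the textbook expressions for basic mirroring and a single-parity (RAID5) array without repair, their leading-order expansions in eps, and the standard two-state Markov chain result for a mirrored pair with repair. All function names are hypothetical.

```python
from math import comb


def bm_reliability(n_disks: int, r: float) -> float:
    """Basic mirroring without repair: n_disks/2 independent pairs.

    A pair loses data only if both of its disks fail.
    """
    pairs = n_disks // 2
    return (1.0 - (1.0 - r) ** 2) ** pairs


def raid5_reliability(n_disks: int, r: float) -> float:
    """Single-parity array without repair: survives zero or one disk failure."""
    return r ** n_disks + n_disks * r ** (n_disks - 1) * (1.0 - r)


def bm_asymptotic(n_disks: int, eps: float) -> float:
    """Leading-order expansion of bm_reliability for small eps = 1 - r."""
    return 1.0 - (n_disks / 2) * eps ** 2


def raid5_asymptotic(n_disks: int, eps: float) -> float:
    """Leading-order expansion of raid5_reliability for small eps = 1 - r."""
    return 1.0 - comb(n_disks, 2) * eps ** 2


def mttdl_mirrored_pair(mttf: float, mttr: float) -> float:
    """MTTDL of one mirrored pair with repair (exponential failures and repairs).

    States: both disks up -> one disk up -> data loss, with failure rate
    lam = 1/mttf per disk and repair rate mu = 1/mttr.
    """
    lam, mu = 1.0 / mttf, 1.0 / mttr
    return (3.0 * lam + mu) / (2.0 * lam ** 2)


if __name__ == "__main__":
    n, eps = 8, 0.01  # assumed example values
    r = 1.0 - eps
    print("BM    exact / asymptotic:", bm_reliability(n, r), bm_asymptotic(n, eps))
    print("RAID5 exact / asymptotic:", raid5_reliability(n, r), raid5_asymptotic(n, eps))
    print("MTTDL of a mirrored pair (hours):", mttdl_mirrored_pair(1e6, 10.0))
```

In this sketch both arrays lose data at order eps squared, so the comparison reduces to the coefficients: N/2 for basic mirroring against N(N-1)/2 for single parity with the same number of disks N. That is the kind of ranking a shortcut based on asymptotic expansions yields without solving a full model.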


RAID; Disk mirroring; Hybrid disk arrays; Reliability modeling; Mean time to data loss (MTTDL)



Basic mirroring
Chained declustering
Continuous time Markov chain
Group rotate declustering
Heterogeneous disk array
Hierarchical RAID
Interleaved declustering
LSI Logic's RAID array
Mean time to failure
Mean time to repair
Mean time to data loss
Parity defining set (for WEAVER codes)
Redundant array of independent disks
Self-adaptive disk array
Survivable storage using parity in redundant array layouts





Dr. Jun Xu at NJIT and Dr. Yujie Tang at the Shenzhen Institute of Advanced Technology collaborated on research topics covered in this paper.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Thomasian and Associates, Pleasantville, USA
