Abstract
Predicting imminent failures in large-scale storage systems is critical to preventing data loss. Researchers have proposed various machine learning and statistical methods based on SMART attributes. Although these methods achieve good prediction accuracy, most of them only classify a hard drive as “good” or “failed”. However, hard drive performance deteriorates gradually rather than abruptly, as indicated by continuous change in the corresponding SMART attributes, and such binary models cannot capture this continuous change. This paper presents a decision tree based failure prediction model for hard drives that achieves better prediction accuracy. Experiments show that the model predicts 99.99% of failures with a false alarm rate under 0.001%. We also introduce lead-time prediction, which proactively quantifies the health status of hard drives so that warnings can be generated in advance to trigger backups. We test the proposed model on a real-world dataset.
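The two figures quoted above are usually read as a failure detection rate (the share of failed drives that were flagged) and a false alarm rate (the share of good drives that were wrongly flagged). The following is a minimal sketch under those standard definitions; the paper's exact evaluation protocol is not stated here, and the function name and toy labels are hypothetical.

```python
def detection_and_false_alarm_rates(actual, predicted):
    """Return (failure detection rate, false alarm rate).

    actual, predicted: equal-length sequences of booleans,
    where True means "failed" and False means "good".
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # failed, flagged
    fn = sum(1 for a, p in pairs if a and not p)      # failed, missed
    fp = sum(1 for a, p in pairs if not a and p)      # good, flagged
    tn = sum(1 for a, p in pairs if not a and not p)  # good, not flagged

    # Guard against empty classes so the ratios stay well defined.
    fdr = tp / (tp + fn) if (tp + fn) else float("nan")
    far = fp / (fp + tn) if (fp + tn) else float("nan")
    return fdr, far


# Toy labels (hypothetical): 4 failed drives followed by 6 good drives.
actual = [True] * 4 + [False] * 6
predicted = [True, True, True, False, True, False, False, False, False, False]

fdr, far = detection_and_false_alarm_rates(actual, predicted)
print(f"failure detection rate: {fdr:.2%}, false alarm rate: {far:.2%}")
```

In this toy case three of four failed drives are flagged and one of six good drives raises a false alarm. Because good drives vastly outnumber failed ones in real fleets, even a very small false alarm rate can correspond to many spurious warnings, which is why both figures are reported together.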
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kaur, K., Kaur, K. (2019). Failure Prediction and Health Status Assessment of Storage Systems with Decision Trees. In: Luhach, A., Singh, D., Hsiung, PA., Hawari, K., Lingras, P., Singh, P. (eds) Advanced Informatics for Computing Research. ICAICR 2018. Communications in Computer and Information Science, vol 955. Springer, Singapore. https://doi.org/10.1007/978-981-13-3140-4_33
DOI: https://doi.org/10.1007/978-981-13-3140-4_33
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-3139-8
Online ISBN: 978-981-13-3140-4
eBook Packages: Computer Science (R0)