Abstract
Detecting failure traces helps system administrators recover from failures promptly and avoid them in the future. Detecting whether a failure is currently occurring is not difficult, because administrators are concerned with only a few key measurements: if one of these measurements exceeds its normal threshold, a failure event is generated. Detecting failure traces, which are represented as failure-related events, is much more complicated, because these traces may last for a long time and affect many components. Furthermore, current distributed systems add and remove components so quickly that administrators may not have enough time or knowledge to set a monitoring threshold for each of them. To address these problems, we propose the FTD system. We first compare each component's historical states and take the outlier states as anomalous events. Then, combining these with the failure events provided by the system, we detect correlations between failure events and anomalous events and report them as failure traces. We evaluate our work on KDD99, a network intrusion benchmark, and achieve good performance.
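The two steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the outlier test or the correlation method used by FTD, so everything concrete here is an assumption made for illustration: a z-score test against each component's own history stands in for the outlier detection, a fixed look-back time window stands in for the event correlation, and the names `anomalous_events`, `failure_traces`, `z_threshold`, and `window` are hypothetical.

```python
from statistics import mean, pstdev


def anomalous_events(history, z_threshold=3.0):
    """Return (timestamp, component) pairs whose value is an outlier.

    history maps a component name to a list of (timestamp, value) samples.
    Assumption: a z-score test against the component's own history stands
    in for the paper's outlier detection, which the abstract does not specify.
    """
    events = []
    for component, samples in history.items():
        values = [value for _, value in samples]
        mu = mean(values)
        sigma = pstdev(values)
        if sigma == 0:
            # A component whose state never changes has no outliers.
            continue
        for timestamp, value in samples:
            if abs(value - mu) / sigma > z_threshold:
                events.append((timestamp, component))
    return sorted(events)


def failure_traces(failure_times, anomalies, window=60):
    """Group anomalous events that precede each failure event.

    Assumption: an anomaly is treated as related to a failure if it occurs
    at most `window` time units before it. This simple rule is a placeholder
    for the event-correlation step described in the abstract.
    """
    traces = {}
    for failure_time in failure_times:
        traces[failure_time] = [
            (timestamp, component)
            for timestamp, component in anomalies
            if failure_time - window <= timestamp <= failure_time
        ]
    return traces
```

Because each component is compared only with its own history, no administrator-defined threshold per component is needed, which matches the motivation stated above. The choice of `z_threshold` and `window` would have to be tuned for a real system.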
Acknowledgement
This research was supported by the National 863 Program (No. 2011AA01A203) and the National Natural Science Foundation (No. 61133004), P. R. China.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Meng, Y., Yu, L., Luan, Z., Qian, D., Xie, M., Du, Z. (2014). A Black-Box Approach for Detecting the Failure Traces. In: Yuan, Y., Wu, X., Lu, Y. (eds) Trustworthy Computing and Services. ISCTCS 2013. Communications in Computer and Information Science, vol 426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43908-1_32
DOI: https://doi.org/10.1007/978-3-662-43908-1_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-43907-4
Online ISBN: 978-3-662-43908-1
eBook Packages: Computer Science, Computer Science (R0)