Abstract
In modern computer systems, system event logs have long been the primary source for checking system status. As computer systems such as cloud computing systems become more complex, interactions among software and hardware components become increasingly frequent. These components generate an enormous volume of log data, including running reports and fault information. Analyzing this mass of data manually is a great challenge. In this paper, we implement a log management and analysis system that helps system administrators understand the real-time status of the entire system, classify logs into different fault types, and determine the root cause of faults. In addition, we improve the existing fault correlation analysis method based on the results of system log classification. We evaluate the log management and analysis system in a cloud computing environment. The results show that our system can classify fault logs effectively and automatically. Using the proposed system, administrators can easily detect the root cause of faults.
Copyright information
© 2014 IFIP International Federation for Information Processing
About this paper
Cite this paper
Zou, D., Qin, H., Jin, H., Qiang, W., Han, Z., Chen, X. (2014). Improving Log-Based Fault Diagnosis by Log Classification. In: Hsu, CH., Shi, X., Salapura, V. (eds) Network and Parallel Computing. NPC 2014. Lecture Notes in Computer Science, vol 8707. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44917-2_37
DOI: https://doi.org/10.1007/978-3-662-44917-2_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44916-5
Online ISBN: 978-3-662-44917-2
eBook Packages: Computer Science, Computer Science (R0)