LogGAN: A Sequence-Based Generative Adversarial Network for Anomaly Detection Based on System Logs

Xia, Bin; Yin, Junjie; Xu, Jian; Li, Yun

doi:10.1007/978-3-030-34637-9_5

LogGAN: A Sequence-Based Generative Adversarial Network for Anomaly Detection Based on System Logs

Bin Xia¹²,
Junjie Yin¹²,
Jian Xu¹³ &
…
Yun Li¹²

Conference paper
First Online: 06 December 2019

1101 Accesses
10 Citations

Part of the book series: Lecture Notes in Computer Science ((LNSC,volume 11933))

Abstract

System logs which trace system states and record valuable events comprise a significant component of any computer system in our daily life. There exist abundant information (i.e., normal and abnormal instances) involved in logs which assist administrators in diagnosing and maintaining the operation of the system. If diverse and complex anomalies (i.e., bugs and failures) cannot be detected and eliminated efficiently, the running workflows and transactions, even the system, would break down. Therefore, anomaly detection has become increasingly significant and attracted a lot of research attention. However, current approaches concentrate on the anomaly detection in a high-level granularity of logs (i.e., session) instead of detecting log-level anomalies which weakens the efficiency of responding anomalies and the diagnosis of system failures. To overcome the limitation, we propose a sequence-based generative adversarial network for anomaly detection based on system logs named LogGAN which detects log-level anomalies based on the patterns (i.e., the combination of latest logs). In addition, the generative adversarial network-based model relieves the effect of imbalance between normal and abnormal instances to improve the performance of capturing anomalies. To evaluate LogGAN, we conduct extensive experiments on two real-world datasets, and the experimental results show the effectiveness of our proposed approach to log-level anomaly detection.

This is a preview of subscription content, log in via an institution.

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Notes

1.
Learned event embedding is used to demonstrate each event.
2.
In D, we cast the combination \(c_i\) as the input of LSTM and LSTM directly outputs the hidden layer without any manipulation. Then, we concatenate the \(m{-}\)dimensional vector with the hidden layer as an input of a two-layer full Connected neural network which outputs whether the \(m{-}\)dimensional vector is real or fake as a binary classification.

References

Bodik, P., Goldszmidt, M., Fox, A., Woodard, D.B., Andersen, H.: Fingerprinting the datacenter: automated classification of performance crises. In: Proceedings of the 5th European Conference on Computer Systems, pp. 111–124. ACM (2010)
Google Scholar
Chae, D.K., Kang, J.S., Kim, S.W., Lee, J.T.: CFGAN: a generic collaborative filtering framework based on generative adversarial networks. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pp. 137–146. ACM (2018)
Google Scholar
Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: a survey. ACM Comput. Surv. (CSUR) 41(3), 15 (2009)
Article Google Scholar
Chawla, S., Sun, P.: SLOM: a new measure for local spatial outliers. Knowl. Inf. Syst. 9(4), 412–429 (2006)
Article Google Scholar
Chen, M., Zheng, A.X., Lloyd, J., Jordan, M.I., Brewer, E.: Failure diagnosis using decision trees. In: International Conference on Autonomic Computing. Proceedings, pp. 36–43. IEEE (2004)
Google Scholar
Du, M., Li, F., Zheng, G., Srikumar, V.: DeepLog: anomaly detection and diagnosis from system logs through deep learning. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 1285–1298. ACM (2017)
Google Scholar
Goodfellow, I.J., et al.: Generative adversarial nets. In: Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, Montreal, Quebec, Canada, 8–13 December 2014, pp. 2672–2680 (2014). http://papers.nips.cc/paper/5423-generative-adversarial-nets
Guo, S., Liu, Z., Chen, W., Li, T.: Event extraction from streaming system logs. In: Information Science and Applications 2018 - ICISA 2018, Hong Kong, China, 25–27th June 2018, pp. 465–474 (2018). https://doi.org/10.1007/978-981-13-1056-0_47
Google Scholar
Li, T., et al.: FIU-Miner (a fast, integrated, and user-friendly system for data mining) and its applications. Knowl. Inf. Syst. 52(2), 411–443 (2017)
Article Google Scholar
Liang, Y., Zhang, Y., Xiong, H., Sahoo, R.: Failure prediction in IBM BlueGene/L event logs. In: Seventh IEEE International Conference on Data Mining (ICDM 2007), pp. 583–588. IEEE (2007)
Google Scholar
Lin, Q., Zhang, H., Lou, J.G., Zhang, Y., Chen, X.: Log clustering based problem identification for online service systems. In: Proceedings of the 38th International Conference on Software Engineering Companion, pp. 102–111. ACM (2016)
Google Scholar
Liu, F.T., Ting, K.M., Zhou, Z.H.: Isolation forest. In: 2008 Eighth IEEE International Conference on Data Mining, pp. 413–422. IEEE (2008)
Google Scholar
Lou, J.G., Fu, Q., Yang, S., Xu, Y., Li, J.: Mining invariants from console logs for system problem detection. In: USENIX Annual Technical Conference, pp. 1–14 (2010)
Google Scholar
Sun, P., Chawla, S.: On local spatial outliers. In: Fourth IEEE International Conference on Data Mining (ICDM 2004), pp. 209–216. IEEE (2004)
Google Scholar
Tang, L., Li, T., Perng, C.S.: LogSig: generating system events from raw textual logs. In: Proceedings of the 20th ACM International Conference on Information and Knowledge Management, pp. 785–794. ACM (2011)
Google Scholar
Tuor, A.R., Baerwolf, R., Knowles, N., Hutchinson, B., Nichols, N., Jasper, R.: Recurrent neural network language models for open vocabulary event-level cyber anomaly detection. In: Workshops at the Thirty-Second AAAI Conference on Artificial Intelligence (2018)
Google Scholar
Wang, J., et al.: IRGAN: a minimax game for unifying generative and discriminative information retrieval models. In: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 515–524. ACM (2017)
Google Scholar
Xia, B., Li, T., Zhou, Q.F., Li, Q., Zhang, H.: An effective classification-based framework for predicting cloud capacity demand in cloud services. IEEE Trans. Serv. Comput. (2018)
Google Scholar
Xu, W., Huang, L., Fox, A., Patterson, D., Jordan, M.I.: Detecting large-scale system problems by mining console logs. In: Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, pp. 117–132. ACM (2009)
Google Scholar
Zhang, J., Wang, H.: Detecting outlying subspaces for high-dimensional data: the new task, algorithms, and performance. Knowl. Inf. Syst. 10(3), 333–355 (2006)
Article MathSciNet Google Scholar
Zhu, J., et al.: Tools and benchmarks for automated log parsing. CoRR abs/1811.03509 (2018). http://arxiv.org/abs/1811.03509

Download references

Acknowledgment

This work was supported by the National Natural Science Foundation of China under Grant No. 61802205, 61872186, and 61772284, the Natural Science Research Project of Jiangsu Province under Grant 18KJB520037, and the research funds of NJUPT under Grant NY218116.

Author information

Authors and Affiliations

Jiangsu Key Laboratory of Big Data Security and Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing, China
Bin Xia, Junjie Yin & Yun Li
School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
Jian Xu

Authors

Bin Xia
View author publications
You can also search for this author in PubMed Google Scholar
Junjie Yin
View author publications
You can also search for this author in PubMed Google Scholar
Jian Xu
View author publications
You can also search for this author in PubMed Google Scholar
Yun Li
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Yun Li .

Editor information

Editors and Affiliations

Institute of Information Engineering and School of Cybersecurity, University of Chinese Academy of Sciences, Beijing, China
Feng Liu
Nanjing University of Posts and Telecommunications, Nanning, China
Jia Xu
Department of Computer Science, The University of Texas at San Antonio, San Antonio, TX, USA
Shouhuai Xu
Google and Columbia University, New York, NY, USA
Moti Yung

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Xia, B., Yin, J., Xu, J., Li, Y. (2019). LogGAN: A Sequence-Based Generative Adversarial Network for Anomaly Detection Based on System Logs. In: Liu, F., Xu, J., Xu, S., Yung, M. (eds) Science of Cyber Security. SciSec 2019. Lecture Notes in Computer Science(), vol 11933. Springer, Cham. https://doi.org/10.1007/978-3-030-34637-9_5

Download citation

DOI: https://doi.org/10.1007/978-3-030-34637-9_5
Published: 06 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34636-2
Online ISBN: 978-3-030-34637-9
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics