Feature Extraction in Security Analytics: Reducing Data Complexity with Apache Spark

Sisiaridis, Dimitrios; Markowitch, Olivier

doi:10.1007/978-3-319-76451-1_29


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 733)

Included in the following conference series:

International Conference on Security with Intelligent Computing and Big-data Services

Abstract

Feature extraction is the first task in pre-processing input logs when machine learning is used to detect cybersecurity threats and attacks. When the data to be analysed is heterogeneous and derived from different sources, this task is time-consuming and difficult to manage efficiently. In this paper we present an approach to feature extraction for security analytics of heterogeneous data derived from different network sensors. The approach is implemented in Apache Spark, using its Python API, PySpark.

Notes

1. The term flattening refers to expressing data in two dimensions (2-D).
2. The kill chain model [2] is an intelligence-driven, threat-focused approach to studying intrusions from the adversary's perspective. Its fundamental element is the indicator, which is any piece of information that can describe a threat or an attack. Indicators can be atomic, such as IP or email addresses; computed, such as hash values or regular expressions; or behavioural, which are collections of computed and atomic indicators, such as statements.
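
The two notes above can be illustrated with a minimal sketch in plain Python (not PySpark, and not the authors' implementation). It assumes that flattening means turning a nested log record into a single flat row, which is only one reading of "data expressed in 2-D". The sample event, its field names, the regular expressions, and the helper functions `flatten`, `atomic_indicators`, and `computed_indicator` are all hypothetical and chosen for illustration.

```python
import hashlib
import re

# Hypothetical nested log event; the field names are illustrative only.
event = {
    "ts": "2017-01-01T00:00:00Z",
    "src": {"ip": "10.0.0.5", "port": 4444},
    "mail": {"from": "alice@example.com"},
    "payload": "GET /index.html",
}


def flatten(record, prefix=""):
    """Flatten nested dicts into one level, joining keys with dots."""
    row = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            row.update(flatten(value, name))
        else:
            row[name] = value
    return row


# Deliberately loose patterns: good enough for a sketch, not for validation.
IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def atomic_indicators(row):
    """Keep string values that look like IP or email addresses."""
    return {
        key: value
        for key, value in row.items()
        if isinstance(value, str) and (IPV4.match(value) or EMAIL.match(value))
    }


def computed_indicator(text):
    """A computed indicator: here, the SHA-256 hash of a payload string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


row = flatten(event)
atomic = atomic_indicators(row)
computed = {"payload.sha256": computed_indicator(row["payload"])}

# A behavioural indicator is modelled here as a collection of
# atomic and computed indicators.
behavioural = {**atomic, **computed}
```

After running the sketch, `row` holds one flat record with dotted column names such as `src.ip`, and `behavioural` groups the atomic and computed indicators derived from it. In a Spark setting the same idea would apply per row of a DataFrame, but that mapping is left out here.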

References

1. Bird, S., Klein, E., Loper, E.: Natural Language Processing with Python. O’Reilly Media Inc. (2009)
2. Hutchins, E.M., Cloppert, M.J., Amin, R.M.: Intelligence-driven computer network defense informed by analysis of adversary campaigns and intrusion kill chains. In: Ryan, J. (ed.) Leading Issues in Information Warfare and Security Research, vol. 1, p. 80. Academic Publishing International Ltd., Reading (2011)
3. Kalyan, V., Ignacio, A., Alfredo, C.-I., Vamsi, K., Costas, B., Ke, L.: AI2: Training a big data machine to defend. In: IEEE International Conference on Big Data Security, New York, NY, USA, June 2016
4. Shyu, M.-L., Huang, Z., Luo, H.: Efficient mining and detection of sequential intrusion patterns for network intrusion detection systems. In: Yu, P.S., Tsai, J.J.P. (eds.) Machine Learning in Cyber Trust, pp. 133–154. Springer, Boston (2009)
5. Sisiaridis, D., Carcillo, F., Markowitch, O.: A framework for threat detection in communication systems. In: Proceedings of the 20th Pan-Hellenic Conference on Informatics, pp. 68:1–68:6. ACM (2016)
6. Sisiaridis, D., Kuchta, V., Markowitch, O.: A categorical approach in handling event-ordering in distributed systems. In: Parallel and Distributed Systems (ICPADS), pp. 1145–1150. IEEE (2016)

Author information

Authors and Affiliations

QualSec Group, Département d’Informatique, Université Libre de Bruxelles, Brussels, Belgium
Dimitrios Sisiaridis & Olivier Markowitch

Corresponding author

Correspondence to Dimitrios Sisiaridis.

Editor information

Editors and Affiliations

Department of Computer Science and Information Engineering, National Dong Hwa University, Hualien, Taiwan
Sheng-Lung Peng
Department of Information Management, Central Police University, Taoyuan, Taiwan
Shiuh-Jeng Wang
Department of Automation and Applied Informatics, Faculty of Engineering, Aurel Vlaicu University of Arad, Arad, Romania
Valentina Emilia Balas
School of Computer Technology, Jingzhou, China
Ming Zhao

About this paper

Cite this paper

Sisiaridis, D., Markowitch, O. (2018). Feature Extraction in Security Analytics: Reducing Data Complexity with Apache Spark. In: Peng, SL., Wang, SJ., Balas, V., Zhao, M. (eds) Security with Intelligent Computing and Big-data Services. SICBS 2017. Advances in Intelligent Systems and Computing, vol 733. Springer, Cham. https://doi.org/10.1007/978-3-319-76451-1_29

DOI: https://doi.org/10.1007/978-3-319-76451-1_29
Published: 29 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76450-4
Online ISBN: 978-3-319-76451-1
eBook Packages: Engineering, Engineering (R0)
