Abstract
Feature relevance is often investigated in classification problems to determine the contribution of each feature, especially when a dataset comprises numerous features. Feature selection, or variable selection, aids in creating an accurate predictive model because fewer attributes tend to reduce computational complexity, thereby promising better performance. Machine learning, a preferred approach to intrusion detection, depends on the appropriate use of features to improve the attack detection rate. This study uses a recent benchmark dataset, UNSW NB-15, which comprises five classes of features. The work attempts to demonstrate the relevance of each feature class along with the importance of various combinations of feature classes. In the course of this analysis, 31 possible combinations of feature classes were considered and their relevance was examined. Empirical results on feature reduction show that an accuracy of 97% could be obtained using only 23 features. All experiments were conducted on Microsoft Azure Machine Learning Studio (MAMLS), a scalable machine learning platform. A two-class neural network was used to perform the classification task. Since UNSW NB-15 is a contemporary dataset with modern attack vectors, the research community is still exploring its various facets. This article thus intends to offer valuable insights into the significance of the features in the UNSW NB-15 dataset.
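The figure of 31 combinations is consistent with taking every non-empty subset of the five feature classes, since 2^5 - 1 = 31. The abstract does not state how the combinations were formed, so this reading is an assumption. A minimal Python sketch of that enumeration follows; the class labels are placeholders, not the dataset's own names.

```python
from itertools import combinations

# Placeholder labels for the five feature classes (hypothetical names,
# not taken from the abstract).
FEATURE_CLASSES = ["class_1", "class_2", "class_3", "class_4", "class_5"]


def non_empty_combinations(items):
    """Return every non-empty subset of items, smallest subsets first."""
    subsets = []
    for size in range(1, len(items) + 1):
        # combinations() yields tuples of the given size in input order.
        subsets.extend(combinations(items, size))
    return subsets


all_combos = non_empty_combinations(FEATURE_CLASSES)
print(len(all_combos))
```

Each tuple in `all_combos` could then select the columns belonging to those classes before training and scoring a classifier, which is one plausible way to compare the relevance of class combinations.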
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Rajagopal, S., Hareesha, K.S., Kundapur, P.P. (2020). Feature Relevance Analysis and Feature Reduction of UNSW NB-15 Using Neural Networks on MAMLS. In: Pati, B., Panigrahi, C., Buyya, R., Li, KC. (eds) Advanced Computing and Intelligent Engineering. Advances in Intelligent Systems and Computing, vol 1082. Springer, Singapore. https://doi.org/10.1007/978-981-15-1081-6_27
DOI: https://doi.org/10.1007/978-981-15-1081-6_27
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1080-9
Online ISBN: 978-981-15-1081-6
eBook Packages: Engineering, Engineering (R0)