A Novel Hybrid Data Reduction Strategy and Its Application to Intrusion Detection
The presence of useless information and the huge amount of data generated by telecommunication services can affect the efficiency of traditional Intrusion Detection Systems (IDSs). This fact encourages the development of data preprocessing strategies for improving the efficiency of IDSs. However, improving efficiency through data reduction without affecting the quality of the reduced dataset (i.e., while preserving accuracy during the classification process) remains a challenge. Moreover, the runtime of commonly used strategies is usually high. In this paper, a novel hybrid data reduction strategy is presented. The proposed strategy reduces the number of features and instances in the training collection without greatly affecting the quality of the reduced dataset. In addition, it improves the efficiency of the classification process. Finally, our proposal compares favorably with other hybrid data reduction strategies.
Keywords: Data mining, Data reduction, Instance selection, Feature selection