Abstract
Data mining applications explore large amounts of heterogeneous data in search of consistent information. In such a challenging context, empirical learning methods aim to optimize prediction on unseen data, and an accurate estimate of the generalization error is of paramount importance. The paper shows that the theoretical formulation based on the Vapnik-Chervonenkis dimension (d_vc) can be of practical interest when applied to clustering methods for data-mining applications. The presented research adopts the K-Winner Machine (KWM) as a clustering-based, semi-supervised classifier; in addition to its useful theoretical properties, the model provides a general criterion for evaluating the applicability of Vapnik's generalization predictions in data mining. The general approach is verified experimentally on the practical problem of detecting intrusions in computer networks. Empirical results show that the KWM model can effectively support such a difficult classification task and combine unsupervised and supervised learning.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Decherchi, S., Gastaldo, P., Redi, J., Zunino, R. (2008). Non-stationary Data Mining: The Network Security Issue. In: Kůrková, V., Neruda, R., Koutník, J. (eds) Artificial Neural Networks - ICANN 2008. ICANN 2008. Lecture Notes in Computer Science, vol 5164. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87559-8_4
DOI: https://doi.org/10.1007/978-3-540-87559-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87558-1
Online ISBN: 978-3-540-87559-8
eBook Packages: Computer Science, Computer Science (R0)