Abstract
Data has become increasingly important to individuals, organizations, and companies, so safeguarding sensitive data in relational databases has become a critical issue. Despite traditional security mechanisms, however, attacks directed at databases still occur. An intrusion detection system (IDS) designed specifically for the database, one that can provide protection from all possible malicious users, is therefore necessary. In this paper, we present a random forests (RF) method with weighted voting for the task of anomaly detection. RF is a graph-based technique suitable for modeling SQL queries, and weighted voting enhances its capabilities by balancing the voting impact of each tree. Experiments show that RF with weighted voting exhibits more consistent performance, as well as better error rates with an increasing number of trees, compared to conventional RF. Moreover, it outperforms the other state-of-the-art data mining algorithms evaluated in terms of false positive rate (0.076) and false negative rate (0.0028).
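To make the idea of weighted voting concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state how tree weights are derived, so the weights here are hypothetical placeholders (for instance, they could come from each tree's accuracy on held-out queries). The function names `weighted_vote`, `majority_vote`, and `error_rates`, and the labels `"normal"` and `"anomalous"`, are assumptions introduced for illustration only.

```python
from collections import defaultdict


def weighted_vote(tree_predictions, tree_weights):
    """Return the label whose supporting trees carry the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(tree_predictions, tree_weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def majority_vote(tree_predictions):
    """Conventional RF voting: every tree counts equally."""
    return weighted_vote(tree_predictions, [1.0] * len(tree_predictions))


def error_rates(y_true, y_pred, positive="anomalous"):
    """Return (false positive rate, false negative rate) for the positive label."""
    pairs = list(zip(y_true, y_pred))
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    negatives = sum(1 for t in y_true if t != positive)
    positives = sum(1 for t in y_true if t == positive)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


# Hypothetical votes from three trees on a single query, with
# hypothetical per-tree weights (not values from the paper).
votes = ["normal", "normal", "anomalous"]
weights = [0.3, 0.3, 0.9]

print(majority_vote(votes))           # equal votes: "normal" wins 2 to 1
print(weighted_vote(votes, weights))  # weighted: "anomalous" wins 0.9 to 0.6
```

In this toy case the single highly weighted tree overrides two weakly weighted ones, which is the kind of rebalancing of per-tree voting impact the abstract describes. The `error_rates` helper shows how the two reported metrics would be computed from true and predicted labels.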
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ronao, C.A., Cho, SB. (2015). Random Forests with Weighted Voting for Anomalous Query Access Detection in Relational Databases. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L., Zurada, J. (eds) Artificial Intelligence and Soft Computing. ICAISC 2015. Lecture Notes in Computer Science, vol 9120. Springer, Cham. https://doi.org/10.1007/978-3-319-19369-4_4
DOI: https://doi.org/10.1007/978-3-319-19369-4_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19368-7
Online ISBN: 978-3-319-19369-4
eBook Packages: Computer Science (R0)