Bagging-RandomMiner: a one-class classifier for file access-based masquerade detection

  • Special Issue Paper
  • Machine Vision and Applications

Abstract

Dependence on personal computers has required the development of security mechanisms to protect the information stored in these devices. There have been different approaches to profile user behavior to protect information from a masquerade attack; one such recent approach is based on user file-access patterns. In this paper, we propose a novel classification ensemble for file access-based masquerade detection. We have successfully validated the hypothesis that a one-class classification approach to file access-based masquerade detection outperforms a multi-class one. In particular, our proposed one-class classifier significantly outperforms several state-of-the-art multi-class classifiers. Our results indicate that one-class classification attains better classification results, even when unknown attacks arise. Additionally, we introduce three new repositories of datasets for the identification of the three main types of attacks reported in the literature, where each training dataset contains no object belonging to the type of attack to be identified. These repositories can be used for testing future classifiers, simulating attacks carried out in a real scenario.
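To make the one-class setting concrete, the following is a minimal sketch, assuming numeric feature vectors derived from file-access logs. It is not the paper's Bagging-RandomMiner algorithm, whose details are not given in this abstract. The class name `BaggedOneClassSketch`, its parameters, and the distance-based similarity score are all hypothetical choices made for illustration. The only property it shares with the approach described above is that training uses the genuine user's objects alone, so no attack of any type has to be seen beforehand.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class BaggedOneClassSketch:
    """Hypothetical one-class ensemble; an illustration, not the authors' method."""

    def __init__(self, n_estimators=25, sample_fraction=0.2, seed=0):
        self.n_estimators = n_estimators
        self.sample_fraction = sample_fraction
        self.rng = random.Random(seed)
        self.members = []  # one (sampled objects, distance scale) pair per member

    def fit(self, user_objects):
        """Train on the genuine user's objects only; no attack data is used."""
        if len(user_objects) < 2:
            raise ValueError("need at least two training objects")
        size = max(2, int(len(user_objects) * self.sample_fraction))
        size = min(size, len(user_objects))
        self.members = []
        for _ in range(self.n_estimators):
            sample = self.rng.sample(user_objects, size)
            # Use the mean pairwise distance inside the sample as a length scale.
            dists = [euclidean(a, b)
                     for i, a in enumerate(sample) for b in sample[i + 1:]]
            scale = (sum(dists) / len(dists)) or 1.0
            self.members.append((sample, scale))
        return self

    def score(self, obj):
        """Average similarity in (0, 1]; higher means more like the genuine user."""
        total = 0.0
        for sample, scale in self.members:
            nearest = min(euclidean(obj, s) for s in sample)
            total += math.exp(-nearest / scale)
        return total / len(self.members)


# Synthetic stand-in for one user's behavior: points scattered around the origin.
data_rng = random.Random(1)
genuine = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(50)]

model = BaggedOneClassSketch().fit(genuine)
genuine_score = model.score((0.1, -0.2))  # resembles the training data
attack_score = model.score((6.0, 6.0))    # far from anything seen in training
print(genuine_score, attack_score)
```

An object would be flagged as a masquerade when its score falls below a threshold chosen on genuine data. Because the model never needs attack examples, the same trained detector can be evaluated against attack types that were withheld from training, which is the scenario the repositories described above are meant to simulate.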

Notes

  1. Available at www.schonlau.net/intrusion.html.

  2. http://homepage.cem.itesm.mx/raulm/wuil-ds/.

  3. drive.google.com/file/d/0B9jNhJSXlx7GNTJKUHFqb0Rtdjg/view.

Abbreviations

AUC: Area under the receiver operating characteristic curve
CD: Critical difference
FP: False positive
FPrate: False positive detection rate
FS: File system
FSN: File-system navigation
GUI: Graphical user interface
HCI: Human–computer interaction
MDS: Masquerade detection system
MTeS: Masquerade testing set
MTrS: Masquerade training set
MRO: Most representative object
PC: Personal computer
PCA: Principal component analysis
ROC: Receiver operating characteristic
TeS: Testing set
TP: True positive
TPrate: True positive detection rate
TrS: Training set
UTeS: User testing set
UTrS: User training set
ZFP: Zero-false positives
1vA: One versus all
5-FCV: Fivefold cross-validation

Acknowledgements

We wish to express our gratitude to the members of the GIEE-ML group at Tecnológico de Monterrey for providing useful suggestions and advice on earlier versions of this paper.

Author information

Corresponding author

Correspondence to Miguel Angel Medina-Pérez.

About this article

Cite this article

Camiña, J.B., Medina-Pérez, M.A., Monroy, R. et al. Bagging-RandomMiner: a one-class classifier for file access-based masquerade detection. Machine Vision and Applications 30, 959–974 (2019). https://doi.org/10.1007/s00138-018-0957-4

  • DOI: https://doi.org/10.1007/s00138-018-0957-4
