
Deployable Classifiers for Malware Detection

  • Conference paper
Information Systems, Technology and Management (ICISTM 2012)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 285)


Abstract

The application of machine learning methods to malware detection has made it possible to generate a large number of classifiers that use different kinds of features and learning algorithms. A straightforward way to select the best classifier is to pick the one with the best holdout or cross-validation performance. However, cross-validation or holdout gives only a point estimate of generalization performance, and that estimate varies with the training data and the learning algorithm parameters. We propose a classifier selection criterion that considers bounds on the performance estimates, using confidence intervals in conjunction with a performance target. Performance targets are commonly used in practice for classifier selection, particularly in security applications like malware detection. The proposed criterion, called deployability, selects a classifier as deployable if the cost target lies within or above the classifier’s expected cost confidence interval. We conducted an experiment with machine-learning-based malware detectors to evaluate the criterion. We found that, for a given confidence level and cost target, even the classifier with the least expected cost may not be deployable, while classifiers with higher expected cost may be deployable.
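The deployability rule stated in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the confidence interval is constructed, so the normal-approximation interval over per-fold costs used here is an assumption, as are the function names and the per-fold cost values, which are hypothetical and not taken from the paper.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def cost_confidence_interval(costs, confidence=0.95):
    """Return (lower, upper) bounds on the expected cost.

    Assumption: a normal-approximation interval over per-fold costs.
    The paper may construct its interval differently.
    """
    n = len(costs)
    if n < 2:
        raise ValueError("need at least two cost estimates")
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(costs) / sqrt(n)
    centre = mean(costs)
    return centre - half_width, centre + half_width


def is_deployable(costs, cost_target, confidence=0.95):
    """Deployable if the target lies within or above the interval,
    i.e. the target is not below the interval's lower bound."""
    lower, _upper = cost_confidence_interval(costs, confidence)
    return cost_target >= lower


# Hypothetical per-fold costs for two classifiers (illustrative only).
stable_costs = [0.118, 0.120, 0.122, 0.119, 0.121]   # lower mean, tight spread
variable_costs = [0.08, 0.22, 0.10, 0.20, 0.15]      # higher mean, wide spread
cost_target = 0.10

stable_ok = is_deployable(stable_costs, cost_target)
variable_ok = is_deployable(variable_costs, cost_target)
```

With these made-up numbers, the classifier with the lower mean cost has an interval that sits entirely above the target and is rejected, while the classifier with the higher mean cost has a wider interval that reaches down to the target and is accepted. This mirrors the kind of outcome the abstract reports, under the stated assumptions.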




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Singh, A., Singh, S., Walenstein, A., Lakhotia, A. (2012). Deployable Classifiers for Malware Detection. In: Dua, S., Gangopadhyay, A., Thulasiraman, P., Straccia, U., Shepherd, M., Stein, B. (eds) Information Systems, Technology and Management. ICISTM 2012. Communications in Computer and Information Science, vol 285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29166-1_34


  • DOI: https://doi.org/10.1007/978-3-642-29166-1_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29165-4

  • Online ISBN: 978-3-642-29166-1

  • eBook Packages: Computer Science, Computer Science (R0)
