On Robust Malware Classifiers by Verifying Unwanted Behaviours

Chen, Wei; Aspinall, David; Gordon, Andrew D.; Sutton, Charles; Muttik, Igor

doi:10.1007/978-3-319-33693-0_21

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 9681)

Included in the following conference series:

International Conference on Integrated Formal Methods

Abstract

Machine-learning-based Android malware classifiers perform poorly at detecting new malware, particularly when they take API calls and permissions as input features, which are the best-performing features known so far. This is mainly because signature-based features are very sensitive to the training data and cannot capture the general behaviours of identified malware. To improve the robustness of classifiers, we study the problem of learning and verifying unwanted behaviours abstracted as automata. These are common patterns shared by malware instances but rarely seen in benign applications, e.g., intercepting and forwarding incoming SMS messages. We show that taking the verification results against unwanted behaviours as input features dramatically improves classification performance on new malware. In particular, precision and recall are respectively 8 and 51 points better than those obtained using API calls and permissions, measured against industrial datasets collected across several years. Our approach integrates formal methods, machine learning and text mining techniques. It is the first to automatically generate unwanted behaviours for Android malware detection. We also demonstrate unwanted behaviours constructed for well-known malware families; they compare well to those described in human-authored descriptions of these families.
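
As a rough illustration of the idea in the abstract, the sketch below turns "does this app exhibit unwanted behaviour X?" into a binary feature vector for a classifier. It is not the authors' implementation. It assumes that an app's behaviour is available as a finite set of event traces and that each unwanted behaviour is a small deterministic automaton which ignores events it has no transition for. The event names (`SMS_RECEIVED`, `SEND_SMS`, and so on) and the two example automata are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence, Set, Tuple

State = int
Event = str


class BehaviourAutomaton:
    """A deterministic finite automaton over abstract app events.

    Assumption for this sketch: an event with no listed transition leaves
    the state unchanged, so the pattern is matched as a (not necessarily
    contiguous) subsequence of a trace.
    """

    def __init__(
        self,
        transitions: Dict[Tuple[State, Event], State],
        start: State,
        accepting: Set[State],
    ) -> None:
        self.transitions = transitions
        self.start = start
        self.accepting = accepting

    def accepts(self, trace: Sequence[Event]) -> bool:
        """Return True if the trace reaches an accepting state at any point."""
        state = self.start
        for event in trace:
            state = self.transitions.get((state, event), state)
            if state in self.accepting:
                return True
        return False


def verification_features(
    app_traces: Iterable[Sequence[Event]],
    behaviours: Sequence[BehaviourAutomaton],
) -> List[int]:
    """One 0/1 feature per unwanted behaviour: 1 if any trace exhibits it."""
    traces = list(app_traces)
    return [int(any(b.accepts(t) for t in traces)) for b in behaviours]


# Hypothetical unwanted behaviour: receive an SMS, then send one (forwarding).
forward_sms = BehaviourAutomaton(
    transitions={(0, "SMS_RECEIVED"): 1, (1, "SEND_SMS"): 2},
    start=0,
    accepting={2},
)

# Hypothetical unwanted behaviour: read the location, then post it over HTTP.
leak_location = BehaviourAutomaton(
    transitions={(0, "GET_LOCATION"): 1, (1, "HTTP_POST"): 2},
    start=0,
    accepting={2},
)

BEHAVIOURS = [forward_sms, leak_location]

# Made-up traces for two apps, used only to exercise the functions above.
suspicious_app = [["SMS_RECEIVED", "READ_CONTACTS", "SEND_SMS"]]
benign_app = [["SEND_SMS", "SMS_RECEIVED"], ["GET_LOCATION"]]

suspicious_features = verification_features(suspicious_app, BEHAVIOURS)
benign_features = verification_features(benign_app, BEHAVIOURS)
```

The resulting vectors could be fed to any standard classifier in place of, or alongside, raw API-call and permission features. A real verifier would check an automaton against a model of the whole app rather than a handful of sampled traces.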

Author information

Authors and Affiliations

University of Edinburgh, Edinburgh, UK
Wei Chen, David Aspinall, Andrew D. Gordon & Charles Sutton
Microsoft Research Cambridge, Cambridge, UK
Andrew D. Gordon
Intel Security, Aylesbury, UK
Igor Muttik

Corresponding author

Correspondence to Wei Chen.

Editor information

Editors and Affiliations

RWTH Aachen University, Aachen, Germany
Erika Ábrahám
University of Twente, Enschede, Overijssel, The Netherlands
Marieke Huisman

About this paper

Cite this paper

Chen, W., Aspinall, D., Gordon, A.D., Sutton, C., Muttik, I. (2016). On Robust Malware Classifiers by Verifying Unwanted Behaviours. In: Ábrahám, E., Huisman, M. (eds) Integrated Formal Methods. IFM 2016. Lecture Notes in Computer Science, vol 9681. Springer, Cham. https://doi.org/10.1007/978-3-319-33693-0_21

DOI: https://doi.org/10.1007/978-3-319-33693-0_21
Published: 24 May 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-33692-3
Online ISBN: 978-3-319-33693-0
eBook Packages: Computer Science, Computer Science (R0)
