Android malware detection based on system call sequences and LSTM

Multimedia Tools and Applications

Abstract

As Android-based mobile devices become increasingly popular, malware detection on Android has become crucial. In this paper, we propose a novel detection method based on deep learning to distinguish malware from trusted applications. Because system call sequences carry semantic information much as natural language does, we treat each system call sequence as a sentence and construct a classifier based on the Long Short-Term Memory (LSTM) language model. The classifier first trains two LSTM models, one on system call sequences from malware and one on sequences from benign applications. It then uses these models to compute two similarity scores for the application under analysis. Finally, it labels the application as malicious or trusted according to which score is greater. Thorough experiments show that our approach is efficient and reaches a high recall of 96.6% with a low false positive rate of 9.3%, outperforming the other methods it is compared with.
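The decision rule described in the abstract (two per-class language models, two scores, pick the greater) can be illustrated with a minimal sketch. This is not the authors' implementation: to stay self-contained it swaps the LSTM for a simple add-one-smoothed bigram language model, and it assumes the similarity score is the average log-probability per system call. The class `BigramLM`, the function `classify`, and the toy traces are all hypothetical names and data introduced only for illustration.

```python
import math
from collections import Counter

START = "<s>"  # hypothetical start-of-sequence marker


class BigramLM:
    """Add-one-smoothed bigram language model (a stand-in for an LSTM)."""

    def __init__(self, sequences, vocab):
        self.vocab_size = len(vocab)
        self.bigrams = Counter()   # counts of (previous call, current call)
        self.contexts = Counter()  # counts of each previous call
        for seq in sequences:
            prev = START
            for call in seq:
                self.bigrams[(prev, call)] += 1
                self.contexts[prev] += 1
                prev = call

    def score(self, seq):
        """Average log-probability per call; higher means more similar."""
        total, prev = 0.0, START
        for call in seq:
            num = self.bigrams[(prev, call)] + 1
            den = self.contexts[prev] + self.vocab_size
            total += math.log(num / den)
            prev = call
        return total / max(len(seq), 1)


def classify(seq, malware_lm, benign_lm):
    """Label a trace by whichever class model assigns it the greater score."""
    if malware_lm.score(seq) > benign_lm.score(seq):
        return "malware"
    return "benign"


# Toy system call traces, invented for this example only.
malware_traces = [
    ["open", "read", "socket", "connect", "sendto"],
    ["open", "read", "socket", "connect", "sendto", "close"],
]
benign_traces = [
    ["open", "read", "write", "close"],
    ["open", "read", "read", "write", "close"],
]
vocab = sorted({c for t in malware_traces + benign_traces for c in t})

malware_lm = BigramLM(malware_traces, vocab)
benign_lm = BigramLM(benign_traces, vocab)

print(classify(["open", "read", "socket", "connect", "sendto"], malware_lm, benign_lm))
print(classify(["open", "read", "write", "close"], malware_lm, benign_lm))
```

Training one model per class and comparing scores means neither model needs to see the other class's data; replacing `BigramLM` with an LSTM language model would change how each score is computed but not the comparison in `classify`.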

Acknowledgements

This work is supported by the NSFC projects (61375054, 61402255, 61202358), the National High-tech R&D Program of China (2015AA016102), the Guangdong Natural Science Foundation (2015A030310492, 2014A030313745), the RD Program of Shenzhen (JCYJ20150630170146831, JCYJ20160301152145171, JCYJ20160531174259309, JSGG20150512162853495, Shenfagai [2015] 986), and the Cross Fund of the Graduate School at Shenzhen, Tsinghua University (JC20140001).

Author information

Corresponding author

Correspondence to Guangwu Hu.

About this article

Cite this article

Xiao, X., Zhang, S., Mercaldo, F. et al. Android malware detection based on system call sequences and LSTM. Multimed Tools Appl 78, 3979–3999 (2019). https://doi.org/10.1007/s11042-017-5104-0

  • DOI: https://doi.org/10.1007/s11042-017-5104-0
