Feature-Based Semi-supervised Learning to Detect Malware from Android

Automated Software Engineering: A Deep Learning-Based Approach

Part of the book series: Learning and Analytics in Intelligent Systems (LAIS, volume 8)

Abstract

Malware is as potentially harmful to the Android operating system as it is to desktop operating systems. With the exponential growth in Android devices, we observe that Android malware is also increasing day by day and poses a serious security threat to users’ privacy. Previously developed frameworks and antivirus software are capable of detecting “known” malware specifically. In previous studies, researchers have applied various supervised machine learning approaches to detect “unknown” malware, but these approaches are far from practical because they need a large amount of labeled data for training. In this work, we present a unique procedure for detecting malware that employs a renowned semi-supervised learning technique. The approach presented in this chapter selects the best features by applying feature subset selection methods and uses them to establish a malware detection model. We performed an empirical validation to demonstrate that semi-supervised machine learning techniques sustain accuracy rates as high as those of the supervised machine learning techniques used in the literature.
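To make the idea of learning from a few labeled apps and many unlabeled ones concrete, the following is a minimal sketch only. The abstract does not name the semi-supervised technique, so this example assumes graph-based label spreading (the iteration F ← αSF + (1 − α)Y over a normalized affinity matrix S) as a stand-in. The feature vectors, the permission names in the comments, and the parameters `alpha`, `gamma`, and `iters` are all hypothetical and not taken from the chapter. The feature subset selection step is omitted.

```python
import math

def rbf_affinity(X, gamma):
    """Pairwise RBF similarities with a zero diagonal."""
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
                W[i][j] = math.exp(-gamma * d2)
    return W

def label_spreading(X, y, alpha=0.8, gamma=1.0, iters=30):
    """Spread labels over a similarity graph.

    y uses 1 = malware, 0 = benign, -1 = unlabeled.
    Returns a predicted 0/1 label for every sample.
    """
    n = len(X)
    W = rbf_affinity(X, gamma)
    deg = [sum(row) for row in W]
    # Symmetric normalization: S = D^(-1/2) W D^(-1/2)
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    # One-hot rows for labeled samples, all-zero rows for unlabeled ones
    Y = [[1.0 if y[i] == c else 0.0 for c in (0, 1)] for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [[alpha * sum(S[i][k] * F[k][c] for k in range(n))
              + (1 - alpha) * Y[i][c] for c in (0, 1)]
             for i in range(n)]
    return [1 if f[1] > f[0] else 0 for f in F]

# Hypothetical binary features per app, e.g.
# [SEND_SMS, READ_CONTACTS, INTERNET, CAMERA] (illustrative only)
apps = [
    [1, 1, 1, 0],  # labeled malware
    [1, 1, 0, 0],  # unlabeled
    [1, 1, 1, 1],  # unlabeled
    [0, 0, 1, 1],  # labeled benign
    [0, 0, 1, 0],  # unlabeled
    [0, 0, 0, 1],  # unlabeled
]
labels = [1, -1, -1, 0, -1, -1]

predicted = label_spreading(apps, labels)
print(predicted)
```

Only two of the six apps carry a label here. The remaining four receive labels from their similarity to the labeled ones, which is the property that reduces the need for a large labeled training set.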

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Mahindru, A., Sangal, A.L. (2020). Feature-Based Semi-supervised Learning to Detect Malware from Android. In: Automated Software Engineering: A Deep Learning-Based Approach. Learning and Analytics in Intelligent Systems, vol 8. Springer, Cham. https://doi.org/10.1007/978-3-030-38006-9_6
