TriDroid: a triage and classification framework for fast detection of mobile threats in android markets

Abstract

The Android platform is heavily targeted by malware developers, who aim to infect as many mobile devices as possible by uploading malicious applications to different app markets. To keep the Android ecosystem healthy, app markets check newly submitted apps for maliciousness. These markets need to (a) correctly detect malicious apps, and (b) speed up detection of the most likely dangerous applications among an overwhelming flow of submitted apps, so that potential damage is mitigated quickly. To address these challenges, we propose TriDroid, a market-scale triage and classification system for Android apps. TriDroid prioritizes app analysis according to risk likelihood. To this end, we categorize submitted apps as botnet, general malware, or benign. TriDroid starts with (1) a Triage process, which applies a fast, coarse-grained, and less accurate analysis to a continuous stream of submitted apps to identify each app's queue in a three-class priority queuing system. Then, (2) the Classification process extracts fine-grained static features from the apps in the priority queue and applies three-class machine learning classifiers to confirm, with high accuracy, the classification decisions of the triage process. In addition to the priority queuing model, we also propose a multi-server queuing model in which the classification of each app category runs on a different server. Experiments on a dataset with more than 24K malicious and 3K benign applications show that the priority model offers a trade-off between waiting time and processing overhead, as it requires only one server compared to the multi-server model. It also successfully prioritizes the analysis of malicious apps, which yields a shorter waiting time for dangerous applications than the FIFO policy.
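
The single-server priority model described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `TriageQueue`, the helper `service_order`, the label strings, and the assumed ordering (botnet first, then general malware, then benign) are all hypothetical choices made for illustration. The sketch only shows how a triage label could decide the order in which apps reach the fine-grained classifier, with arrival order kept within each class.

```python
import heapq
from itertools import count

# Assumed priority order (lower value = served first). The abstract names the
# three classes; this exact ordering is an illustrative assumption.
PRIORITY = {"botnet": 0, "malware": 1, "benign": 2}


class TriageQueue:
    """Single-server, three-class priority queue; FIFO within each class."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker that preserves arrival order

    def submit(self, app_id, triage_label):
        # The triage label (a coarse, possibly wrong guess) selects the class.
        entry = (PRIORITY[triage_label], next(self._arrival), app_id)
        heapq.heappush(self._heap, entry)

    def next_app(self):
        # Pop the app that the classification stage should analyze next.
        _, _, app_id = heapq.heappop(self._heap)
        return app_id

    def __len__(self):
        return len(self._heap)


def service_order(stream):
    """Return the order in which a batch of (app_id, label) pairs is served."""
    queue = TriageQueue()
    for app_id, label in stream:
        queue.submit(app_id, label)
    return [queue.next_app() for _ in range(len(queue))]
```

Under a FIFO policy the apps would be served in submission order; with the priority queue, apps triaged as botnet or general malware move ahead of benign ones, which is the effect on waiting time that the abstract reports.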

Acknowledgements

The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding this work through research group No (RG-1439-021).

Author information

Corresponding author

Correspondence to Abdelouahab Amira.

About this article

Cite this article

Amira, A., Derhab, A., Karbab, E.B. et al. TriDroid: a triage and classification framework for fast detection of mobile threats in android markets. J Ambient Intell Human Comput (2020). https://doi.org/10.1007/s12652-020-02243-0

Keywords

  • Android security
  • App triage
  • Malware detection
  • Data mining
  • Machine learning