Mobile Apps identification based on network flows

  • Regular Paper
  • Published in: Knowledge and Information Systems

Abstract

Network operators and mobile carriers face serious security challenges caused by the growing number of services provided by smartphone Apps. Android alone, for example, has more than 1 million Apps in its stores. As a result, network administrators tend to adopt strict policies to secure their infrastructure. This study proposes an efficient framework with a classification component based on traffic analysis of Android Apps. The framework differs from previously proposed approaches in that it identifies App traffic from the network perspective, without introducing any overhead on subscribers' smartphones. It includes a technique for pre-processing the network flows generated by Apps to extract a set of features, which are then used to build an identification model with machine learning algorithms. The classification model is built using classification ensembles. A group of chosen users contributes to training the classification model, which learns the normal behavior of selected Apps. Eventually, the model should be able to detect abnormal behavior of similar Apps across the network. The framework achieves a classification accuracy of 93.78% with a false positive rate under 0.5%. In addition, it detects abnormal flows of unknown classes through an outlier detection mechanism, with a reported accuracy of 94%.
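The pipeline outlined in the abstract (flows reduced to features, an ensemble decision, and a fallback for unknown traffic) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the packet record layout, the four flow statistics, the `flow_features` and `ensemble_vote` names, the `min_agreement` threshold, and the labels `app_a` and `app_b` are all assumptions made for illustration, since the abstract does not specify the feature set or the ensemble rule.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical packet record (an assumption, not taken from the paper):
# (timestamp_s, src_ip, dst_ip, src_port, dst_port, protocol, size_bytes)


def flow_features(packets):
    """Group packets into flows by 5-tuple and compute simple per-flow statistics."""
    flows = defaultdict(list)
    for ts, src, dst, sport, dport, proto, size in packets:
        flows[(src, dst, sport, dport, proto)].append((ts, size))

    features = {}
    for key, pkts in flows.items():
        times = [t for t, _ in pkts]
        sizes = [s for _, s in pkts]
        features[key] = {
            "packets": len(pkts),
            "bytes": sum(sizes),
            "mean_size": mean(sizes),
            "duration": max(times) - min(times),
        }
    return features


def ensemble_vote(votes, min_agreement=0.6):
    """Majority vote over base-classifier labels.

    Returns "unknown" when the winning label's share of the votes is below
    min_agreement; this is a simple stand-in for an outlier detection step.
    """
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= min_agreement else "unknown"


# Usage: two packets of one flow and one packet of another.
packets = [
    (0.0, "10.0.0.2", "192.0.2.10", 40000, 443, "tcp", 100),
    (1.5, "10.0.0.2", "192.0.2.10", 40000, 443, "tcp", 300),
    (0.2, "10.0.0.2", "192.0.2.20", 40001, 80, "tcp", 60),
]
feats = flow_features(packets)
decision = ensemble_vote(["app_a", "app_a", "app_b"])
```

In practice the per-flow feature dictionaries would be fed to trained base classifiers, and their predicted labels would form the `votes` list; the rejection threshold trades missed unknown flows against known flows wrongly flagged as unknown.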

Acknowledgements

This research is funded by TELUS Corp., Canada.

Author information

Corresponding author

Correspondence to Georgi Ajaeiya.

About this article

Cite this article

Ajaeiya, G., Elhajj, I.H., Chehab, A. et al. Mobile Apps identification based on network flows. Knowl Inf Syst 55, 771–796 (2018). https://doi.org/10.1007/s10115-017-1111-8

  • DOI: https://doi.org/10.1007/s10115-017-1111-8
