Abstract
Network operators and mobile carriers face serious security challenges caused by the growing number of services provided by smartphone Apps. For example, the Android OS has more than 1 million Apps in stores. Hence, network administrators tend to adopt strict policies to secure their infrastructure. The aim of this study is to propose an efficient framework with a classification component based on traffic analysis of Android Apps. The framework differs from previously proposed approaches by identifying App traffic from a network perspective without introducing any overhead on subscribers' smartphones. Additionally, it involves a technique for pre-processing the network flows generated by Apps to extract a set of features, which are used to build an identification model with machine learning algorithms. The classification model is built using classification ensembles. A group of chosen users contributes to training the classification model, which learns the normal behavior of selected Apps. Eventually, the model should be able to detect abnormal behavior of similar Apps across the network. A classification accuracy of 93.78% is achieved with a false positive rate below 0.5%. In addition, the framework is able to detect abnormal flows of unknown classes by implementing an outlier detection mechanism, for which an accuracy of 94% is reported.
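The pipeline summarized above (flow features, an ensemble classifier, and rejection of flows from unknown classes) can be illustrated with a minimal sketch. It is not the authors' implementation: the two features (mean packet size in bytes, flow duration in seconds), the App labels, the nearest-centroid base learner, the distance threshold `max_dist`, and all function names are assumptions chosen only to keep the example small. Bagging is approximated here with a per-class bootstrap resample.

```python
import math
import random
from collections import Counter
from statistics import mean


def fit_centroids(samples, rng):
    """Fit one base learner on a per-class bootstrap resample of the flows."""
    by_app = {}
    for features, app in samples:
        by_app.setdefault(app, []).append(features)
    model = {}
    for app, rows in by_app.items():
        resampled = [rng.choice(rows) for _ in rows]  # sample with replacement
        model[app] = [mean(column) for column in zip(*resampled)]
    return model


def train_ensemble(samples, n_models=15, seed=0):
    """Train n_models base learners, each on its own bootstrap resample."""
    rng = random.Random(seed)
    return [fit_centroids(samples, rng) for _ in range(n_models)]


def classify(ensemble, features, max_dist=100.0):
    """Majority vote; a learner votes 'unknown' when the flow is far from
    every centroid it knows (hypothetical outlier rule, not from the paper)."""
    votes = Counter()
    for model in ensemble:
        app = min(model, key=lambda a: math.dist(model[a], features))
        dist = math.dist(model[app], features)
        votes[app if dist <= max_dist else "unknown"] += 1
    return votes.most_common(1)[0][0]


# Toy flows: (mean packet size in bytes, flow duration in seconds), App label.
training_flows = [
    ((120, 2.0), "chat"), ((130, 3.0), "chat"),
    ((110, 2.5), "chat"), ((125, 2.2), "chat"),
    ((1200, 60.0), "video"), ((1300, 55.0), "video"),
    ((1250, 65.0), "video"), ((1180, 58.0), "video"),
]
ensemble = train_ensemble(training_flows)

print(classify(ensemble, (118, 2.4)))     # flow resembling the chat App
print(classify(ensemble, (1240, 61.0)))   # flow resembling the video App
print(classify(ensemble, (5000, 300.0)))  # flow unlike any trained App
```

In this sketch a flow is labeled by majority vote across the base learners, and a flow far from every learned centroid is reported as `unknown`, which mirrors the idea of flagging flows of classes the model was never trained on. A real deployment would need feature scaling and a threshold tuned on held-out data.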
Acknowledgements
This research is funded by TELUS Corp., Canada.
Cite this article
Ajaeiya, G., Elhajj, I.H., Chehab, A. et al. Mobile Apps identification based on network flows. Knowl Inf Syst 55, 771–796 (2018). https://doi.org/10.1007/s10115-017-1111-8
DOI: https://doi.org/10.1007/s10115-017-1111-8