Abstract
The pervasive use of portable devices in day-to-day activities is a concern because of the potential for security breaches. A large amount of malware is concealed inside Android apps, which calls for high-performance Android malware detection systems. To improve performance, we apply ensemble learning at the feature selection level (pre-classification) and at the prediction level (post-classification). The extracted features are API classes, and the model is built with an extreme learning machine (ELM). The filter feature selection methods employed are Chi-Square, OneR, and Relief. Experimental results on a corpus of 14762 Android apps show that ensemble learning is promising and yields higher performance than the individual classifier. We also compare the pre- and post-classification ensemble approaches for the Android malware detection problem.
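The two ensemble levels named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: the abstract does not say how the filter outputs or the predictions are combined, so the union of top-k features (pre-classification) and a majority vote (post-classification) are illustrative choices only. The Relief variant is a simplified single-neighbour version, the ELM is a basic single-hidden-layer network with random weights and a pseudoinverse readout, the data are synthetic binary "API class" indicators, and all function and class names are hypothetical.

```python
import numpy as np

def chi_square_scores(X, y):
    """Chi-square statistic of each binary feature against the binary label."""
    scores, n = np.zeros(X.shape[1]), len(y)
    for fv in (0, 1):
        for cv in (0, 1):
            observed = ((X == fv) & (y[:, None] == cv)).sum(axis=0)
            expected = (X == fv).sum(axis=0) * (y == cv).sum() / n
            scores += (observed - expected) ** 2 / np.maximum(expected, 1e-12)
    return scores

def one_r_scores(X, y):
    """OneR: accuracy of the rule 'predict the majority class per feature value'."""
    correct = np.zeros(X.shape[1])
    for fv in (0, 1):
        mask = X == fv
        pos = (mask & (y[:, None] == 1)).sum(axis=0)
        correct += np.maximum(pos, mask.sum(axis=0) - pos)
    return correct / len(y)

def relief_scores(X, y, n_samples=50, seed=0):
    """Simplified Relief: reward features that differ on the nearest miss
    and agree on the nearest hit (Manhattan distance, one neighbour each)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for i in rng.choice(len(y), size=min(n_samples, len(y)), replace=False):
        dist = np.abs(X - X[i]).sum(axis=1)
        dist[i] = np.inf  # never pick the sample itself as its own hit
        hit = np.where(y == y[i], dist, np.inf).argmin()
        miss = np.where(y != y[i], dist, np.inf).argmin()
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w

FILTERS = [chi_square_scores, one_r_scores, relief_scores]

class ELM:
    """Basic ELM: random hidden layer, output weights via pseudoinverse."""
    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden, self.rng = n_hidden, np.random.default_rng(seed)

    def _hidden(self, X):
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        self.beta = np.linalg.pinv(self._hidden(X)) @ y
        return self

    def predict(self, X):
        return (self._hidden(X) @ self.beta >= 0.5).astype(int)

def top_k(scores, k):
    return np.argsort(scores)[::-1][:k]

def pre_classification_ensemble(X_tr, y_tr, X_te, k=5):
    """Combine the filters first (union of top-k, an assumption), then train one ELM."""
    selected = np.unique(np.concatenate([top_k(f(X_tr, y_tr), k) for f in FILTERS]))
    return ELM().fit(X_tr[:, selected], y_tr).predict(X_te[:, selected]), selected

def post_classification_ensemble(X_tr, y_tr, X_te, k=5):
    """Train one ELM per filter, then combine predictions (majority vote, an assumption)."""
    votes = []
    for seed, f in enumerate(FILTERS):
        idx = top_k(f(X_tr, y_tr), k)
        votes.append(ELM(seed=seed).fit(X_tr[:, idx], y_tr).predict(X_te[:, idx]))
    return (np.sum(votes, axis=0) >= 2).astype(int)

# Synthetic demo: 30 binary features, the first three are noisy copies of the label.
rng = np.random.default_rng(42)
y = rng.integers(0, 2, 300)
X = rng.integers(0, 2, (300, 30)).astype(float)
for j in range(3):
    flip = rng.random(300) < 0.1
    X[:, j] = np.where(flip, 1 - y, y)
pre_pred, chosen = pre_classification_ensemble(X[:200], y[:200], X[200:])
post_pred = post_classification_ensemble(X[:200], y[:200], X[200:])
print("pre-classification accuracy:", (pre_pred == y[200:]).mean())
print("post-classification accuracy:", (post_pred == y[200:]).mean())
```

The two functions differ only in where the combination happens: the first merges the filter rankings before any classifier is trained, while the second keeps one classifier per filter and merges their outputs. The numbers printed on synthetic data say nothing about the results reported in the paper.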
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Badhani, S., Muttoo, S.K. (2018). Comparative Analysis of Pre- and Post-Classification Ensemble Methods for Android Malware Detection. In: Singh, M., Gupta, P., Tyagi, V., Flusser, J., Ören, T. (eds) Advances in Computing and Data Sciences. ICACDS 2018. Communications in Computer and Information Science, vol 906. Springer, Singapore. https://doi.org/10.1007/978-981-13-1813-9_44
DOI: https://doi.org/10.1007/978-981-13-1813-9_44
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1812-2
Online ISBN: 978-981-13-1813-9
eBook Packages: Computer Science (R0)