Abstract
Modern malware detection systems rely largely on signatures to assign malware samples to their corresponding families. These signatures are fragments of code, and members of a malware family are believed to share commonalities in their signatures. We hypothesize that changes in these signatures give rise to newer sub-families of malware. In the present work we evaluate signature conservation across two sub-families of rootkits, with experiments designed to test whether features in the rootkit family of malware are conserved. Our extracted features yielded a classification accuracy of 84.17% with the Naïve Bayes algorithm. The results reinforce our belief that there are subsets of independent features that discriminate between sub-families but do not exhibit any trend of conservation. We conclude that certain features (if not all) are preserved and discriminate between sub-families.
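As a rough illustration of the kind of experiment the abstract describes, the sketch below trains a Bernoulli Naïve Bayes classifier on binary feature vectors from two sub-families. It is not the authors' pipeline: the four features, the six samples, the labels "A" and "B", and the Laplace smoothing parameter `alpha` are all hypothetical, chosen only so that one feature is shared by both sub-families and another separates them.

```python
import math


def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit a Bernoulli Naive Bayes model on binary feature vectors.

    Returns {class_label: (log_prior, [P(feature_j = 1 | class)])}.
    alpha is a Laplace smoothing constant (an assumption of this sketch).
    """
    n_features = len(X[0])
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior = math.log(len(rows) / len(X))
        # Smoothed estimate of how often each feature is present in class c.
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (log_prior, probs)
    return model


def predict(model, x):
    """Return the class with the highest log-posterior for sample x."""
    best, best_score = None, -math.inf
    for c, (log_prior, probs) in model.items():
        score = log_prior
        for xj, p in zip(x, probs):
            score += math.log(p if xj else 1.0 - p)
        if score > best_score:
            best, best_score = c, score
    return best


def accuracy(model, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(model, x) == t for x, t in zip(X, y)) / len(y)


# Hypothetical toy data: each row is one sample, each column one binary feature.
# Feature 0 is present in every sample (shared by both sub-families);
# feature 2 is present only in sub-family "B" (discriminating).
X_train = [
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 1, 0],
]
y_train = ["A", "A", "A", "B", "B", "B"]

model = train_bernoulli_nb(X_train, y_train)
train_acc = accuracy(model, X_train, y_train)
```

In this toy setting, the estimated presence probability of feature 0 is identical for both classes, so it contributes nothing to the decision, while feature 2 carries most of the discriminating weight. A real evaluation would report accuracy on held-out samples rather than on the training set.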
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Das, P. (2018). Conservation of Feature Sub-spaces Across Rootkit Sub-families. In: Sharma, R., Mantri, A., Dua, S. (eds) Computing, Analytics and Networks. ICAN 2017. Communications in Computer and Information Science, vol 805. Springer, Singapore. https://doi.org/10.1007/978-981-13-0755-3_14
DOI: https://doi.org/10.1007/978-981-13-0755-3_14
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-0754-6
Online ISBN: 978-981-13-0755-3
eBook Packages: Computer Science, Computer Science (R0)