Conservation of Feature Sub-spaces Across Rootkit Sub-families

Das, Prasenjit

doi:10.1007/978-981-13-0755-3_14

Conference paper
First Online: 07 July 2018

Part of the book series: Communications in Computer and Information Science (CCIS, volume 805)

Abstract

Modern malware detection systems rely largely on signatures to assign malware samples to their corresponding families. These signatures are fragments of code, and families of malware are believed to share commonalities in their signatures. We hypothesize that changes in these signatures give rise to newer sub-families of malware. In the present work we evaluate signature conservation across two sub-families of rootkits. Our experiments aim to establish that features in the rootkit family of malware are conserved. We report that our feature extraction yielded an accuracy of 84.17% using the Naïve Bayes classification algorithm. The results reinforce our belief that there are subsets of independent features that discriminate between sub-families but do not exhibit any trend of conservation. We conclude that certain features (if not all) are preserved and discriminate between sub-families.
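
As a rough illustration of the classification step mentioned in the abstract, the following is a minimal sketch of a Bernoulli Naïve Bayes classifier over binary presence/absence features. It is not the paper's implementation: the feature vectors, the sub-family labels "A" and "B", and the function names are all hypothetical, and the toy data is invented so that the first two features are shared by both sub-families (conserved) while the last two differ (discriminating).

```python
import math

def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit a Bernoulli Naive Bayes model on binary feature vectors.

    Returns {label: (log_prior, [P(feature_j = 1 | label), ...])}.
    alpha is the Laplace smoothing constant.
    """
    n_features = len(X[0])
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        # Smoothed probability that each feature is present in class c
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (math.log(prior), probs)
    return model

def predict(model, x):
    """Return the label with the highest log-posterior for sample x."""
    best, best_score = None, -math.inf
    for c, (log_prior, probs) in model.items():
        score = log_prior
        for xj, p in zip(x, probs):
            score += math.log(p if xj else 1.0 - p)
        if score > best_score:
            best, best_score = c, score
    return best

# Hypothetical toy data: each row marks which of four signature
# features is present in a sample. Features 0 and 1 appear in every
# sample of both sub-families; features 2 and 3 differ between them.
X = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 1],
]
y = ["A", "A", "A", "B", "B", "B"]

model = train_bernoulli_nb(X, y)
print(predict(model, [1, 1, 1, 0]))
print(predict(model, [1, 1, 0, 1]))
```

In this sketch the conserved features receive the same estimated probability in both classes and so contribute nothing to the decision, while the remaining features carry all of the discriminating weight.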

Author information

Authors and Affiliations

Chitkara University, Baddi, Himachal Pradesh, India
Prasenjit Das

Corresponding author

Correspondence to Prasenjit Das.

Editor information

Editors and Affiliations

Chitkara University, Chandigarh, India
Rajnish Sharma
Chitkara University, Chandigarh, India
Archana Mantri
Department of Computer Science and Electrical Engineering, Louisiana Tech University, Ruston, Louisiana, USA
Sumeet Dua

About this paper

Cite this paper

Das, P. (2018). Conservation of Feature Sub-spaces Across Rootkit Sub-families. In: Sharma, R., Mantri, A., Dua, S. (eds) Computing, Analytics and Networks. ICAN 2017. Communications in Computer and Information Science, vol 805. Springer, Singapore. https://doi.org/10.1007/978-981-13-0755-3_14

DOI: https://doi.org/10.1007/978-981-13-0755-3_14
Published: 07 July 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-0754-6
Online ISBN: 978-981-13-0755-3
eBook Packages: Computer Science, Computer Science (R0)
