Conservation of Feature Sub-spaces Across Rootkit Sub-families

  • Conference paper

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 805))

Abstract

Modern malware detection systems rely largely on signatures to assign malware samples to their corresponding families. These signatures are fragments of code, and it is believed that members of a malware family share commonalities in their signatures. We hypothesize that changes in these signatures give rise to newer sub-families of malware. In the present work we evaluate signature conservation across two sub-families of rootkits. Our experiments were carried out to establish that features in the rootkit family of malware are conserved. Our feature extraction yielded an accuracy of 84.17% using the Naïve Bayes classification algorithm. The results reinforce our belief that there are subsets of independent features that discriminate between sub-families but do not exhibit any trend of conservation. We conclude that certain features (if not all) are preserved and discriminate between sub-families.
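
As a rough illustration of the kind of pipeline the abstract describes, the sketch below extracts presence/absence features from raw bytes and classifies samples with a Bernoulli Naïve Bayes model. The abstract does not specify the feature type, so byte 2-grams are an assumption here, and the names `byte_ngrams`, `BernoulliNB`, the sub-family labels "A" and "B", and the toy byte strings are all hypothetical. This is not the paper's implementation and does not reproduce its reported accuracy.

```python
import math
from collections import defaultdict


def byte_ngrams(data: bytes, n: int = 2) -> set:
    """Return the distinct byte n-grams of a sample as hex strings (assumed feature type)."""
    return {data[i:i + n].hex() for i in range(len(data) - n + 1)}


class BernoulliNB:
    """Bernoulli Naive Bayes over presence/absence features with Laplace smoothing."""

    def fit(self, samples, labels):
        self.classes = sorted(set(labels))
        self.vocab = sorted(set().union(*samples))
        counts = {c: defaultdict(int) for c in self.classes}
        totals = {c: 0 for c in self.classes}
        for feats, label in zip(samples, labels):
            totals[label] += 1
            for f in feats:
                counts[label][f] += 1
        n = len(labels)
        self.log_prior = {c: math.log(totals[c] / n) for c in self.classes}
        # Smoothed estimate of P(feature present | class).
        self.p = {
            c: {f: (counts[c][f] + 1) / (totals[c] + 2) for f in self.vocab}
            for c in self.classes
        }
        return self

    def predict(self, feats):
        best, best_score = None, -math.inf
        for c in self.classes:
            score = self.log_prior[c]
            # Features are treated as conditionally independent given the class.
            for f in self.vocab:
                p = self.p[c][f]
                score += math.log(p if f in feats else 1.0 - p)
            if score > best_score:
                best, best_score = c, score
        return best


# Hypothetical toy data: two made-up sub-families with made-up byte patterns.
train = [
    (b"\x90\x90\xcc\x01", "A"),
    (b"\x90\x90\xcc\x02", "A"),
    (b"\xde\xad\xbe\xef", "B"),
    (b"\xde\xad\xbe\x01", "B"),
]
model = BernoulliNB().fit([byte_ngrams(d) for d, _ in train], [y for _, y in train])
predicted = model.predict(byte_ngrams(b"\x90\x90\xcc\x03"))
```

In this toy setup, n-grams that occur in both sub-families would correspond to conserved features, while n-grams seen in only one would be the discriminating ones; a real study would need held-out evaluation on actual samples.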

Notes

  1. http://avl.enemy.org/utils/hextools/

Author information

Corresponding author

Correspondence to Prasenjit Das.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Das, P. (2018). Conservation of Feature Sub-spaces Across Rootkit Sub-families. In: Sharma, R., Mantri, A., Dua, S. (eds) Computing, Analytics and Networks. ICAN 2017. Communications in Computer and Information Science, vol 805. Springer, Singapore. https://doi.org/10.1007/978-981-13-0755-3_14

  • DOI: https://doi.org/10.1007/978-981-13-0755-3_14

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-0754-6

  • Online ISBN: 978-981-13-0755-3

  • eBook Packages: Computer Science (R0)
