Symbolic Deobfuscation: From Virtualized Code Back to the Original

  • Conference paper
Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA 2018)

Abstract

Software protection has gained importance over the last decade as a way to protect legitimate software against reverse engineering and tampering. Virtualization is considered one of the strongest defenses against such attacks. We present a generic approach, based on symbolic path exploration, taint analysis, and recompilation, that recovers from virtualized code a devirtualized version that is semantically identical to the original and close to it in size. We define criteria and metrics to evaluate the deobfuscated results in terms of correctness and precision. Finally, we propose an open-source setup for evaluating the approach against several forms of virtualization.
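The combination of taint, symbolic execution, and recompilation can be illustrated on a toy example. The sketch below is not the authors' implementation: the stack-machine opcodes, the `BYTECODE` program, and the helper names `run_vm`, `devirtualize`, and `recompile` are all hypothetical, and it follows a single path only, with no branching and no path exploration. It only shows the general idea that values depending on the input are kept as expressions while the interpreter's own bookkeeping is executed concretely and disappears.

```python
import operator

# Hypothetical toy opcodes; not taken from Tigress or any real protector.
PUSH_INPUT, PUSH_CONST, ADD, MUL, XOR, RET = range(6)
BINOPS = {
    ADD: ("+", operator.add),
    MUL: ("*", operator.mul),
    XOR: ("^", operator.xor),
}

# Virtualized form of some function f(x), as (opcode, operand) pairs.
BYTECODE = [
    (PUSH_INPUT, None), (PUSH_CONST, 3), (MUL, None),
    (PUSH_CONST, 2), (PUSH_CONST, 5), (ADD, None),
    (XOR, None), (RET, None),
]

def run_vm(bytecode, x):
    """Concrete interpreter: the protected program as it actually runs."""
    stack, pc = [], 0
    while True:
        op, arg = bytecode[pc]
        pc += 1
        if op == PUSH_INPUT:
            stack.append(x)
        elif op == PUSH_CONST:
            stack.append(arg)
        elif op == RET:
            return stack.pop()
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(BINOPS[op][1](a, b))

def devirtualize(bytecode):
    """Replay the interpreter with a symbolic input named x.

    Stack values are either int (untainted, computed concretely) or
    str (tainted, kept as an expression over x).
    """
    stack, pc = [], 0
    while True:
        op, arg = bytecode[pc]
        pc += 1  # the program counter is never tainted, so it stays concrete
        if op == PUSH_INPUT:
            stack.append("x")
        elif op == PUSH_CONST:
            stack.append(arg)
        elif op == RET:
            return str(stack.pop())
        else:
            b, a = stack.pop(), stack.pop()
            symbol, func = BINOPS[op]
            if isinstance(a, int) and isinstance(b, int):
                stack.append(func(a, b))  # input-independent: fold it away
            else:
                stack.append(f"({a} {symbol} {b})")

def recompile(expr):
    """Turn the recovered expression back into executable code."""
    return eval(compile(f"lambda x: {expr}", "<devirtualized>", "eval"))

expr = devirtualize(BYTECODE)
devirtualized = recompile(expr)
```

For this bytecode, `devirtualize` returns the string `((x * 3) ^ 7)`: the dispatch loop, the stack, and the constant-only addition are gone, and `devirtualized(x)` agrees with `run_vm(BYTECODE, x)` on every input.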

Work partially funded by ANR and PIA under grant ANR-15-IDEX-02.

Notes

  1. Solving the Tigress Challenge was presented at the French industrial conference SSTIC'17 [18]. The work presented here adds a revised description of the method, a systematic experimental evaluation, and new metrics to evaluate the accuracy of the approach.

  2. Such as simplifycfg and instcombine.

  3. https://github.com/JonathanSalwan/Tigress_protection/blob/master/solve-vm.py

  4. https://github.com/JonathanSalwan/Tigress_protection

  5. Thanks to Christian Collberg for providing us with the original source code.

  6. http://tigress.cs.arizona.edu

  7. MD5 is one of the most involved examples in our benchmark.

  8. http://tigress.cs.arizona.edu/challenges.html#current

  9. http://tigress.cs.arizona.edu/index.htm

References

  1. Bardin, S., David, R., Marion, J.-Y.: Backward-bounded DSE: targeting infeasibility questions on obfuscated codes. In: S&P, pp. 633–651. IEEE (2017)

  2. Banescu, S., Collberg, C., Ganesh, V., Newsham, Z., Pretschner, A.: Code obfuscation against symbolic execution attacks. In: ACSAC (2016)

  3. Codevirtualizer. https://oreans.com/codevirtualizer.php

  4. Themida. https://www.oreans.com/themida.php

  5. Tigress: C diversifier/obfuscator. http://tigress.cs.arizona.edu/

  6. Clause, J., Li, W., Orso, A.: Dytan: a generic dynamic taint analysis framework. In: ISSTA. ACM (2007)

  7. Coogan, K., Lu, G., Debray, S.: Deobfuscation of virtualization-obfuscated software: a semantics-based approach. In: CCS. ACM (2011)

  8. Eyrolles, N., Guinet, A., Videau, M.: Arybo: manipulation, canonicalization and identification of mixed Boolean-arithmetic symbolic expressions. In: GreHack (2016)

  9. Godefroid, P., de Halleux, J., Nori, A.V., Rajamani, S.K., Schulte, W., Tillmann, N., Levin, M.Y.: Automating software testing using program analysis. IEEE Softw. 25(5), 30–37 (2008)

  10. Godefroid, P., Klarlund, N., Sen, K.: DART: directed automated random testing. In: PLDI. ACM (2005)

  11. Jha, S., Gulwani, S., Seshia, S.A., Tiwari, A.: Oracle-guided component-based program synthesis. In: ICSE. ACM/IEEE (2010)

  12. Kinder, J.: Towards static analysis of virtualization-obfuscated binaries. In: 19th Working Conference on Reverse Engineering, WCRE (2012)

  13. Lattner, C., Adve, V.: LLVM: a compilation framework for lifelong program analysis and transformation (2004)

  14. Maximus: Reversing a simple virtual machine. CodeBreakers 1.2 (2006)

  15. David, R., Bardin, S., Feist, J., Mounier, L., Potet, M.-L., Ta, T.D., Marion, J.-Y.: Specification of concretization and symbolization policies in symbolic execution. In: ISSTA. ACM (2016)

  16. Rolles, R.: Defeating HyperUnpackMe2 with an IDA processor module (2007)

  17. Rolles, R.: Unpacking virtualization obfuscators. In: WOOT (2009)

  18. Salwan, J., Bardin, S., Potet, M.-L.: Deobfuscation of VM based software protection. In: SSTIC (2017)

  19. Saudel, F., Salwan, J.: Triton: a dynamic symbolic execution framework. In: SSTIC (2015)

  20. Scherzo: Inside code virtualizer (2007)

  21. Sen, K., Marinov, D., Agha, G.: CUTE: a concolic unit testing engine for C. In: FSE (2005)

  22. Sharif, M.I., Lanzi, A., Giffin, J.T., Lee, W.: Automatic reverse engineering of malware emulators. In: S&P. IEEE (2009)

  23. VMprotect (2003–2017). http://vmpsoft.com

  24. Yadegari, B., Debray, S.: Symbolic execution of obfuscated code. In: CCS (2015)

  25. Yadegari, B., Johannesmeyer, B., Whitely, B., Debray, S.: A generic approach to automatic deobfuscation of executable code. In: S&P. IEEE (2015)

  26. Vanegue, J., Heelan, S., Rolles, R.: SMT solvers in software security. In: WOOT (2012)

  27. Schwartz, E.J., Avgerinos, T., Brumley, D.: All you ever wanted to know about dynamic taint analysis and forward symbolic execution (but might have been afraid to ask). In: S&P. IEEE (2010)

  28. Blazytko, T., Contag, M., Aschermann, C., Holz, T.: Syntia: synthesizing the semantics of obfuscated code. In: USENIX Security Symposium. USENIX (2017)

Author information

Corresponding author

Correspondence to Sébastien Bardin.

A Detailed experiments

Table 6. Average of all algorithms per protection
Table 7. Average of all protections per hash function

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Salwan, J., Bardin, S., Potet, M.-L. (2018). Symbolic Deobfuscation: From Virtualized Code Back to the Original. In: Giuffrida, C., Bardin, S., Blanc, G. (eds) Detection of Intrusions and Malware, and Vulnerability Assessment. DIMVA 2018. Lecture Notes in Computer Science, vol 10885. Springer, Cham. https://doi.org/10.1007/978-3-319-93411-2_17

  • DOI: https://doi.org/10.1007/978-3-319-93411-2_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-93410-5

  • Online ISBN: 978-3-319-93411-2

  • eBook Packages: Computer Science, Computer Science (R0)
