Symbolic Deobfuscation: From Virtualized Code Back to the Original

Salwan, Jonathan; Bardin, Sébastien; Potet, Marie-Laure

doi:10.1007/978-3-319-93411-2_17

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 10885)

Included in the following conference series:

International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment

Abstract

Over the last decade, software protection has become an important means of defending legitimate software against reverse engineering and tampering. Virtualization is considered one of the strongest defenses against such attacks. We present a generic approach, based on symbolic path exploration, taint analysis and recompilation, that recovers from virtualized code a devirtualized version that is semantically identical to the original and close to it in size. We define criteria and metrics to evaluate the relevance of the deobfuscated results in terms of correctness and precision. Finally, we propose an open-source setup for evaluating the approach against several forms of virtualization.

Work partially funded by ANR and PIA under grant ANR-15-IDEX-02.
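
The idea summarized in the abstract can be illustrated in miniature. The following is a toy sketch, not the authors' implementation: the stack machine, its opcodes, the `BYTECODE` program and the helper names (`Sym`, `run_vm`, `devirtualize`) are all hypothetical, and it uses plain Python rather than a real symbolic execution framework or compiler. It re-executes a small interpreter with a tainted symbolic input, so that the interpreter's own machinery (program counter, opcode dispatch, stack) is computed concretely and disappears, and then "recompiles" the remaining expression into an ordinary function. Only a single straight-line path is handled; there is no path exploration.

```python
# Toy illustration only: a tiny stack VM and a symbolic re-execution of it.
MASK = 0xFFFFFFFF  # model 32-bit arithmetic

# Hypothetical opcodes of the toy VM.
PUSH_ARG, PUSH_CONST, ADD, XOR, MUL, RET = range(6)

# Hypothetical "virtualized" function: f(x) = ((x ^ 0x5A) + 3 * x) mod 2**32
BYTECODE = [PUSH_ARG, PUSH_CONST, 0x5A, XOR,
            PUSH_CONST, 3, PUSH_ARG, MUL, ADD, RET]


class Sym:
    """An input-dependent (tainted) value, kept as an expression string."""
    def __init__(self, expr):
        self.expr = expr


def _binop(op, a, b):
    # Untainted operands are evaluated concretely and leave no trace.
    if not isinstance(a, Sym) and not isinstance(b, Sym):
        return {"+": a + b, "^": a ^ b, "*": a * b}[op] & MASK
    ea = a.expr if isinstance(a, Sym) else str(a)
    eb = b.expr if isinstance(b, Sym) else str(b)
    return Sym(f"(({ea} {op} {eb}) & {MASK})")


def run_vm(bytecode, arg):
    """Interpret the bytecode; `arg` is an int (concrete) or a Sym (symbolic)."""
    stack, pc = [], 0
    while True:
        op = bytecode[pc]
        pc += 1
        if op == PUSH_ARG:
            stack.append(arg)
        elif op == PUSH_CONST:
            stack.append(bytecode[pc])
            pc += 1
        elif op in (ADD, XOR, MUL):
            b, a = stack.pop(), stack.pop()
            stack.append(_binop({ADD: "+", XOR: "^", MUL: "*"}[op], a, b))
        elif op == RET:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")


def devirtualize(bytecode):
    """Return (expression text, callable) with the VM dispatch loop removed."""
    result = run_vm(bytecode, Sym("x"))
    expr = result.expr if isinstance(result, Sym) else str(result)
    # "Recompilation" stand-in: turn the recovered expression into a function.
    return expr, eval(f"lambda x: {expr}")


if __name__ == "__main__":
    expr, f = devirtualize(BYTECODE)
    print(expr)
    # Correctness check in the spirit of the abstract: same outputs as the VM.
    print(all(f(x) == run_vm(BYTECODE, x) for x in (0, 7, MASK)))
```

The recovered expression mentions only the input `x` and constants, which is the sense in which the result is "close in size" to the original computation rather than to the interpreter that hid it.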

Notes

1. Solving the Tigress Challenge was presented at the French industrial conference SSTIC’17 [18]. The work presented here adds a revised description of the method, a strong systematic experimental evaluation, and new metrics to evaluate the accuracy of the approach.
2. Such as simplifycfg and instcombine.
3. https://github.com/JonathanSalwan/Tigress_protection/blob/master/solve-vm.py.
4. https://github.com/JonathanSalwan/Tigress_protection.
5. Thanks to Christian Collberg for providing us with the original source code.
6. http://tigress.cs.arizona.edu.
7. MD5 is one of the most involved examples in our benchmark.
8. http://tigress.cs.arizona.edu/challenges.html#current.
9. http://tigress.cs.arizona.edu/index.htm.

References

1. Bardin, S., David, R., Marion, J.-Y.: Backward-bounded DSE: targeting infeasibility questions on obfuscated codes. In: S&P, pp. 633–651. IEEE (2017)
2. Banescu, S., Collberg, C., Ganesh, V., Newsham, Z., Pretschner, A.: Code obfuscation against symbolic execution attacks. In: ACSAC (2016)
3. Codevirtualizer. https://oreans.com/codevirtualizer.php
4. Themida. https://www.oreans.com/themida.php
5. Tigress: C diversifier/obfuscator. http://tigress.cs.arizona.edu/
6. Clause, J., Li, W., Orso, A.: Dytan: a generic dynamic taint analysis framework. In: ISSTA. ACM (2007)
7. Coogan, K., Lu, G., Debray, S.: Deobfuscation of virtualization-obfuscated software: a semantics-based approach. In: CCS. ACM (2011)
8. Eyrolles, N., Guinet, A., Videau, M.: Arybo: Manipulation, canonicalization and identification of mixed Boolean-arithmetic symbolic expressions. In: GreHack (2016)
9. Godefroid, P., de Halleux, J., Nori, A.V., Rajamani, S.K., Schulte, W., Tillmann, N., Levin, M.Y.: Automating software testing using program analysis. IEEE Softw. 25(5), 30–37 (2008)
10. Godefroid, P., Klarlund, N., Sen, K.: DART: directed automated random testing. In: PLDI. ACM (2005)
11. Jha, S., Gulwani, S., Seshia, S.A., Tiwari, A.: Oracle-guided component-based program synthesis. In: ICSE. ACM/IEEE (2010)
12. Kinder, J.: Towards static analysis of virtualization-obfuscated binaries. In: 19th Working Conference on Reverse Engineering, WCRE (2012)
13. Lattner, C., Adve, V.: LLVM: a compilation framework for lifelong program analysis and transformation (2004)
14. Maximus: Reversing a simple virtual machine. CodeBreakers 1.2 (2006)
15. David, R., Bardin, S., Feist, J., Mounier, L., Potet, M.-L., Ta, T.D., Marion, J.-Y.: Specification of concretization and symbolization policies in symbolic execution. In: ISSTA. ACM (2016)
16. Rolles, R.: Defeating HyperUnpackMe2 with an IDA processor module (2007)
17. Rolles, R.: Unpacking virtualization obfuscators. In: WOOT (2009)
18. Salwan, J., Bardin, S., Potet, M.-L.: Deobfuscation of VM based software protection. In: SSTIC (2017)
19. Saudel, F., Salwan, J.: Triton: a dynamic symbolic execution framework. In: SSTIC (2015)
20. Scherzo: Inside code virtualizer (2007)
21. Sen, K., Marinov, D., Agha, G.: CUTE: a concolic unit testing engine for C. In: FSE (2005)
22. Sharif, M.I., Lanzi, A., Giffin, J.T., Lee, W.: Automatic reverse engineering of malware emulators. In: S&P. IEEE (2009)
23. VMprotect. (2003–2017). http://vmpsoft.com
24. Yadegari, B., Debray, S.: Symbolic execution of obfuscated code. In: CCS (2015)
25. Yadegari, B., Johannesmeyer, B., Whitely, B., Debray, S.: A generic approach to automatic deobfuscation of executable code. In: S&P. IEEE (2015)
26. Vanegue, J., Heelan, S., Rolles, R.: SMT Solvers in Software Security. In: WOOT (2012)
27. Schwartz, E.J., Avgerinos, T., Brumley, D.: All you ever wanted to know about dynamic taint analysis and forward symbolic execution (but might have been afraid to ask). In: S&P. IEEE (2010)
28. Blazytko, T., Contag, M., Aschermann, C., Holz, T.: Syntia: Synthesizing the semantics of obfuscated code. In: USENIX Security Symposium. Usenix (2017)

Author information

Authors and Affiliations

Quarkslab, Paris, France
Jonathan Salwan
CEA, LIST, University of Paris-Saclay, Paris, France
Sébastien Bardin
University of Grenoble Alpes, 38000, Grenoble, France
Marie-Laure Potet

Corresponding author

Correspondence to Sébastien Bardin.

Editor information

Editors and Affiliations

Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
Cristiano Giuffrida
CEA, Palaiseau, France
Sébastien Bardin
Université Paris-Saclay, Evry, France
Gregory Blanc

A Detailed experiments

Table 6. Average of all algorithms per protection

Table 7. Average of all protections per hash function

Cite this paper

Salwan, J., Bardin, S., Potet, M.-L. (2018). Symbolic Deobfuscation: From Virtualized Code Back to the Original. In: Giuffrida, C., Bardin, S., Blanc, G. (eds) Detection of Intrusions and Malware, and Vulnerability Assessment. DIMVA 2018. Lecture Notes in Computer Science, vol 10885. Springer, Cham. https://doi.org/10.1007/978-3-319-93411-2_17

DOI: https://doi.org/10.1007/978-3-319-93411-2_17
Published: 08 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93410-5
Online ISBN: 978-3-319-93411-2
eBook Packages: Computer Science, Computer Science (R0)
