Abstract
This paper addresses the problem of recovering variable-like entities when analyzing executables in the absence of debugging information. We show that variable-like entities can be recovered by iterating Value-Set Analysis (VSA), a combined numeric-analysis and pointer-analysis algorithm, and Aggregate Structure Identification, an algorithm to identify the structure of aggregates. Our initial experiments show that the technique is successful in correctly identifying 88% of the local variables and 89% of the fields of heap-allocated objects. Previous techniques recovered 83% of the local variables, but 0% of the fields of heap-allocated objects. Moreover, the values computed by VSA using the variables recovered by our algorithm would allow any subsequent analysis to do a better job of interpreting instructions that use indirect addressing to access arrays and heap-allocated data objects: indirect operands can be resolved better at 4% to 39% of the sites of writes and up to 8% of the sites of reads. (These are the memory-access operations for which it is the most difficult for an analyzer to obtain useful results.)
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Aigner, G., Hölzle, U.: Eliminating virtual function calls in C++ programs. In: Cointe, P. (ed.) ECOOP 1996. LNCS, vol. 1098, Springer, Heidelberg (1996)
Amme, W., Braun, P., Zehendner, E., Thomasset, F.: Data dependence analysis of assembly code. In: Int. J. Parallel Proc (2000)
Backes, W.: Programmanalyse des XRTL Zwischencodes. PhD thesis, Universitaet des Saarlandes (in German) (2004)
Balakrishnan, G., Reps, T.: Analyzing memory accesses in x86 executables. In: Duesterwald, E. (ed.) CC 2004. LNCS, vol. 2985, Springer, Heidelberg (2004)
Balakrishnan, G., Reps, T.: Recency-abstraction for heap-allocated storage. In: Yi, K. (ed.) SAS 2006. LNCS, vol. 4134, pp. 29–31. Springer, Heidelberg (2006)
Bergeron, J., Debbabi, M., Desharnais, J., Erhioui, M.M., Lavoie, Y., Tawbi, N.: Static detection of malicious code in executable programs. Int. J. of Req. Eng. (2001)
Bourdoncle, F.: Efficient chaotic iteration strategies with widenings. In: Int. Conf. on Formal Methods in Prog. and their Appl. (1993)
Cifuentes, C., Fraboulet, A.: Intraprocedural static slicing of binary executables. In: ICSM, pp. 188–195 (1997)
Cifuentes, C., Simon, D., Fraboulet, A.: Assembly to high-level language translation. In: ICSM (1998)
Cousot, P., Cousot, R.: Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: POPL (1977)
Debray, S.K., Muth, R., Weippert, M.: Alias analysis of executable code. In: POPL (1998)
Eidorff, P.H., Henglein, F., Mossin, C., Niss, H., Sørensen, M.H., Tofte, M.: Anno Domini: From type theory to year 2000 conversion tool. In: POPL (1999)
Gopan, D., Reps, T.: Lookahead widening. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 17–20. Springer, Heidelberg (2006)
Guo, B., Bridges, M.J., Triantafyllis, S., Ottoni, G., Raman, E., August, D.I.: Practical and accurate low-level pointer analysis. In: Int. Symp. on Code Gen. and Opt. (2005)
IDAPro disassembler, http://www.datarescue.com/idabase/
Lal, A., Reps, T., Balakrishnan, G.: Extended weighted pushdown systems. In: Etessami, K., Rajamani, S.K. (eds.) CAV 2005. LNCS, vol. 3576, pp. 6–10. Springer, Heidelberg (2005)
Larus, J.R., Schnarr, E.: EEL: Machine-independent executable editing. In: PLDI (1995)
Miné, A.M.: Field-sensitive value analysis of embedded C programs with union types and pointer arithmetics. In: LCTES (2006)
Müller-Olm, M., Seidl, H.: Analysis of modular arithmetic. In: Sagiv, M. (ed.) ESOP 2005. LNCS, vol. 3444, Springer, Heidelberg (2005)
Mycroft, A.: Type-based decompilation. In: Swierstra, S.D. (ed.) ESOP 1999 and ETAPS 1999. LNCS, vol. 1576, Springer, Heidelberg (1999)
O’Callahan, R., Jackson, D.: Lackwit: A program understanding tool based on type inference. In: Int. Conf. on Softw. Eng. (1997)
Pande, H., Ryder, B.: Data-flow-based virtual function resolution. In: Cousot, R., Schmidt, D.A. (eds.) SAS 1996. LNCS, vol. 1145, Springer, Heidelberg (1996)
Ramalingam, G., Field, J., Tip, F.: Aggregate structure identification and its application to program analysis. In: POPL (1999)
Reps, T., Balakrishnan, G., Lim, J.: Intermediate representation recovery from low-level code. In: PEPM (2006)
Reps, T., Balakrishnan, G., Lim, J., Teitelbaum, T.: A next-generation platform for analyzing executables. In: Yi, K. (ed.) APLAS 2005. LNCS, vol. 3780, Springer, Heidelberg (2005)
Rival, X.: Abstract interpretation based certification of assembly code. In: Zuck, L.D., Attie, P.C., Cortesi, A., Mukhopadhyay, S. (eds.) VMCAI 2003. LNCS, vol. 2575, Springer, Heidelberg (2002)
Sharir, M., Pnueli, A.: Two approaches to interprocedural data flow analysis. In: (ed.) Program Flow Analysis: Theory and Applications, chapter 7, pp. 189–234. Prentice-Hall, Englewood Cliffs (1981)
Siff, M., Reps, T.W.: Program generalization for software reuse: From C to C++. In: Found. of Softw. Eng. (1996)
Srivastava, A., Eustace, A.: ATOM - A system for building customized program analysis tools. In: PLDI (1994)
van, D.A.: Moonen, L.: Type inference for COBOL systems. In: WCRE (1998)
Wilson, R.P., Lam, M.S.: Efficient context-sensitive pointer analysis for C programs. In: PLDI (1995)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Balakrishnan, G., Reps, T. (2007). DIVINE: DIscovering Variables IN Executables. In: Cook, B., Podelski, A. (eds) Verification, Model Checking, and Abstract Interpretation. VMCAI 2007. Lecture Notes in Computer Science, vol 4349. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69738-1_1
Download citation
DOI: https://doi.org/10.1007/978-3-540-69738-1_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69735-0
Online ISBN: 978-3-540-69738-1
eBook Packages: Computer ScienceComputer Science (R0)