Advertisement

Abstract

This paper addresses the problem of recovering variable-like entities when analyzing executables in the absence of debugging information. We show that variable-like entities can be recovered by iterating Value-Set Analysis (VSA), a combined numeric-analysis and pointer-analysis algorithm, and Aggregate Structure Identification, an algorithm to identify the structure of aggregates. Our initial experiments show that the technique is successful in correctly identifying 88% of the local variables and 89% of the fields of heap-allocated objects. Previous techniques recovered 83% of the local variables, but 0% of the fields of heap-allocated objects. Moreover, the values computed by VSA using the variables recovered by our algorithm would allow any subsequent analysis to do a better job of interpreting instructions that use indirect addressing to access arrays and heap-allocated data objects: indirect operands can be resolved better at 4% to 39% of the sites of writes and up to 8% of the sites of reads. (These are the memory-access operations for which it is the most difficult for an analyzer to obtain useful results.)

Keywords

Memory Location Array Element Program Point Assembly Code Address Expression 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Aigner, G., Hölzle, U.: Eliminating virtual function calls in C++ programs. In: Cointe, P. (ed.) ECOOP 1996. LNCS, vol. 1098, Springer, Heidelberg (1996)Google Scholar
  2. 2.
    Amme, W., Braun, P., Zehendner, E., Thomasset, F.: Data dependence analysis of assembly code. In: Int. J. Parallel Proc (2000)Google Scholar
  3. 3.
    Backes, W.: Programmanalyse des XRTL Zwischencodes. PhD thesis, Universitaet des Saarlandes (in German) (2004)Google Scholar
  4. 4.
    Balakrishnan, G., Reps, T.: Analyzing memory accesses in x86 executables. In: Duesterwald, E. (ed.) CC 2004. LNCS, vol. 2985, Springer, Heidelberg (2004)Google Scholar
  5. 5.
    Balakrishnan, G., Reps, T.: Recency-abstraction for heap-allocated storage. In: Yi, K. (ed.) SAS 2006. LNCS, vol. 4134, pp. 29–31. Springer, Heidelberg (2006)CrossRefGoogle Scholar
  6. 6.
    Bergeron, J., Debbabi, M., Desharnais, J., Erhioui, M.M., Lavoie, Y., Tawbi, N.: Static detection of malicious code in executable programs. Int. J. of Req. Eng. (2001)Google Scholar
  7. 7.
    Bourdoncle, F.: Efficient chaotic iteration strategies with widenings. In: Int. Conf. on Formal Methods in Prog. and their Appl. (1993)Google Scholar
  8. 8.
    Cifuentes, C., Fraboulet, A.: Intraprocedural static slicing of binary executables. In: ICSM, pp. 188–195 (1997)Google Scholar
  9. 9.
    Cifuentes, C., Simon, D., Fraboulet, A.: Assembly to high-level language translation. In: ICSM (1998)Google Scholar
  10. 10.
    Cousot, P., Cousot, R.: Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: POPL (1977)Google Scholar
  11. 11.
    Debray, S.K., Muth, R., Weippert, M.: Alias analysis of executable code. In: POPL (1998)Google Scholar
  12. 12.
    Eidorff, P.H., Henglein, F., Mossin, C., Niss, H., Sørensen, M.H., Tofte, M.: Anno Domini: From type theory to year 2000 conversion tool. In: POPL (1999)Google Scholar
  13. 13.
    Gopan, D., Reps, T.: Lookahead widening. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 17–20. Springer, Heidelberg (2006)CrossRefGoogle Scholar
  14. 14.
    Guo, B., Bridges, M.J., Triantafyllis, S., Ottoni, G., Raman, E., August, D.I.: Practical and accurate low-level pointer analysis. In: Int. Symp. on Code Gen. and Opt. (2005)Google Scholar
  15. 15.
  16. 16.
    Lal, A., Reps, T., Balakrishnan, G.: Extended weighted pushdown systems. In: Etessami, K., Rajamani, S.K. (eds.) CAV 2005. LNCS, vol. 3576, pp. 6–10. Springer, Heidelberg (2005)Google Scholar
  17. 17.
    Larus, J.R., Schnarr, E.: EEL: Machine-independent executable editing. In: PLDI (1995)Google Scholar
  18. 18.
    Miné, A.M.: Field-sensitive value analysis of embedded C programs with union types and pointer arithmetics. In: LCTES (2006)Google Scholar
  19. 19.
    Müller-Olm, M., Seidl, H.: Analysis of modular arithmetic. In: Sagiv, M. (ed.) ESOP 2005. LNCS, vol. 3444, Springer, Heidelberg (2005)Google Scholar
  20. 20.
    Mycroft, A.: Type-based decompilation. In: Swierstra, S.D. (ed.) ESOP 1999 and ETAPS 1999. LNCS, vol. 1576, Springer, Heidelberg (1999)CrossRefGoogle Scholar
  21. 21.
    O’Callahan, R., Jackson, D.: Lackwit: A program understanding tool based on type inference. In: Int. Conf. on Softw. Eng. (1997)Google Scholar
  22. 22.
    Pande, H., Ryder, B.: Data-flow-based virtual function resolution. In: Cousot, R., Schmidt, D.A. (eds.) SAS 1996. LNCS, vol. 1145, Springer, Heidelberg (1996)Google Scholar
  23. 23.
    Ramalingam, G., Field, J., Tip, F.: Aggregate structure identification and its application to program analysis. In: POPL (1999)Google Scholar
  24. 24.
    Reps, T., Balakrishnan, G., Lim, J.: Intermediate representation recovery from low-level code. In: PEPM (2006)Google Scholar
  25. 25.
    Reps, T., Balakrishnan, G., Lim, J., Teitelbaum, T.: A next-generation platform for analyzing executables. In: Yi, K. (ed.) APLAS 2005. LNCS, vol. 3780, Springer, Heidelberg (2005)CrossRefGoogle Scholar
  26. 26.
    Rival, X.: Abstract interpretation based certification of assembly code. In: Zuck, L.D., Attie, P.C., Cortesi, A., Mukhopadhyay, S. (eds.) VMCAI 2003. LNCS, vol. 2575, Springer, Heidelberg (2002)Google Scholar
  27. 27.
    Sharir, M., Pnueli, A.: Two approaches to interprocedural data flow analysis. In: (ed.) Program Flow Analysis: Theory and Applications, chapter 7, pp. 189–234. Prentice-Hall, Englewood Cliffs (1981)Google Scholar
  28. 28.
    Siff, M., Reps, T.W.: Program generalization for software reuse: From C to C++. In: Found. of Softw. Eng. (1996)Google Scholar
  29. 29.
    Srivastava, A., Eustace, A.: ATOM - A system for building customized program analysis tools. In: PLDI (1994)Google Scholar
  30. 30.
    van, D.A.: Moonen, L.: Type inference for COBOL systems. In: WCRE (1998)Google Scholar
  31. 31.
    Wilson, R.P., Lam, M.S.: Efficient context-sensitive pointer analysis for C programs. In: PLDI (1995)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Gogul Balakrishnan
    • 1
  • Thomas Reps
    • 1
  1. 1.Comp. Sci. Dept., University of Wisconsin 

Personalised recommendations