Software & Systems Modeling, Volume 16, Issue 2, pp 523–557

Using contexts to extract models from code

  • Lucio Mauro Duarte
  • Jeff Kramer
  • Sebastian Uchitel
Regular Paper


Abstract

Behaviour models facilitate the understanding and analysis of software systems by providing an abstract view of their behaviours and by enabling the use of validation and verification techniques to detect errors. However, depending on the size and complexity of these systems, constructing models may not be a trivial task, even for experienced developers. Model extraction techniques can automatically obtain models from existing code, thus reducing the effort and expertise required of engineers and helping avoid errors often present in manually constructed models. Existing approaches for model extraction often fail to produce faithful models: either they consider only static information, which may include infeasible behaviours, or they are based only on dynamic information and thus rely on observed executions, which usually results in incomplete models. This paper describes a model extraction approach based on the concept of contexts, which are abstractions of concrete states of a program, combining static and dynamic information. Contexts merge some of the advantages of each type of information and, through their combination, can overcome some of their problems. The approach is partially implemented by a tool called LTS Extractor, which translates information collected from execution traces produced by instrumented Java code into labelled transition systems (LTS), which can be analysed in an existing verification tool. Results from case studies are presented and discussed, showing that, for a given level of abstraction and a set of execution traces, the produced models are correct descriptions of the programs from which they were extracted. Thus, they can be used for a variety of analyses, such as program understanding, validation, verification, and evolution.


Keywords: Behaviour models · Model extraction · Model analysis
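The core mechanism the abstract describes, merging context-annotated execution traces into a single labelled transition system, can be illustrated with a minimal sketch. This is a hypothetical illustration, not the LTS Extractor's actual implementation: the trace format (triples of source context, action, target context) and all names (`LtsSketch`, `addTraceEvent`, `actionsFrom`) are assumptions made for the example.

```java
import java.util.*;

// Hypothetical sketch of merging context-annotated trace events into an LTS.
// A "context" stands in for the paper's abstraction of a concrete program
// state; events that share a context are merged into the same model state.
public class LtsSketch {
    // source context -> (action label -> target context)
    private final Map<String, Map<String, String>> transitions = new HashMap<>();

    // Record one trace event: in context `source`, action `action`
    // was observed, leading to context `target`.
    public void addTraceEvent(String source, String action, String target) {
        transitions.computeIfAbsent(source, k -> new HashMap<>())
                   .put(action, target);
    }

    // Actions enabled in a given context, across all traces seen so far.
    public Set<String> actionsFrom(String context) {
        return transitions.getOrDefault(context, Map.of()).keySet();
    }

    public static void main(String[] args) {
        LtsSketch lts = new LtsSketch();
        // Events from different traces sharing contexts c0/c1 are merged.
        lts.addTraceEvent("c0", "open", "c1");
        lts.addTraceEvent("c1", "read", "c1");
        lts.addTraceEvent("c1", "close", "c0");
        System.out.println(new TreeSet<>(lts.actionsFrom("c1"))); // [close, read]
    }
}
```

Because traces are keyed by context rather than by position in a run, behaviours observed in separate executions are combined into one model, which is how a dynamic approach can gain coverage beyond any single observed execution.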



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Lucio Mauro Duarte (1)
  • Jeff Kramer (2)
  • Sebastian Uchitel (2)

  1. Institute of Informatics, Federal University of Rio Grande do Sul (UFRGS), Porto Alegre, Brazil
  2. Department of Computing, Imperial College London, London, UK
