Abstract

The maintenance and evolution of data-intensive systems should ideally rely on complete and accurate database documentation. Unfortunately, this documentation is often missing or, at best, outdated. Database redocumentation, a process also known as database reverse engineering, then comes to the rescue. This process typically involves the elicitation of implicit schema constructs, that is, data structures and constraints that have been incompletely translated into the operational database schema. In this context, the SQL statements executed by the programs may be a particularly rich source of information. SQL APIs come in two variants, namely static and dynamic. The latter is used intensively in object-oriented and web applications, notably through the ODBC and JDBC APIs. While the static analysis of SQL queries has long been studied, coping with automatically generated SQL statements requires other weapons. This tutorial provides an in-depth exploration of the use of dynamic program analysis as a basis for reverse engineering relational databases. It describes and illustrates several automated techniques that capture the trace of the SQL-related events occurring during the execution of data-intensive programs. It then presents and evaluates several heuristics and techniques supporting the automatic recovery of implicit schema constructs from SQL execution traces. Other applications of SQL execution trace analysis are also identified.
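To make the idea of recovering implicit constructs from an execution trace more concrete, here is a minimal sketch in Python. It is not the chapter's tooling: it assumes SQLite's `set_trace_callback` as a stand-in for interception at the ODBC or JDBC level, and it uses one simple illustrative heuristic (columns repeatedly equated in join conditions are candidate implicit foreign keys). The table names, queries, and the helper names `capture_trace`, `alias_map`, and `candidate_joins` are all hypothetical.

```python
import re
import sqlite3
from collections import Counter

def capture_trace(conn, queries):
    """Run the queries and return every SQL statement SQLite reports executing."""
    trace = []
    conn.set_trace_callback(trace.append)   # called once per executed statement
    for query in queries:
        conn.execute(query).fetchall()
    conn.set_trace_callback(None)           # stop tracing
    return trace

def alias_map(sql):
    """Map each alias (or bare table name) in the FROM clause to its table."""
    match = re.search(r"\bFROM\b(.*?)(?:\bWHERE\b|\bGROUP\b|\bORDER\b|$)",
                      sql, re.IGNORECASE | re.DOTALL)
    aliases = {}
    if not match:
        return aliases
    for item in re.split(r",|\bJOIN\b", match.group(1), flags=re.IGNORECASE):
        item = re.split(r"\bON\b", item, flags=re.IGNORECASE)[0]
        noise = {"AS", "INNER", "LEFT", "RIGHT", "OUTER"}
        words = [w for w in item.split() if w.upper() not in noise]
        if words:
            aliases[words[-1].lower()] = words[0].lower()
    return aliases

def candidate_joins(trace):
    """Count column pairs equated across two different tables in the trace."""
    counts = Counter()
    for sql in trace:
        aliases = alias_map(sql)
        for a1, c1, a2, c2 in re.findall(r"(\w+)\.(\w+)\s*=\s*(\w+)\.(\w+)", sql):
            t1, t2 = aliases.get(a1.lower()), aliases.get(a2.lower())
            if t1 and t2 and t1 != t2:      # skip unresolved or same-table pairs
                pair = tuple(sorted([(t1, c1.lower()), (t2, c2.lower())]))
                counts[pair] += 1
    return counts

# Hypothetical schema: no foreign key is declared between orders and customer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER, name TEXT);
    CREATE TABLE orders (id INTEGER, cust_id INTEGER);
""")
trace = capture_trace(conn, [
    "SELECT * FROM orders o, customer c WHERE o.cust_id = c.id",
    "SELECT c.name FROM customer c JOIN orders o ON o.cust_id = c.id WHERE c.id = 42",
    "SELECT * FROM orders WHERE id = 7",
])
candidates = candidate_joins(trace)
```

A join pair seen in the trace only suggests a relationship between the two columns. Its direction, and whether it really holds as a constraint, would still have to be checked against the data or with a domain expert.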

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Cleve, A., Noughi, N., Hainaut, JL. (2013). Dynamic Program Analysis for Database Reverse Engineering. In: Lämmel, R., Saraiva, J., Visser, J. (eds) Generative and Transformational Techniques in Software Engineering IV. GTTSE 2011. Lecture Notes in Computer Science, vol 7680. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35992-7_8

  • DOI: https://doi.org/10.1007/978-3-642-35992-7_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35991-0

  • Online ISBN: 978-3-642-35992-7

  • eBook Packages: Computer Science, Computer Science (R0)