Schema Evolution Survival Guide for Tables: Avoid Rigid Childhood and You’re En Route to a Quiet Life

  • Original Article
Journal on Data Semantics

Abstract

In this paper, we study the factors that relate to the survival of a table in the context of schema evolution in open-source software. We study the schema histories of eight open-source software projects that include relational databases and extract patterns related to the survival or death of their tables. Our study shows that the probability of a table with a wide schema (i.e., a large number of attributes) being removed is systematically lower than average. Activity and duration are related to survival too. Rigid tables, without any change to their schema, are more likely to be removed than tables that sustain changes. The durations of dead and survivor tables are mirror images of each other: dead tables' durations are mostly short, whereas survivor tables gravitate toward higher durations. Our findings are mostly summarized by a pattern, which we call the electrolysis pattern due to its diagrammatic representation, stating that dead and survivor tables live quite different lives: dead tables typically die shortly after birth, with short durations and mostly no updates, whereas survivors mostly live quiet lives with few updates, except for a small group of tables with high update ratios that are characterized by high durations and survival. Equally important is the evidence that schema evolution is antagonized by gravitation to rigidity, i.e., the tendency to minimize evolution as much as possible in order to minimize the resulting impact on the surrounding code. Several observations support this: the absence of long durations in removed tables, the low percentage of tables whose schema size is scaled up or down, and the low number of tables with a high rate of updates, contrasted with the high number of tables with zero or few updates. We complement our findings with explanations and recommendations to developers.

Notes

  1. Rigidity is used in its software engineering meaning, referring to software that is hard to evolve and maintain.

  2. http://www.cs.uoi.gr/~pvassil/projects/schemaBiographies/.

  3. An astute reader might ask whether it would be better to gather all the tables in one single set and average over them. We disagree: each data set comes with its own requirements, development style and idiosyncrasy, and putting all tables in a single data set not only heavily favors large data sets, but also mixes different things. We average the behavior of schemata, not tables, here.
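
The difference that Note 3 draws between averaging over schemata and averaging over one pooled set of tables can be illustrated with a minimal sketch. The data set names and counts below are hypothetical, invented purely for illustration, and are not taken from the study; the function names are likewise assumptions of this sketch.

```python
from statistics import mean

# Hypothetical counts, invented for illustration only (not data from the paper):
# data set name -> (tables removed, tables in total)
datasets = {
    "small_project": (2, 10),
    "large_project": (10, 200),
}


def per_schema_average(counts):
    """Average the removal ratio of each data set, so every schema weighs the same."""
    return mean(removed / total for removed, total in counts.values())


def pooled_ratio(counts):
    """Pool all tables into a single set, so larger data sets dominate the result."""
    removed = sum(r for r, _ in counts.values())
    total = sum(t for _, t in counts.values())
    return removed / total


print(per_schema_average(datasets))  # mean of 2/10 and 10/200
print(pooled_ratio(datasets))        # 12 removed tables out of 210
```

With these made-up numbers, the per-schema average lies halfway between the two projects' ratios, while the pooled ratio stays close to the ratio of the large project alone, which is the bias toward large data sets that the note warns about.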

Author information

Corresponding author

Correspondence to Panos Vassiliadis.

About this article

Cite this article

Vassiliadis, P., Zarras, A.V. Schema Evolution Survival Guide for Tables: Avoid Rigid Childhood and You’re En Route to a Quiet Life. J Data Semant 6, 221–241 (2017). https://doi.org/10.1007/s13740-017-0083-x

  • DOI: https://doi.org/10.1007/s13740-017-0083-x
