Dynamical Symmetries and Model Validation

  • Conference paper

Part of the book series: Fields Institute Communications (FIC, volume 82)

Abstract

I introduce a new method for validating models—including stochastic models—that gets at the reliability of a model’s predictions under intervention or manipulation of its inputs and not merely at its predictive reliability under passive observation. The method is derived from philosophical work on natural kinds, and turns on comparing the dynamical symmetries of a model with those of its target, where dynamical symmetries are interventions on model variables that commute with time evolution. I demonstrate that this method succeeds in testing aspects of model validity for which few other tools exist.
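
A minimal sketch of what it means for an intervention to commute with time evolution, assuming two textbook growth models in closed form (exponential and logistic). The function names, the parameter values, and the choice of scaling as the candidate intervention are illustrative assumptions, not the chapter's own implementation. The check asks whether intervening and then evolving gives the same state as evolving and then intervening.

```python
import math

def evolve_exponential(x0, t, r=0.5):
    """Closed-form time evolution of dx/dt = r*x (illustrative parameters)."""
    return x0 * math.exp(r * t)

def evolve_logistic(x0, t, r=0.5, K=100.0):
    """Closed-form time evolution of dx/dt = r*x*(1 - x/K) (illustrative parameters)."""
    return K / (1.0 + (K / x0 - 1.0) * math.exp(-r * t))

def scale(x, k=2.0):
    """Candidate intervention (an assumption for this sketch): multiply the state by k."""
    return k * x

def commutes(evolve, intervene, x0, t, rel_tol=1e-9):
    """True if intervene-then-evolve matches evolve-then-intervene at (x0, t)."""
    intervene_first = evolve(intervene(x0), t)
    evolve_first = intervene(evolve(x0, t))
    return math.isclose(intervene_first, evolve_first, rel_tol=rel_tol)

# Probe a small grid of initial states and elapsed times.
grid = [(x0, t) for x0 in (1.0, 5.0, 20.0) for t in (0.5, 1.0, 3.0)]

exp_ok = all(commutes(evolve_exponential, scale, x0, t) for x0, t in grid)
log_ok = all(commutes(evolve_logistic, scale, x0, t) for x0, t in grid)

print("scaling commutes with exponential growth:", exp_ok)
print("scaling commutes with logistic growth:", log_ok)
```

On this grid, scaling commutes with exponential growth but not with logistic growth, so the two models differ in at least one dynamical symmetry even where their fitted curves look alike. A finite grid can only refute commutation, not prove it in general.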

Notes

  1. For an influential engineering perspective, see [1]. For a recent and comprehensive overview of both software and systems modeling aspects from the National Research Council, see [5]. For a pithy and very current overview of verification in the world of software design, see [26]. Finally, for an accessible and illuminating discussion of the state of the art from the perspective of applied mathematics, see [7].

  2. Note that this terminology is at odds with machine learning, where each specific set of parameter values constitutes a model. What I’m calling a model is, in the context of machine learning or statistical learning theory, a space of hypotheses or class of models.

  3. Most textbooks on machine learning include descriptions of cross-validation. An especially lucid presentation can be found in Flach [8, ch. 12].

  4. The cross-validation estimate of the generalization error of a model is biased, but in the direction of overestimating the error (see [11, ch. 7.10]).

  5. Attention to structural validation is curiously discipline-dependent. Concepts such as those pertaining to testing “white-box” models in systems engineering seem to have relatively little penetration in other fields such as ecology. This is probably due in part to the quantity and precision of data available in these different fields. Structural tests tend to be data-hungry or to require manipulations of the target system that are not available to, e.g., field ecologists.

  6. See [3] for a widely cited review.

  7. Balci [1] calls this “stress testing.”

  8. As indicated in [13], I am using the term “intervention” in its technical sense as it appears in the literature on causation. In this context, “…an intervention on X (with respect to Y) is a causal process that directly changes the value of X in such a way that, if a change in the value of Y should occur, it will occur only through the change in the value of X and not in some other way” [27].

  9. In principle, one could take a single long time series for each system and cut it in half to obtain two such curves, but for ease of exposition, I assume the time series are obtained separately.

  10. For an English translation of the French, see [25].

  11. Another equally old and venerable model is that of Gompertz [10]. This model also continues to be deployed for growth modeling.

  12. Note that the initial value of the population, x_0, is fit independently in each case. That’s because, while the other parameters are presumed to be intrinsic features of the growing population, the initial population size is variable and assumed to have different (unknown) values in each case.

  13. This is the line of reasoning presented in [29], where the Gompertz model is favored.

  14. It’s generally possible to determine and fit symmetries numerically, without an analytic, closed-form solution. But since one is available in this case, I use it to simplify the analysis.

  15. This data was obtained from Connelly [6] and is used here with permission (and gratitude). The dataset can be found at https://zenodo.org/record/1171129. I am specifically considering the sixteenth row of the table.

References

  1. Balci O (1994) Validation, verification, and testing techniques throughout the life cycle of a simulation study. Ann Oper Res 53(1):121–173

  2. Barlas Y (1989) Multiple tests for validation of system dynamics type of simulation models. Eur J Oper Res 42(1):59–87

  3. Barlas Y (1996) Formal aspects of model validity and validation in system dynamics. Syst Dyn Rev 12(3):183–210

  4. Buchanan RL, Whiting RC, Damert WC (1997) When is simple good enough: a comparison of the Gompertz, Baranyi, and three-phase linear models for fitting bacterial growth curves. Food Microbiol 14(4):313–326

  5. Committee on Mathematical Foundations of Verification, Validation, and Uncertainty Quantification (2012) Assessing the reliability of complex models: mathematical and statistical foundations of verification, validation, and uncertainty quantification. National Academy Press, Washington

  6. Connelly B (2014) Data set for ‘analyzing microbial growth with R’. https://doi.org/10.5281/zenodo.1171129

  7. Fillion N (2017) The vindication of computer simulations. In: Lenhard J, Carrier M (eds) Mathematics as a tool: tracing new roles of mathematics in the sciences. Springer, Cham, pp 137–155

  8. Flach P (2012) Machine learning: the art and science of algorithms that make sense of data. Cambridge University Press, Cambridge

  9. Fujikawa H, Kai A, Morozumi S (2004) A new logistic model for Escherichia coli growth at constant and dynamic temperatures. Food Microbiol 21(5):501–509

  10. Gompertz B (1825) XXIV. On the nature of the function expressive of the law of human mortality, and on a new mode of determining the value of life contingencies. In a letter to Francis Baily, Esq. F. R. S. &c. Philos Trans R Soc Lond 115:513–583

  11. Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning. Springer series in statistics, 2nd edn. Springer, New York

  12. Higham DJ (2001) An algorithmic introduction to numerical simulation of stochastic differential equations. SIAM Rev 43(3):525–546

  13. Jantzen BC (2014) Projection, symmetry, and natural kinds. Synthese 192(11):3617–3646

  14. Jantzen BC (2017) Dynamical kinds and their discovery. In: Proceedings of the UAI 2016 workshop on causation: foundation to application. arXiv:1612.04933

  15. Ling Y, Mahadevan S (2013) Quantitative model validation techniques: new insights. Reliab Eng Syst Saf 111:217–231

  16. Liu M, Fan M (2017) Permanence of stochastic Lotka–Volterra systems. J Nonlinear Sci 27(2):425–452

  17. McCarthy MA, Broome LS (2000) A method for validating stochastic models of population viability: a case study of the mountain pygmy-possum (Burramys parvus). J Anim Ecol 69(4):599–607

  18. Miller JH (1998) Active nonlinear tests (ANTs) of complex simulation models. Manag Sci 44(6):820–830

  19. Rhinehart RR (2016) Nonlinear regression modeling for engineering applications: modeling, model validation, and enabling design of experiments. Wiley, Hoboken

  20. Skiadas CH (2010) Exact solutions of stochastic differential equations: Gompertz, generalized logistic and revised exponential. Methodol Comput Appl Probab 12(2):261–270

  21. Sokal RR, Rohlf FJ (1994) Biometry: the principles and practices of statistics in biological research, 3rd edn. W. H. Freeman, New York

  22. Spirtes P, Glymour CN, Scheines R (2000) Causation, prediction, and search. Adaptive computation and machine learning, 2nd edn. MIT Press, Cambridge

  23. Tsoularis A, Wallace J (2002) Analysis of logistic growth models. Math Biosci 179(1):21–55

  24. Verhulst PF (1838) Notice sur la loi que la population suit dans son accroissement. Correspondance Mathématique et Physique X:113–121

  25. Vogels M et al (1975) P. F. Verhulst’s ‘Notice sur la loi que la populations suit dans son accroissement’ from correspondence mathematique et physique. Ghent, vol. X, 1838. J Biol Phys 3(4):183–192

  26. Wilcox JR (2018) Research for practice: highlights in systems verification. Commun ACM 61(2):48–49

  27. Woodward J (2001) Law and explanation in biology: invariance is the kind of stability that matters. Philos Sci 68(1):1–20

  28. Zeigler BP, Praehofer H, Kim TG (2000) Theory of modeling and simulation, 2nd edn. Academic, San Diego

  29. Zwietering MH et al (1990) Modeling of the bacterial growth curve. Appl Environ Microbiol 56(6):1875–1881

Acknowledgements

I am grateful to the participants in the 2015 Algorithms and Complexity in Mathematics, Epistemology and Science (ACMES) conference for insightful discussion of an early algorithm for discovering dynamical kinds, to Cosmo Grant for pointing out a physical inconsistency in the first version of one of my examples, and to Nicolas Fillion for helpful comments on a previous draft of this paper. The work presented here was supported by the National Science Foundation under Grant No. 1454190.

Corresponding author

Correspondence to Benjamin C. Jantzen.

Copyright information

© 2019 Springer Science+Business Media, LLC, part of Springer Nature

About this paper

Cite this paper

Jantzen, B.C. (2019). Dynamical Symmetries and Model Validation. In: Fillion, N., Corless, R., Kotsireas, I. (eds) Algorithms and Complexity in Mathematics, Epistemology, and Science. Fields Institute Communications, vol 82. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-9051-1_6
