Abstract
This chapter discusses how to choose the best computer model for simulating a real-world phenomenon by validating the model’s output against historical, real-world data. Four families of techniques used in validation are discussed. The first is based on comparing statistical summaries of the historical data and the model output. The second applies where the models and data are stochastic, so that distributions of variables must be compared and a metric is used to measure their closeness. After exploring the desirable properties of such a measure, the chapter compares the third and fourth methods, both from information theory, for measuring the closeness of patterns, using an example from strategic market competition. The techniques can, however, be used for validating computer models in any domain.
Notes
- 1.
For an overview of types of computer simulation modelling, see Gilbert and Troitzsch (2005).
- 2.
We distinguish between the broader “measure” and the narrower “metric” (a metric is a measure, but a measure is not necessarily a metric), as discussed in Sect. 13.2 below.
- 3.
- 4.
As Guerini and Moneta (2017) observe, the appearance of many measures to validate agent-based simulation models is an indication of “the vitality of the agent-based community.”
- 5.
This chapter, in effect, focuses on techniques of output validation (see Chap. 30, Sects. 4.2, 5.1 and 5.2 by Fagiolo et al. in this volume), going into greater detail about three of the six measures they discuss.
- 6.
This chapter puts the work of Marks (2013) into a wider context.
- 7.
A temperature of 100 K is twice as hot as 50 K, but 100 \(^{\circ }\)C is not twice as hot as 50 \(^{\circ }\)C: K is a ratio scale, but \(^{\circ }\)C is only an interval scale (which answers “by how much?”); “hotter” and “colder” form only an ordered scale.
- 8.
Lacking only symmetry, it is a quasi-metric; lacking only the identity of indiscernibles, it is a semi-metric; lacking only the triangle inequality, it is a pseudo-metric.
- 9.
Guerini and Moneta (2017) present a new method of validation, based on comparing structures of vector autoregressive models estimated from both model and historical data.
- 10.
- 11.
It is a semi-quasi-metric.
- 12.
The K-L measure is defined only if \(p_i = 0\) whenever \(\pi_i = 0\); the measure is written out explicitly in the first display following these Notes.
- 13.
As Akaike (1973) first showed, the negative of K-L information is Boltzmann’s entropy, so minimizing the K-L distance is equivalent to maximizing the entropy; hence the term “maximum entropy principle.” But, as Burnham and Anderson (2002) point out, the entropy is maximized subject to a constraint: the model of the information in the data. A good model contains the information in the historical data, leaving only “noise,” and it is this noise (or entropy, or uncertainty) that is maximized under the maximum entropy principle. Minimizing K-L information loss then results in an approximating model g that loses a minimum amount of the information in the data f. The K-L information loss is averaged negative entropy, hence the expectation with respect to f. Fagiolo et al. (2007, p. 211) note further that “K-L distance can be an arbitrarily bad choice from a decision-theoretic perspective ... if the set of models does not contain the true underlying model ... then we will not want to select a model based on K-L distance.” This is because “K-L distance looks for where models make the most different predictions—even if these differences concern aspects of the data behaviour that are unimportant to us.”
- 14.
Although, as Lamperti (2018b) points out, so long as the simulated data are always compared with the historical data, and not with simulated data from other models, GSL-div might still allow model choice.
- 15.
The three models differ in more than the frequencies of the eight states (Table 13.1): each model contains three distinct mappings from state to action, and, as deterministic finite automata (Marks 1992), they are ergodic, with emergent periodicities. Model A has a period of 13 weeks, Model B of 6 weeks, and Model C of 8 weeks. It is not clear that the historical data exhibit ergodicity, absence of which will make simulation initial conditions significant (Fagiolo et al. 2007). Initial conditions might determine the periodicity of the simulation model.
- 16.
- 17.
Figures 2 and 3 of Marks (2013) plot these behaviours. State 000 corresponds to all three players choosing High prices; State 001 corresponds to Players 1 and 2 choosing High prices and Player 3 choosing a Low price, etc.
- 18.
This number was determined by a Monte Carlo bootstrap simulation of 100,000 pairs of sets of four quasi-random time series, calculating the SSM between each pair of sets and examining the resulting distribution. The lowest observed SSM of 64 appeared twice, that is, with a frequency of 2/100,000, or 0.002 percent. (A sketch of this kind of bootstrap appears after these Notes.)
- 19.
See further discussion in Marks (2013), Appendix 2.
- 20.
It is not correct to call the function r a possibility distribution function, since it does not distribute any fixed value among the elements of the set X: \(1 \le \sum_{x \in X} r(x) \le |X|\).
- 21.
It might be objected that this reordering loses information. But this overlooks the fact that the order of the states is arbitrary. It should not be forgotten that the definition of the states with more than 1 week’s memory captures dynamic elements of interaction.
- 22.
Normalization here means \(r_1 = 1\), not \(\sum r_i = 1\).
- 23.
For clarity, we have included the \((i=1)\)th element, \((r_1 - r_2) \log_2 1\), which is always zero, by construction, consistent with Eq. (13.2).
- 24.
We could also define a normalized GHM; a candidate form is sketched in the last display following these Notes.
- 25.
Exploration of these differences awaits further research.
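Supplement to Notes 12 and 13 (a sketch, not the chapter’s own equation): in the standard discrete form, with \(p\) and \(\pi\) the two distributions over the same set of states as in Note 12, the K-L information is
\[ D(p \,\Vert\, \pi) \;=\; \sum_i p_i \log_2 \frac{p_i}{\pi_i} \;=\; -H(p) - \sum_i p_i \log_2 \pi_i , \qquad H(p) = -\sum_i p_i \log_2 p_i , \]
which is finite only if \(p_i = 0\) whenever \(\pi_i = 0\) (Note 12). For a fixed \(p\), minimizing \(D\) over candidate models \(\pi\) is therefore equivalent to maximizing the expected log-likelihood \(\sum_i p_i \log_2 \pi_i\), which underlies the model-selection use of K-L information discussed in Note 13. The base-2 logarithm is an assumption of convenience here; changing the base only rescales the measure by a constant.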
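Supplement to Notes 17 and 18: the Python sketch below illustrates the kind of Monte Carlo bootstrap described in Note 18. The series length, the pooled-count stand-in for the State Similarity Measure (SSM), and the function names are illustrative assumptions only; the actual SSM of Marks (2010, 2013) is computed from windowed state counts of the four series, so this sketch reproduces the bootstrap logic but not the published threshold of 64.

```python
import random
from collections import Counter

N_STATES = 8      # the eight joint pricing states of Table 13.1; state k read as a 3-bit
                  # string, e.g. 0 = 000 = all three players pricing High (Note 17)
SERIES_LEN = 50   # length of each weekly series (assumed for illustration)
SET_SIZE = 4      # each set holds four series, as in Note 18

def random_series(length=SERIES_LEN):
    """One quasi-random time series of states 0..7 (placeholder generator)."""
    return [random.randrange(N_STATES) for _ in range(length)]

def state_counts(series_set):
    """Pool a set's series and count the visits to each state."""
    counts = Counter()
    for series in series_set:
        counts.update(series)
    return [counts[k] for k in range(N_STATES)]

def ssm_stand_in(set_a, set_b):
    """City-block distance between the two sets' pooled state counts: a
    simplified stand-in for the SSM, which uses windowed counts."""
    ca, cb = state_counts(set_a), state_counts(set_b)
    return sum(abs(a - b) for a, b in zip(ca, cb))

def bootstrap_distribution(n_trials):
    """Distribution of the stand-in SSM over pairs of random sets; a
    significance threshold can be read off its lower tail (Note 18)."""
    scores = Counter()
    for _ in range(n_trials):
        set_a = [random_series() for _ in range(SET_SIZE)]
        set_b = [random_series() for _ in range(SET_SIZE)]
        scores[ssm_stand_in(set_a, set_b)] += 1
    return scores

if __name__ == "__main__":
    dist = bootstrap_distribution(10_000)   # Note 18 uses 100,000 pairs
    lowest = min(dist)
    print(f"lowest score {lowest} occurred {dist[lowest]} times in 10,000 trials")
```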
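Supplement to Notes 20–24 (a sketch under the assumption that Eq. (13.2) takes the standard form of Klir’s (2006) generalized Hartley measure, or U-uncertainty; the chapter’s own symbols may differ): with the possibility values reordered so that \(1 = r_1 \ge r_2 \ge \dots \ge r_n\) (Notes 21 and 22) and with \(r_{n+1} := 0\),
\[ \mathrm{GHM}(r) \;=\; \sum_{i=1}^{n} (r_i - r_{i+1}) \log_2 i , \]
in which the \(i = 1\) term \((r_1 - r_2)\log_2 1\) is identically zero (Note 23). Because each \(r_i \in [0,1]\) and \(r_1 = 1\), the sum \(\sum_{x \in X} r(x)\) lies between 1 and \(|X| = n\) (Note 20). The maximum value \(\log_2 n\) is attained when \(r_i = 1\) for all \(i\), so one normalized GHM (Note 24) would simply divide by \(\log_2 n\).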
References
Akaike, H. (1973). Information theory as an extension of the maximum likelihood principle. In B. N. Petrov, & F. Csaki (Eds.), Second International Symposium on Information Theory (pp. 267–281). Budapest: Akademiai Kiado.
Burnham, K. P., & Anderson, D. R. (2002). Model selection and multimodel inference: A practical information-theoretic approach (2nd ed.). New York: Springer.
Chen, S.-H., Chang, C.-L., & Du, Y.-R. (2012). Agent-based economic models and econometrics. The Knowledge Engineering Review, 27(2), 187–219.
Fagiolo, G., Moneta, A., & Windrum, P. (2007). A critical guide to empirical validation of agent-based models in economics: Methodologies, procedures, and open problems. Computational Economics, 30(3), 195–226.
Fagiolo, G., Guerini, M., Lamperti, F., Moneta, A., & Roventini, A. (2019). Validation of agent-based models in economics and finance. In C. Beisbart, & N. J. Saam (Eds.), Computer simulation validation (pp. 763–787). Cham: Springer.
Ferson, S., Oberkampf, W. L., & Ginzburg, L. (2008). Model validation and predictive capability for the thermal challenge problem. Computer Methods in Applied Mechanics and Engineering, 197, 2408–2430.
Gilbert, N., & Troitzsch, K. G. (2005). Simulation for the social scientist (2nd ed.). Open University Press.
Guerini, M., & Moneta, A. (2017). A method for agent-based models validation. Journal of Economic Dynamics & Control, 82, 125–141.
Hartley, R. V. L. (1928). Transmission of information. The Bell System Technical Journal, 7(3), 535–563.
Klir, G. J. (2006). Uncertainty and information: Foundations of generalized information theory. New York: Wiley.
Krause, E. F. (1986). Taxicab geometry: An adventure in non-Euclidean geometry. New York: Dover. (First published by Addison-Wesley in 1975.)
Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics, 22, 79–86.
Lamperti, F. (2018a). An information theoretic criterion for empirical validation of simulation models. Econometrics and Statistics, 5, 83–106.
Lamperti, F. (2018b). Empirical validation of simulated models through the GSL-div: An illustrative application. Journal of Economic Interaction and Coordination, 13, 143–171.
Lin, J. (1991). Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory, 37(1), 145–151.
Liu, Y., Chen, W., Arendt, P., & Huang, H.-Z. (2010). Towards a better understanding of model validation metrics. In 13th AIAA/ISSMO Multidisciplinary Analysis Optimization Conference.
Mankin, J. B., O’Neill, R. V., Shugart, H. H., & Rust, B. W. (1977). The importance of validation in ecosystem analysis. In G. S. Innis (Ed.), New directions in the analysis of ecological systems, Part 1 (Simulation Council Proceedings Series, Vol. 5, pp. 63–71). La Jolla, California: Simulation Councils. Reprinted in H. H. Shugart, & R. V. O’Neill (Eds.), Systems ecology (pp. 309–317). Stroudsburg, Pennsylvania: Dowden, Hutchinson and Ross.
Marks, R. E. (1992). Breeding hybrid strategies: Optimal behaviour for oligopolists. Journal of Evolutionary Economics, 2, 17–38.
Marks, R. E. (2007). Validating simulation models: A general framework and four applied examples. Computational Economics, 30(3), 265–290. http://www.agsm.edu.au/bobm/papers/s1.pdf.
Marks, R. E. (2010). Comparing two sets of time-series: The state similarity measure. In 2010 Joint Statistical Meetings Proceedings-Statistics: A Key to Innovation in a Data-centric World, Statistical Computing Section (pp. 539–551). Alexandria, VA: American Statistical Association.
Marks, R. E. (2013). Validation and model selection: Three similarity measures compared. Complexity Economics, 2(1), 41–61. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.401.6982&rep=rep1&type=pdf.
Marks, R. E. (2016). Monte Carlo. In D. Teece, & M. Augier (Eds.), The Palgrave encyclopedia of strategic management. London: Palgrave.
Marks, R. E., Midgley, D. F., & Cooper, L. G. (1995). Adaptive behavior in an oligopoly. In J. Biethahn, & V. Nissen (Eds.), Evolutionary algorithms in management applications (pp. 225–239). Berlin: Springer.
Midgley, D. F., Marks, R. E., & Cooper, L. G. (1997). Breeding competitive strategies. Management Science, 43(3), 257–275.
Midgley, D. F., Marks, R. E., & Kunchamwar, D. (2007). The building and assurance of agent-based models: An example and challenge to the field. Journal of Business Research, 60(8), 884–893. (Special Issue: Complexities in Markets).
Oberkampf, W. L., & Roy, C. J. (2010). Model accuracy assessment (Chap. 12). In Verification and validation in scientific computing (pp. 469–554). Cambridge: Cambridge University Press.
Ramer, A. (1989). Conditional possibility measures. International Journal of Cybernetics and Systems, 20, 233–247. Reprinted in D. Dubois, H. Prade, & R. R. Yager, (Eds.). (1993). Readings in fuzzy sets for intelligent systems (pp. 233–240). San Mateo, California: Morgan Kaufmann Publishers.
Rényi, A. (1970). Probability theory. Amsterdam: North-Holland (Chapter 9, Introduction to information theory, pp. 540–616).
Roy, C. J., & Oberkampf, W. L. (2011). A comprehensive framework for verification, validation, and uncertainty quantification in scientific computing. Computer Methods in Applied Mechanics and Engineering, 200, 2131–2144.
Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379–423, 623–656.
Acknowledgements
I should like to thank Dan MacKinlay for his mention of the K-L information loss measure, Arthur Ramer for his mention of the Hartley or U-uncertainty metric and his suggestions, and Vessela Daskalova for her mention of the “cityblock” metric. The efforts of the editors of this volume and anonymous referees were very constructive, and have greatly improved this chapter’s presentation.
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Marks, R.E. (2019). Validation Metrics: A Case for Pattern-Based Methods. In: Beisbart, C., Saam, N. (eds) Computer Simulation Validation. Simulation Foundations, Methods and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-70766-2_13
DOI: https://doi.org/10.1007/978-3-319-70766-2_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70765-5
Online ISBN: 978-3-319-70766-2