Mathematical Geosciences, Volume 43, Issue 8, pp 971–993

Dependence of Bayesian Model Selection Criteria and Fisher Information Matrix on Sample Size

  • Dan Lu
  • Ming Ye
  • Shlomo P. Neuman


Geostatistical analyses require that the covariance structure of a random field and its parameters be estimated jointly from noisy data. Whereas in some cases (as with a Matérn variogram) a range of structural models can be captured with one or a few parameters, in many other cases it is necessary to consider a discrete set of structural model alternatives, such as drifts and variograms. Ranking these alternatives and identifying the best among them has traditionally been done with the aid of information-theoretic or Bayesian model selection criteria. There is an ongoing debate in the literature about the relative merits of these criteria. We contribute to this discussion by using synthetic data to compare the abilities of two common Bayesian criteria, BIC and KIC, to discriminate between alternative models of drift as a function of sample size when drift and variogram parameters are unknown. Taking the results of Markov chain Monte Carlo simulations as a reference, we confirm that KIC reduces asymptotically to BIC and that it provides consistently more reliable indications of model quality than does BIC for samples of all sizes. Practical considerations often cause analysts to replace the observed Fisher information matrix entering into KIC with its expected value. Our results show that this causes the performance of KIC to deteriorate with diminishing sample size. These results hold equally for single and multiple realizations of the uncertain data entering into our analysis. Bayesian theory indicates that, in the case of statistically independent and identically distributed data, posterior model probabilities become asymptotically insensitive to prior probabilities as sample size increases. We do not find this to be the case when working with samples taken from an autocorrelated random field.
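To make the relation between the two criteria concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the commonly used forms BIC = −2 ln L + N_k ln N and KIC = −2 ln L − 2 ln p(θ̂) − N_k ln 2π + ln|F|, where F is the (unnormalized) Fisher information matrix at the maximum likelihood estimate, and it converts criterion values to posterior model probabilities by weighting exp(−ΔIC/2) with the prior model probabilities. The function names and the numerical inputs are hypothetical and chosen only for illustration.

```python
import numpy as np

def bic(neg2_log_lik, n_params, n_obs):
    """BIC = -2 ln L + N_k ln N (assumed standard form)."""
    return neg2_log_lik + n_params * np.log(n_obs)

def kic(neg2_log_lik, log_prior_at_mle, fisher):
    """KIC = -2 ln L - 2 ln p(theta_hat) - N_k ln(2 pi) + ln|F|.

    `fisher` is the unnormalized Fisher information matrix evaluated at
    the maximum likelihood estimate (observed or expected, as supplied).
    """
    fisher = np.atleast_2d(np.asarray(fisher, dtype=float))
    n_params = fisher.shape[0]
    sign, logdet = np.linalg.slogdet(fisher)
    if sign <= 0:
        raise ValueError("Fisher information must be positive definite")
    return (neg2_log_lik - 2.0 * log_prior_at_mle
            - n_params * np.log(2.0 * np.pi) + logdet)

def posterior_model_probs(criteria, priors=None):
    """Posterior model probabilities from criterion values and priors."""
    criteria = np.asarray(criteria, dtype=float)
    if priors is None:
        # Equal prior probability for every model when none are given.
        priors = np.full(criteria.shape, 1.0 / criteria.size)
    priors = np.asarray(priors, dtype=float)
    # Subtract the minimum before exponentiating for numerical stability.
    delta = criteria - criteria.min()
    weights = np.exp(-0.5 * delta) * priors
    return weights / weights.sum()

# Hypothetical case: two parameters, flat prior density of 1 at the
# estimate, and Fisher information growing linearly with sample size.
n_obs = 50
gap = kic(100.0, 0.0, n_obs * np.eye(2)) - bic(100.0, 2, n_obs)
probs = posterior_model_probs([10.0, 12.0])
```

In this hypothetical case the gap between KIC and BIC equals −2 ln 2π regardless of `n_obs`, so it stays bounded while the N_k ln N penalty grows; this is the sense in which KIC reduces to BIC for large samples. Passing unequal `priors` to `posterior_model_probs` shows how prior model probabilities shift the ranking when the criterion values are close.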


  • Model uncertainty
  • Model selection
  • Variogram models
  • Drift models
  • Prior model probability
  • Asymptotic analysis





Copyright information

© International Association for Mathematical Geosciences 2011

Authors and Affiliations

  1. Department of Scientific Computing, Florida State University, Tallahassee, USA
  2. Department of Hydrology and Water Resources, University of Arizona, Tucson, USA
