# Sampling Strategies for Uncertainty Reduction in Categorical Random Fields: Formulation, Mathematical Analysis and Application to Multiple-Point Simulations


## Abstract

The task of optimal sampling for the statistical simulation of a discrete random field is addressed from the perspective of minimizing the posterior uncertainty of non-sensed positions given the information of the sensed positions. In particular, information-theoretic measures are adopted to formalize the problem of optimal sampling design for field characterization, where concepts such as information of the measurements, average posterior uncertainty, and the resolvability of the field are introduced. The use of entropy and related information measures is justified by connecting the task of simulation with a source coding problem, where it is well known that entropy offers a fundamental performance limit. On the application side, a one-dimensional Markov chain model is explored where the statistics of the random object are known, and then the more relevant case of multiple-point simulations of channelized facies fields is studied, adopting in this case a training image to infer the statistics of a non-parametric model. In both contexts, the superiority of information-driven sampling strategies over random or regular sampling is demonstrated in different settings and conditions.
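To make the one-dimensional Markov chain setting concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a stationary two-state chain with a hypothetical transition matrix `P`, uses the chain rule identity H(X_unsensed | X_sensed) = H(X) - H(X_sensed), and picks sensing positions greedily by maximizing the entropy of the sensed subset. The function names, the greedy rule, and all numeric values are illustrative assumptions.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def stationary(P):
    """Stationary distribution: left eigenvector of P for eigenvalue 1."""
    vals, vecs = np.linalg.eig(P.T)
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return v / v.sum()

def sample_entropy(P, pi, positions):
    """H(X_S) in bits for a stationary Markov chain observed at `positions`.

    A subsequence of a Markov chain is itself Markov, with transition
    matrix P**gap between consecutive sensed positions.
    """
    pos = sorted(positions)
    if not pos:
        return 0.0
    h = entropy(pi)
    for a, b in zip(pos, pos[1:]):
        Pd = np.linalg.matrix_power(P, b - a)
        h += sum(pi[i] * entropy(Pd[i]) for i in range(len(pi)))
    return h

def greedy_design(P, pi, n, k):
    """Pick k of n positions, each time adding the one that maximizes H(X_S)."""
    chosen = []
    for _ in range(k):
        best = max((j for j in range(n) if j not in chosen),
                   key=lambda j: sample_entropy(P, pi, chosen + [j]))
        chosen.append(best)
    return sorted(chosen)

# Hypothetical two-facies chain (values chosen only for illustration).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
pi = stationary(P)
n, k = 20, 4

S = greedy_design(P, pi, n, k)
total = sample_entropy(P, pi, range(n))        # H(X) for the whole chain
posterior = total - sample_entropy(P, pi, S)   # H(X_unsensed | X_sensed)
print("greedy positions:", S)
print("posterior uncertainty (bits):", round(posterior, 3))
```

The same `sample_entropy` function can score a random or a regular design, so that designs can be compared on posterior uncertainty. The greedy rule is a heuristic, and nothing in this sketch guarantees that it finds the optimal design.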

## Keywords

Sampling strategies; Optimal sampling design; Information theory; Entropy and conditional entropy; Uncertainty reduction; Multiple-point simulations; Channelized facies models; Geostatistics

## Notes

### Acknowledgements

This material is based on work supported by grants from Conicyt-Chile (PhD Scholarship 2013), Fondecyt Grants 1170854, 1151029 and 1181823, the Biomedical Neuroscience Institute (ICM, P09-015-F), and the Advanced Center for Electrical and Electronic Engineering (AC3E), Basal Project FB0008.
