Parallel heuristic search strategy based on a Bayesian approach for simultaneous recognition of contaminant sources and aquifer parameters at DNAPL-contaminated sites

Abstract

In this study, we develop a parallel heuristic search strategy based on a Bayesian approach for simultaneously recognizing groundwater contaminant sources and aquifer parameters (the unknown variables) at sites contaminated with dense non-aqueous phase liquids (DNAPLs). The parallel search strategy is time-consuming because thousands of simulation model runs are needed to calculate the likelihood. Various stand-alone surrogate systems for the simulation model have been established, but each has unavoidable limitations. We therefore develop an optimal combined surrogate system that combines Gaussian process, kernel extreme learning machine, and support vector regression methods using a differential evolution algorithm with a variable mutation rate based on the rand-to-best/1/bin strategy. This improves the accuracy with which the surrogate system approximates the simulation model and significantly decreases the computational cost: using the optimal combined surrogate system reduced the CPU time by a factor of more than 400. In the iterative parallel heuristic search process, each iteration involves determining the candidate points and the state transitions. The Monte Carlo approach is widely used for selecting candidate points, but because its search ergodicity is weak, it does not readily converge to the posterior distribution of the unknown variables when their probability density functions are complex. To improve the search ergodicity, we develop a particle swarm optimization algorithm with a non-linearly decreasing inertia weight and the Metropolis criterion, which is more suitable for unknown variables with complex probability density functions. The recognition results for all unknown variables are obtained simultaneously when the iterative process terminates. We assess the proposed approaches in a hypothetical case study at a three-dimensional site contaminated with DNAPLs. The results demonstrate that the parallel heuristic search strategy is helpful for the simultaneous recognition of DNAPL contaminant sources in groundwater and aquifer parameters.
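
As a rough illustration of the candidate-point step described above, the following minimal Python sketch combines a particle swarm update with a non-linearly decreasing inertia weight and a Metropolis acceptance test. It is not the authors' implementation. The quadratic weight schedule, the acceleration coefficients of 2.0, the one-dimensional search space, and all function names (`inertia_weight`, `metropolis_accept`, `pso_metropolis`) are assumptions made for illustration, and `log_post` stands in for a log-posterior that in the study would be evaluated through the surrogate system.

```python
import math
import random


def inertia_weight(t, t_max, w_max=0.9, w_min=0.4):
    """Non-linearly decreasing inertia weight (assumed quadratic schedule)."""
    return w_max - (w_max - w_min) * (t / t_max) ** 2


def metropolis_accept(log_post_old, log_post_new, rng):
    """Metropolis criterion: always accept uphill moves, and accept
    downhill moves with probability equal to the posterior ratio."""
    if log_post_new >= log_post_old:
        return True
    return rng.random() < math.exp(log_post_new - log_post_old)


def pso_metropolis(log_post, lower, upper, n_particles=30, n_iter=200, seed=0):
    """Toy 1-D search: PSO proposes candidates, Metropolis decides transitions."""
    rng = random.Random(seed)
    x = [rng.uniform(lower, upper) for _ in range(n_particles)]
    v = [0.0] * n_particles
    pbest = list(x)                      # best position seen by each particle
    gbest = max(pbest, key=log_post)     # best position seen by the swarm
    for t in range(n_iter):
        w = inertia_weight(t, n_iter)
        for i in range(n_particles):
            # Velocity update with assumed acceleration coefficients of 2.0.
            v[i] = (w * v[i]
                    + 2.0 * rng.random() * (pbest[i] - x[i])
                    + 2.0 * rng.random() * (gbest - x[i]))
            # Candidate point, clamped to the prior bounds.
            cand = min(max(x[i] + v[i], lower), upper)
            # State transition governed by the Metropolis criterion.
            if metropolis_accept(log_post(x[i]), log_post(cand), rng):
                x[i] = cand
            if log_post(x[i]) > log_post(pbest[i]):
                pbest[i] = x[i]
        gbest = max(pbest, key=log_post)
    return x, gbest


if __name__ == "__main__":
    # Hypothetical log-posterior: a Gaussian centred at 2.0 (illustration only).
    def toy(z):
        return -0.5 * ((z - 2.0) / 0.5) ** 2

    particles, best = pso_metropolis(toy, -10.0, 10.0)
    print("best point found:", best)
```

Under the assumed schedule the weight stays close to its maximum early on, which favours exploration, and drops faster toward the end, which favours refinement. The Metropolis test lets particles occasionally move to less probable points, which is one way to keep the search from collapsing onto a single mode.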

Data availability

All data, models, and code generated or used during the study are available from the corresponding author by request.

Funding

This research was supported by the National Key Research and Development Program of China (No. 2018YFC1800405) and the National Natural Science Foundation of China (No. 41972252).

Author information

Corresponding author

Correspondence to Wenxi Lu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Responsible editor: Marcus Schulz

About this article

Cite this article

Lu, W., Wang, H. & Li, J. Parallel heuristic search strategy based on a Bayesian approach for simultaneous recognition of contaminant sources and aquifer parameters at DNAPL-contaminated sites. Environ Sci Pollut Res (2020). https://doi.org/10.1007/s11356-020-09382-z

Keywords

  • Bayesian approach
  • Dense non-aqueous phase liquid
  • Optimal combined surrogate system
  • Parallel heuristic search
  • Simultaneous recognition