An ensemble method to improve prediction of earthquake-induced soil liquefaction: a multi-dataset study


Evaluating earthquake-induced liquefaction potential is crucial in the design phase of construction projects. Although several machine learning models achieve good prediction accuracy on their own datasets, they may not perform well on other liquefaction datasets. To address this issue, we propose a novel hybrid classifier ensemble that improves generalizability by combining the predictions of seven base classifiers through weighted voting. The base classifiers are a back-propagation neural network, support vector machine, decision tree, k-nearest neighbours, logistic regression, multiple linear regression and naïve Bayes. The hyperparameters and weights of the base classifiers were tuned using a genetic algorithm. To verify the robustness of the classifier ensemble, its performance was tested on three datasets collected from previously published studies. The results show that the proposed classifier ensemble outperforms the base classifiers on all three datasets across a variety of performance metrics, including accuracy, Kappa, precision, recall, F1 score, AUC and the ROC curve. In addition, the classifier ensemble was used to rank the importance of the influencing variables on the three datasets, which can guide future data collection. This robust ensemble method can be extended to other classification problems in civil engineering.
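The weighted voting step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `weighted_vote`, the three hypothetical classifiers, their votes and the hand-picked weights are all assumptions made for illustration. In the study, the weights are tuned by a genetic algorithm rather than fixed by hand.

```python
from typing import List, Sequence


def weighted_vote(predictions: Sequence[Sequence[int]],
                  weights: Sequence[float]) -> List[int]:
    """Combine binary (0/1) labels from several classifiers by weighted voting.

    predictions[i][j] is the label given by classifier i to sample j,
    where 1 stands for "liquefaction" and 0 for "no liquefaction".
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight per classifier is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_samples = len(predictions[0])
    combined = []
    for j in range(n_samples):
        # Weighted share of the classifiers that vote for class 1.
        score = sum(w * p[j] for w, p in zip(weights, predictions)) / total
        combined.append(1 if score >= 0.5 else 0)
    return combined


# Hypothetical votes from three classifiers on four sites (illustrative only).
votes = [
    [1, 0, 1, 0],  # classifier A
    [1, 1, 0, 0],  # classifier B
    [0, 1, 0, 0],  # classifier C
]
weights = [0.6, 0.25, 0.15]  # assumed values, not taken from the paper

print(weighted_vote(votes, weights))  # [1, 0, 1, 0]
```

On the second and third sites the heavily weighted classifier A overrules the other two, which is the behaviour that separates weighted voting from simple majority voting.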





The first author is supported by the China Scholarship Council (Grant Number: 201706460008).

Author information



Corresponding author

Correspondence to Junfei Zhang.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this article.


About this article


Cite this article

Zhang, J., Wang, Y. An ensemble method to improve prediction of earthquake-induced soil liquefaction: a multi-dataset study. Neural Comput & Applic (2020).



  • Soil liquefaction
  • Earthquake
  • Machine learning
  • Classifier ensemble
  • Genetic algorithm
  • Prediction