Enhancing the accuracy of rainfall-induced landslide prediction along mountain roads with a GIS-based random forest classifier

  • Viet-Hung Dang
  • Tien Bui Dieu
  • Xuan-Linh Tran
  • Nhat-Duc Hoang
Original Paper

Abstract

Along mountain roads, rainfall-triggered landslides are typical disasters that cause significant human casualties. To establish effective mitigation measures, government agencies and practicing land-use planners need to be able to evaluate landslides accurately. Here, we propose a machine learning methodology for the spatial prediction of rainfall-induced landslides along mountain roads, based on a random forest classifier (RFC) and a GIS-based dataset. The RFC is used as a supervised learning technique to generalize the classification boundary that separates the input information of ten landslide conditioning factors (slope, aspect, relief amplitude, toposhape, topographic wetness index, distance to roads, distance to rivers, lithology, distance to faults, and rainfall) into two distinct class labels: ‘landslide’ and ‘non-landslide’. Experimental results from a cross-validation process and a sensitivity analysis of the RFC model parameters show that the proposed model achieves superior prediction accuracy, with an area under the curve (AUC) of 0.92. The RFC significantly outperforms the benchmark methods, which include discriminant analysis, logistic regression, artificial neural networks, relevance vector machines, and support vector machines. Based on our experimental outcome and comparative analysis, we strongly recommend the RFC as a very capable tool for spatial modeling of rainfall-induced landslides.
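As an illustration of the reported metric, the following is a minimal sketch of how an area under the curve (AUC) score can be computed for a two-class ‘landslide’ / ‘non-landslide’ predictor, assuming the curve in question is the receiver operating characteristic (ROC) curve. It uses the pairwise-ranking identity for the ROC AUC rather than any particular toolbox; the function name `roc_auc` and the sample labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """ROC AUC via the pairwise-ranking (Mann-Whitney U) identity.

    labels: 1 for 'landslide', 0 for 'non-landslide'.
    scores: predicted landslide score per sample (higher = more likely).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")

    # Count the (landslide, non-landslide) pairs in which the landslide
    # sample receives the higher score; a tie counts as half a win.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six map cells (illustrative values only).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 means every landslide sample is ranked above every non-landslide sample, while 0.5 corresponds to random ranking. In a cross-validation setting, a score of this kind would typically be computed on each held-out fold and then averaged.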

Keywords

Landslide prediction; Mountain road; Random forest classifier; Machine learning; Geographic information system

Notes

Acknowledgements

Data for this research are from the project 71 /GV-VKHĐCKS with the title “Combination of Structural Geology, Remote Sensing, and GIS for the Study of Current Status and Prediction of Flash Floods and Landslides at the National Road No.32 Section from the Yen Bai to the Lai Chau Provinces”, Vietnam Institute of Geosciences and Mineral Resources. We would like to thank Dr. Ho Tien Chung for providing the data for this research.

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • Viet-Hung Dang (1)
  • Tien Bui Dieu (2)
  • Xuan-Linh Tran (3)
  • Nhat-Duc Hoang (4)

  1. Faculty of Information Technology, Institute of Research and Development, Duy Tan University, Da Nang, Vietnam
  2. Geographic Information System Group, Department of Business and IT, School of Business, University College of Southeast Norway, Bø i Telemark, Norway
  3. Institute of Research and Development, Duy Tan University, Faculty of Civil Engineering, Duy Tan University, Da Nang, Vietnam
  4. Faculty of Civil Engineering, Institute of Research and Development, Duy Tan University, Da Nang, Vietnam
