Irrigation Science, Volume 37, Issue 1, pp 11–23

Combining imaging techniques with nonparametric modelling to predict seepage hotspots in irrigation channels

  • S. Akbar
  • A. Kathuria
  • B. Maheshwari (corresponding author)
Original Paper


Using the Murrumbidgee Irrigation Area, Australia, as a case study, we present an integrated approach for identifying seepage hotspots and predicting seepage losses from open channels. The approach is particularly important for guiding investments that improve irrigation conveyance efficiency, thus enabling sustainable agricultural water use. A qualitative assessment is used to capture seepage hotspots with electromagnetic inductance (EM31) imaging techniques, followed by actual seepage measurements. Based on data from major irrigation systems in the southern Murrumbidgee Irrigation Area, a case is made for a cost-effective methodology to locate seepage hotspots and quantify seepage losses in channels. In particular, a predictive model was developed from EM31 survey data and directly measured channel seepage data. The main inputs to the model were EM values, soil type, water depth in the channel, wetted perimeter of the channel and whether water was flowing in the channel. The output of the model was a seepage loss value for the channel. Three modelling techniques were considered, the Generalised Linear Mixed (GLM) model, the Random Forest (RF) model and the Generalized Boosted Regression Model (GBM), and the best-performing model for seepage prediction was identified. The RF model was found to be the most reliable, explaining 60% of the variability in the data with the lowest mean absolute error. The study indicated that the RF model can be used to locate seepage hotspots in channels and determine the magnitude of seepage losses.
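The comparison described above ranks candidate models by mean absolute error and by the share of variability they explain. The following is a minimal sketch of how such a ranking could be computed, assuming that "variability explained" refers to the coefficient of determination (R²). The function names, the model labels `model_a` and `model_b`, and all numbers are hypothetical illustrations, not data or code from the study.

```python
from statistics import mean


def mean_absolute_error(observed, predicted):
    """Average absolute difference between measured and predicted seepage."""
    return mean(abs(o - p) for o, p in zip(observed, predicted))


def variance_explained(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rank_models(observed, predictions_by_model):
    """Return (name, MAE, R2) tuples sorted by ascending MAE."""
    scores = [
        (name,
         mean_absolute_error(observed, pred),
         variance_explained(observed, pred))
        for name, pred in predictions_by_model.items()
    ]
    return sorted(scores, key=lambda score: score[1])


# Hypothetical hold-out seepage values (illustrative only, not from the study).
observed = [10.0, 20.0, 30.0, 40.0]
predictions = {
    "model_a": [12.0, 18.0, 33.0, 37.0],
    "model_b": [20.0, 20.0, 30.0, 30.0],
}

ranking = rank_models(observed, predictions)
best_name = ranking[0][0]

for name, mae, r2 in ranking:
    print(f"{name}: MAE={mae:.2f}, R2={r2:.2f}")
```

In practice the predictions would come from models fitted on the survey inputs listed above and evaluated on data held out from fitting, so that the ranking reflects predictive rather than in-sample performance.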



The author acknowledges technical contributions from a number of his colleagues, including Professor Shahbaz Khan, Dr. Akhtar Abbas and Dr. Mohsin Hafeez. Data from the NSW Department of Primary Industries, Murrumbidgee Irrigation Limited and Coleambally Irrigation Co-operative Limited are acknowledged.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Primary Industries, Parramatta, Australia
  2. School of Science and Health, Western Sydney University, Penrith, Australia
