Abstract
Supplying adequate water and sustaining the supplies that support human life, particularly in rapidly urbanizing communities, is of paramount importance to urban development worldwide. Protecting the quality of water resources, and avoiding permanent damage from environmental pollution and unsustainable off-take from sources such as rivers and aquifers, should be considered as important as the quantity supplied. In this study, random forest (RF) and M5 model tree (M5) models were used to predict the biochemical oxygen demand (BOD) of water. The input variables were first decomposed by wavelet transform; feature selection (FS) algorithms, namely relief (RA), correlation (CA), principal component analysis (PCA), and ant colony optimization (ACO), were then used to identify the important components, which were fed into the RF and M5 models. The approach was applied to monthly data from the Ahvaz station on the Karun River from 2006 to 2018. The RF model predicted BOD better (R = 0.872, MAE = 0.0312, RMSE = 0.0332) than the M5 model (R = 0.751, MAE = 0.0377, RMSE = 0.0468). In addition, comparison of the RF model with the hybrid models showed that the proposed hybrid models are viable options for improving the accuracy of BOD prediction. Among the hybrid models, the WRF-PCA model performed best, with R = 0.927, MAE = 0.0198, and RMSE = 0.0241.
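The three scores reported above can be illustrated with a minimal sketch. It assumes that R denotes the Pearson correlation coefficient between observed and predicted BOD, that MAE is the mean absolute error, and that RMSE is the root mean square error; the abstract does not give the formulas, so these standard definitions are an assumption. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted values (assumed meaning of R)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    spread_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in observed))
    spread_pred = math.sqrt(sum((p - mean_pred) ** 2 for p in predicted))
    return cov / (spread_obs * spread_pred)


def mae(observed, predicted):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error: penalizes large errors more strongly than MAE."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


# Hypothetical values for illustration only, not measurements from the Karun River.
observed_bod = [2.1, 2.4, 3.0, 2.8, 3.5]
predicted_bod = [2.0, 2.5, 2.9, 3.0, 3.3]

print("R    =", round(pearson_r(observed_bod, predicted_bod), 3))
print("MAE  =", round(mae(observed_bod, predicted_bod), 3))
print("RMSE =", round(rmse(observed_bod, predicted_bod), 3))
```

Under these definitions a better model has R closer to 1 and smaller MAE and RMSE, which is the sense in which the WRF-PCA scores improve on those of the RF and M5 models.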
Additional information
Responsible Editor: Marcus Schulz
Cite this article
Golabi, M.R., Farzi, S., Khodabakhshi, F. et al. Biochemical oxygen demand prediction: development of hybrid wavelet-random forest and M5 model tree approach using feature selection algorithms. Environ Sci Pollut Res 27, 34322–34336 (2020). https://doi.org/10.1007/s11356-020-09457-x
DOI: https://doi.org/10.1007/s11356-020-09457-x