Abstract
Water is a primary natural resource, and its quality is degraded by various anthropogenic activities. The deterioration of water bodies has prompted serious management efforts in many countries. Biochemical oxygen demand (BOD) is an important water quality parameter because it measures the amount of biodegradable organic matter in water. Testing for BOD is time-consuming: it takes 5 days from sample collection to analysis because the samples require lengthy incubation. In addition, interpolation of BOD results and its implications are subject to considerable uncertainty. There is therefore a need for a suitable secondary (indirect) method for predicting BOD. This paper proposes a model tree for predicting BOD in river water from a data mining perspective. The proposed model is compared with two other tree-based predictive methods, namely the decision stump and the regression tree. The predictive accuracy of the models is evaluated using two metrics: the correlation coefficient and the root mean square error (RMSE). Results show that the model tree has a correlation coefficient of 0.9397, which is higher than that of the other two methods. It also has the lowest RMSE of the three models, 0.5339.
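The two evaluation metrics named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the correlation coefficient is the Pearson correlation between observed and predicted values (the abstract does not give the exact definition), and the function names and BOD values below are hypothetical, invented only to show the arithmetic.

```python
import math
from statistics import mean


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))


def correlation(actual, predicted):
    """Pearson correlation between observed and predicted values (assumed definition)."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("inputs must have equal length and at least two values")
    mean_a, mean_p = mean(actual), mean(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_a * var_p)


# Hypothetical BOD values in mg/L, made up for illustration only.
observed_bod = [2.0, 3.0, 4.0, 5.0]
predicted_bod = [2.5, 2.5, 4.5, 4.5]

print(f"RMSE: {rmse(observed_bod, predicted_bod):.4f}")
print(f"Correlation: {correlation(observed_bod, predicted_bod):.4f}")
```

A better model under these metrics is one with a correlation coefficient closer to 1 and an RMSE closer to 0, which is the sense in which the abstract ranks the model tree above the decision stump and the regression tree.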
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Mangai, J.A., Gulyani, B.B. (2016). Induction of Model Trees for Predicting BOD in River Water: A Data Mining Perspective. In: Perner, P. (ed.) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2016. Lecture Notes in Computer Science, vol 9728. Springer, Cham. https://doi.org/10.1007/978-3-319-41561-1_1
DOI: https://doi.org/10.1007/978-3-319-41561-1_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41560-4
Online ISBN: 978-3-319-41561-1
eBook Packages: Computer Science, Computer Science (R0)