Predictive Analysis of Lake Water Quality Using an Evolutionary Algorithm

  • Mrunalini Jadhav
  • Kanchan Khare (corresponding author)
  • Sayali Apte
  • Rushikesh Kulkarni
Part of the Algorithms for Intelligent Systems book series (AIS)


Lakes hold considerable volumes of water and can therefore be a promising water resource in areas of acute water shortage. Throughout the world, the water quality of lakes, natural or human-made, has been deteriorating because of urban, agricultural, industrial and other impacts. Widespread eutrophication of lakes leads to the overgrowth of plants and algae; bacterial degradation of their biomass consumes oxygen from the water, resulting in hypoxia [1]. Monitoring the water quality of reservoirs is essential to both the exploitation and the conservation of aquatic resources, and continuous monitoring combined with water quality prediction will help to conserve lakes. The authors present an application of an evolutionary algorithm to the predictive analysis of the water quality of reservoirs and lakes. We have used genetic programming (GP), a nature-inspired technique that evolves a population of candidate programs towards the best individual. GP can discover a functional relationship between features in data (symbolic regression) and group data into categories (classification). The present study uses monthly water quality data observed by the Maharashtra Water Resources Department, Hydrological Data Users Group (HDUG). A case study of the Gangapur reservoir (an artificial lake) in Nashik district, Maharashtra, India, demonstrates the strengths of the evolutionary algorithm. We have developed cause-effect models for biochemical oxygen demand (BOD), chemical oxygen demand (COD) and faecal coliform bacteria (F-col). Where observed data are scarce, hybrid cause-effect models will be useful.
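To make the idea of symbolic regression concrete, the following is a minimal sketch of a tree-based GP loop, not the authors' implementation. It assumes tournament selection, elitism and subtree mutation only (no crossover), and it runs on synthetic data generated from a known formula rather than on the Gangapur measurements. All function names and parameter values are illustrative.

```python
import operator
import random


def protected_div(a, b):
    """Division that returns 1.0 when the divisor is (close to) zero or NaN."""
    return a / b if abs(b) > 1e-9 else 1.0


FUNCTIONS = {"add": operator.add, "sub": operator.sub,
             "mul": operator.mul, "div": protected_div}


def random_tree(rng, n_features, depth):
    """Grow a random expression: ("x", i), ("const", c) or (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return ("x", rng.randrange(n_features))
        return ("const", rng.uniform(-1.0, 1.0))
    op = rng.choice(sorted(FUNCTIONS))
    return (op, random_tree(rng, n_features, depth - 1),
            random_tree(rng, n_features, depth - 1))


def evaluate(tree, row):
    """Evaluate an expression tree on one row of input features."""
    kind = tree[0]
    if kind == "x":
        return row[tree[1]]
    if kind == "const":
        return tree[1]
    return FUNCTIONS[kind](evaluate(tree[1], row), evaluate(tree[2], row))


def mse(tree, X, y):
    """Fitness: mean squared error; numerically broken trees score infinity."""
    try:
        err = sum((evaluate(tree, r) - t) ** 2 for r, t in zip(X, y)) / len(y)
    except OverflowError:
        return float("inf")
    return err if err == err else float("inf")  # NaN check


def mutate(tree, rng, n_features):
    """Subtree mutation: replace one randomly chosen node with a new subtree."""
    if tree[0] in ("x", "const") or rng.random() < 0.3:
        return random_tree(rng, n_features, 2)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, rng, n_features), right)
    return (op, left, mutate(right, rng, n_features))


def evolve(X, y, pop_size=40, generations=20, seed=0):
    """Return the best tree and the best-of-generation error history."""
    rng = random.Random(seed)
    n_features = len(X[0])
    population = [random_tree(rng, n_features, 3) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = min(population, key=lambda t: mse(t, X, y))
        history.append(mse(elite, X, y))
        next_pop = [elite]  # elitism: the best individual always survives
        while len(next_pop) < pop_size:
            # Tournament of size 3 picks the parent to mutate.
            parent = min(rng.sample(population, 3), key=lambda t: mse(t, X, y))
            next_pop.append(mutate(parent, rng, n_features))
        population = next_pop
    best = min(population, key=lambda t: mse(t, X, y))
    history.append(mse(best, X, y))
    return best, history


# Synthetic example: the hidden relationship is y = x0 * x1 + x0.
data_rng = random.Random(1)
X = [[data_rng.uniform(0, 2), data_rng.uniform(0, 2)] for _ in range(40)]
y = [a * b + a for a, b in X]
best, history = evolve(X, y)
print("best expression:", best)
print("training MSE:", history[-1])
```

Because the best individual is copied unchanged into each new generation, the best training error can never get worse from one generation to the next. In a cause-effect setting, the input columns would be measured water quality parameters and the target would be BOD, COD or F-col.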
We have used spectral analysis to eliminate the problem of time lag in the hybrid models. Cross-validation was carried out to guard against overfitting of the models. Performance was evaluated by classification accuracy, mean absolute error and mean squared error. The evolved hybrid cause-effect model with spectral analysis is particularly useful for small data sets.
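The evaluation measures named above, together with k-fold cross-validation, can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure. The straight-line model is a stand-in for an evolved GP model, the fold layout (contiguous blocks) is one possible choice for monthly series, and the data and class labels are made up.

```python
def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def classification_accuracy(labels_true, labels_pred):
    """Fraction of samples whose predicted class matches the observed class."""
    hits = sum(t == p for t, p in zip(labels_true, labels_pred))
    return hits / len(labels_true)


def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over range(n)."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def fit_line(x, y):
    """Least-squares fit of y = a * x + b (placeholder for a GP model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return a, my - a * mx


def cross_validate(x, y, k=4):
    """Return one (MAE, MSE) pair per held-out fold."""
    scores = []
    for train, test in k_fold_indices(len(x), k):
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        pred = [a * x[i] + b for i in test]
        true = [y[i] for i in test]
        scores.append((mean_absolute_error(true, pred),
                       mean_squared_error(true, pred)))
    return scores


# Made-up monthly series that follows y = 2x + 1 exactly.
x = [float(m) for m in range(12)]
y = [2.0 * m + 1.0 for m in x]
for fold, (mae, mse) in enumerate(cross_validate(x, y), start=1):
    print(f"fold {fold}: MAE={mae:.3g} MSE={mse:.3g}")
```

A model that scores much worse on the held-out folds than on the training folds is likely overfitted, which is the situation cross-validation is meant to expose.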


References

  1. Azhagesan R (1999) Water quality parameters and water quality standards for different uses. National Water Academy Report
  2. Banzhaf W, Nordin P, Keller RE, Francone FD (1998) Genetic Programming: an introduction, an automatic evolution of computer programs and its applications. Morgan Kaufmann Publishers, Inc., San Francisco, California
  3. Bartram J, Ballance R, World Health Organization & United Nations Environment Programme (1996) Water quality monitoring: a practical guide to the design and implementation of freshwater quality studies and monitoring programs. In: Bartram J, Ballance R (eds). E & FN Spon, London
  4. Brameier M (2004) On linear genetic programming. Ph.D. thesis, University of Dortmund
  5. Chavan A, Sharma MP, Bhargava R (2009) Water quality assessment of the Godavari river National conference on hydraulics. HydroNepal J Water Energy Environ 1:31–34
  6. Coppola E, Rana A, Poulton M, Szidarovszky F, Uhl V (2005) A neural network model for predicting aquifer water level elevations. Ground Water 43(2):231–243
  7. Dawson CW, Wilby RL (1999) Hydrological modelling using artificial neural networks. Prog Phys Geogr Earth Environ 25(01):80–108
  8. Dogan E, Koklu R, Sengorur B (2009) Modelling biological oxygen demand of the Melen River in Turkey using an artificial neural network technique. J Environ Manage 90(2):1229–1235
  9. Francone FD, Markus C, Banzhaf W, Nordin P (1999) Homologous crossover in genetic programming. Proc Genet Evol Comput Conf 2:1021–1026
  10. Guven A (2009) Linear genetic programming for time-series modelling of daily flow rate. J Earth Syst Sci 118(02):137–146
  11. Hitoshi I, Yoshihiko H, Topon KP (2009) Applied genetic programming and machine learning. CRC Press International Series on Computational Intelligence, Boca Raton
  12. Jadhav MS, Khare KC, Warke AS (2015) Water quality prediction of Gangapur reservoir (India) using LS-SVM and genetic programming. Lakes Reservoirs Res Manag 20(04):275–284
  13. Jadhav MS, Khare KC, Warke AS (2014) Selection of significant input parameters for water quality prediction-a comparative approach. Int J Res Advent Technol 2(03):81–90
  14. Khovanova NA, Shaikhina T, Mallick KK (2015) Neural networks for analysis of trabecular bone in osteoarthritis. Bioinspired, Biomimetic Nanobiomaterials 4(1):90–100
  15. Koza JR (1992) Genetic programming: on the programming of computers using natural selection. A Bradford book. MIT Press, Cambridge, Massachusetts, London, England
  16. Lebaron B, Weigend AS (1998) A bootstrap evaluation of the effect of data splitting on financial time series. IEEE Trans Neural Networks 213–220
  17. Lermontov A, Yokoyama L, Lermontov M, Machado MAS (2009) River quality analysis using fuzzy water quality index: Ribeira do Iguape river watershed, Brazil. Ecol Ind 9(6):1188–1197
  18. Londhe SN, Dixit PR (2012) Genetic programming—new approaches and successful applications. In: Soto SV (ed) 8/12. In Tech Publications
  19. Londhe S, Charhate S (2010) Comparison of data-driven modelling techniques for river flow forecasting. Hydrol Sci J 55(7):1163–1174
  20. Muttil N, Chau K (2007) Machine learning paradigms for selecting ecologically significant input variable. Eng Appl Artif Intell 20(06):735–744
  21. Muttil N, Chau K (2006) Neural network and genetic programming for modelling coastal algal blooms. Int J Environ Pollut 28(3–4):223–238
  22. Muttil N, Lee JHW (2005) Genetic programming for analysis and real-time prediction of coastal algal blooms. Ecol Model 189(03):363–376
  23. Muttil N, Lee JHW, Jayawardena AW (2004) Real-time prediction of coastal algal blooms using genetic programming. In: 6th international conference on hydro informatics. Singapore, pp 890–897
  24. Najah A, Elshafie A, Karim OA, Jaffar O (2009) Prediction of Johor river water quality parameters using artificial neural networks. Eur J Sci Res 28(3):422–435
  25. Nordin JP (1997) Evolutionary program induction of binary machine code and its application. Ph.D. dissertation, Department of Computer Science, University of Dortmund
  26. Palani S, Liong S-Y, Tkalich P (2008) An ANN application for water quality forecasting. Mar Pollut Bull 56:1586–1597
  27. Preis A, Ostfeld A (2008) A coupled model tree–genetic algorithm scheme for flow and water quality predictions in watersheds. J Hydrol 349:364–375
  28. Recknagel F, Cao H, Kim B, Takamura N, Welk A (2006) Unravelling and forecasting algal population dynamics in two lakes different in morphometry and eutrophication by neural and evolutionary computation. Ecol Inform 1(2):133–151
  29. Sawant R (2015) A comprehensive study of polluted river stretches and preparation of action plan of river Godavari from Nashik downstream to Paithan. The report, Aavanira Biotech P. Ltd., Maharashtra Pollution Control Board
  30. Shiklomanov I (1993) Water in crisis: a guide to the world’s freshwater resources. In: Gleick PH (ed). Oxford University Press, New York, pp 13–25
  31. Tikhe SS, Khare KC, Londhe SN (2015) Multicity seasonal air quality index forecasting using soft computing techniques. Adv Environ Res 4(02):83–104
  32. Shaikhina T, Khovanova NA (2017) Handling limited datasets with neural networks in medical applications: a small-data approach. Artif Intell Med 75:51–63
  33. US Environmental Protection Agency (2009) Technical assistant document for reporting of daily air quality—air quality index. Research Triangle Park, North Carolina
  34. Wang W, Chau K, Xu D, Chen X (2015) Improving forecasting accuracy of annual runoff time series using ARIMA based on EEMD decomposition. Water Resour Manage 29(08):2655–2675
  35.
  36. Whigham PA, Recknagel F (1999) Predictive modelling of plankton dynamics in freshwater lakes using genetic programming. The Information Science Discussion Paper Series. Department of Information Science, University of Otago, Dunedin, New Zealand, pp 1–7
  37. Wu CL, Chau KW, Li YS (2009) Methods to improve neural network performance in daily flows. J Hydrol 372(1–4):80–93
  38. Xiang Y, Jiang L (2009) Water quality prediction using LS-SVM and particle swarm optimization. In: Conference proceedings of the second international workshop on knowledge discovery and data mining, WKDD 2009, Moscow, Russia

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • Mrunalini Jadhav (1)
  • Kanchan Khare (2, corresponding author)
  • Sayali Apte (2)
  • Rushikesh Kulkarni (2)

  1. SVC Polytechnique, Pune, India
  2. Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, India
