An integrated approach of machine algorithms with multi-objective optimization in performance analysis of event detection

Abstract

Providing a safe water distribution system has become a major societal concern. Various models and algorithms have been developed for incorporation into early warning systems. This study applies machine learning (ML) algorithms to different contaminated datasets. Fine tree (FT) and linear support vector machine (LSVM) classifiers were chosen to classify the events. To select the best combination of event and nonevent data, the nondominated sorting genetic algorithm-II (NSGA-II) is integrated with the classifiers to obtain an optimal solution that minimizes both the false positive rate (FPR) and the false negative rate (FNR). Results suggest that both FT and LSVM minimized FPR and FNR effectively. However, FT performed better than LSVM on a supervised, laboratory-controlled dataset, and it proved more robust than LSVM and fuzziness-based methods under different uncertainty scenarios of the study datasets. Moreover, the study introduced a novel approach by applying FT and LSVM models to classify contamination events in a combination of two datasets of various contaminants, which produced better results than the Pearson correlation–Euclidean distance (PE) method applied to the same dataset. In addition, the ML algorithms consistently detected most of the simulated events across different ranges of spikes.
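The two objectives named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the label convention (1 = event, 0 = nonevent), and the candidate values are assumptions made for illustration. The sketch computes FPR and FNR from labeled predictions and then keeps only the nondominated (FPR, FNR) pairs, which is the ranking idea at the core of NSGA-II rather than the full genetic algorithm.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (FPR, FNR), both to be minimized


def error_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Point:
    """Return (FPR, FNR) for binary labels; assumed convention: 1 = event, 0 = nonevent."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


def dominates(a: Point, b: Point) -> bool:
    """a dominates b if it is no worse in both objectives and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated_front(points: Sequence[Point]) -> List[Point]:
    """Keep the points that no other point dominates (the first Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical labels and predictions, for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 0, 1]
rates = error_rates(y_true, y_pred)

# Hypothetical (FPR, FNR) pairs from different event/nonevent data combinations.
candidates = [(0.10, 0.30), (0.20, 0.20), (0.25, 0.25), (0.30, 0.05), (0.40, 0.10)]
front = nondominated_front(candidates)
```

Each pair on the resulting front represents a different trade-off between false alarms and missed events; choosing one of them is a separate decision that this sketch does not cover.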

Acknowledgements

The authors thank the editors and anonymous referees for their constructive comments and suggestions, which improved the quality of an earlier version of the manuscript. The authors gratefully acknowledge the Department of Civil Engineering, Leading University, Sylhet, Bangladesh, for providing the opportunity to conduct the research work. Special thanks are due to the faculty members and reviewers for their valuable suggestions.

Author information

Corresponding author

Correspondence to Shabbir Ahmed Osmani.

Ethics declarations

Conflict of interest

The authors do not have any conflicts of interest in terms of financial or personal involvement that may influence the statements expressed in this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Osmani, S.A., Mahmud, F. An integrated approach of machine algorithms with multi-objective optimization in performance analysis of event detection. Environ Dev Sustain 23, 1976–1993 (2021). https://doi.org/10.1007/s10668-020-00659-4

Keywords

  • Machine learning algorithms
  • Contaminant detection
  • False positive rate
  • False negative rate
  • Water quality
  • NSGA-II
  • Coefficients of variations
  • Monte Carlo simulation