One approach to tackling water resources management problems is the introduction of advanced Information and Communication Technologies (ICTs). In Integrated Water Management (IWM) and, more generally, in the environmental field, the power of these technologies has made large-scale data collection campaigns possible. The number of parameters that must be measured to monitor an ecosystem is potentially high, and systematic measurement of those parameters generates huge amounts of data that must be suitably interpreted and used. A pragmatic approach is needed to extract the best information, that is, knowledge, from all this data. Such volumes of data contain a great deal of hidden information in the form of models, patterns and trends, but this information is difficult to extract because the data vary in quantity and quality. As a consequence, semi-automatic knowledge extraction from data has gained great importance within the economic and scientific communities. Knowledge Discovery from Databases (KDD) has emerged as a framework within which a plethora of techniques for identifying useful and understandable patterns in data have flourished. Most of those techniques can be used successfully in the environmental field and, in particular, in IWM. In this paper we give an overview of what KDD is and mention some applications in these areas.
Keywords: Integrated Water Management, Data Mining, Environmental Databases
© 2008 Springer
About this paper
Cite this paper
Izquierdo, J., Díaz, J.L., Pérez, R., López, P.A., Mora, J.J. (2008). Knowledge Discovery in Environmental Data. In: Meire, P., Coenen, M., Lombardo, C., Robba, M., Sacile, R. (eds) Integrated Water Management. NATO Science Series, vol 80. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6552-1_5
DOI: https://doi.org/10.1007/978-1-4020-6552-1_5
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-6550-7
Online ISBN: 978-1-4020-6552-1
eBook Packages: Earth and Environmental Science (R0)