Applying Bayesian Networks for Meteorological Data Mining

  • Conference paper
Applications and Innovations in Intelligent Systems XIII (SGAI 2005)

Abstract

Bayesian Networks (BNs) have recently been employed to solve meteorological problems. This paper describes the application of BNs to mining a real-world weather dataset. The dataset discriminates between “wet fog” instances and “other weather conditions” instances, and it contains many missing values. BNs were therefore employed not only to classify instances but also to fill in missing data. In addition, the Markov Blanket concept was employed to select relevant attributes. The efficacy of BNs in performing these tasks was assessed by means of several experiments. In summary, more convincing results were obtained by taking advantage of the fact that BNs can directly (i.e., without data preparation) classify instances containing missing values. In addition, the attributes selected by means of the Markov Blanket yield a simpler, faster, and equally accurate classifier.
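As a rough illustration of Markov Blanket attribute selection, the sketch below computes the blanket of a class node in a small directed acyclic graph: its parents, its children, and the other parents of its children. The network structure, the attribute names (`humidity`, `wind`, and so on), and the helper `markov_blanket` are hypothetical and chosen only for illustration; the paper's actual network and attributes are not given in this abstract.

```python
from typing import Dict, Set


def markov_blanket(parents: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return the Markov Blanket of `target` in a DAG.

    `parents` maps every node to the set of its parent nodes.
    The blanket is: parents of target, children of target,
    and the other parents of those children.
    """
    # Children are the nodes that list the target among their parents.
    children = {node for node, ps in parents.items() if target in ps}
    # Co-parents ("spouses") share at least one child with the target.
    spouses = {p for child in children for p in parents[child]}
    blanket = set(parents.get(target, set())) | children | spouses
    blanket.discard(target)  # the target is not part of its own blanket
    return blanket


# Hypothetical toy network (assumed for this example, not taken from the paper).
toy_parents = {
    "humidity": set(),
    "temperature": set(),
    "wind": set(),
    "pressure": set(),
    "wet_fog": {"humidity", "temperature"},
    "visibility": {"wet_fog", "wind"},
}

blanket = markov_blanket(toy_parents, "wet_fog")
print(sorted(blanket))
```

In this toy graph, `pressure` falls outside the blanket of `wet_fog`, so a classifier restricted to the blanket would ignore it. Given the blanket, the class node is conditionally independent of all remaining attributes, which is the property that makes this kind of attribute selection possible.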


Copyright information

© 2006 Springer-Verlag London Limited

About this paper

Cite this paper

Hruschka, E.R., Hruschka, E.R., Ebecken, N.F.F. (2006). Applying Bayesian Networks for Meteorological Data Mining. In: Macintosh, A., Ellis, R., Allen, T. (eds) Applications and Innovations in Intelligent Systems XIII. SGAI 2005. Springer, London. https://doi.org/10.1007/1-84628-224-1_10

  • DOI: https://doi.org/10.1007/1-84628-224-1_10

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84628-223-2

  • Online ISBN: 978-1-84628-224-9

  • eBook Packages: Computer Science, Computer Science (R0)
