Discovering Communicable Models from Earth Science Data

  • Chapter
Computational Discovery of Scientific Knowledge

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4660)

Abstract

This chapter describes how we used regression rules to improve upon results previously published in the Earth science literature. In such a scientific application of machine learning, it is crucially important for the learned models to be understandable and communicable. We recount how we selected a learning algorithm to maximize communicability, and then describe two visualization techniques that we developed to aid in understanding the model by exploiting the spatial nature of the data. We also report how evaluating the learned models across time let us discover an error in the data.

Editor information

Sašo Džeroski, Ljupčo Todorovski

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Schwabacher, M., Langley, P., Potter, C., Klooster, S., Torregrosa, A. (2007). Discovering Communicable Models from Earth Science Data. In: Džeroski, S., Todorovski, L. (eds) Computational Discovery of Scientific Knowledge. Lecture Notes in Computer Science, vol 4660. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73920-3_7

  • DOI: https://doi.org/10.1007/978-3-540-73920-3_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73919-7

  • Online ISBN: 978-3-540-73920-3

  • eBook Packages: Computer Science, Computer Science (R0)
