A Framework for Composing Knowledge Discovery Workflows in Grids

  • Chapter

Part of the book series: Studies in Computational Intelligence ((SCI,volume 206))

Summary

Grid computing platforms provide middleware and services for coordinating the use of data and computational resources available throughout the network. Grids are used to implement a wide range of distributed applications and systems, including frameworks for distributed data mining and knowledge discovery. This chapter presents a framework we developed to support the execution of knowledge discovery workflows in Grid environments by running data mining and computational intelligence algorithms on a set of Grid nodes. Our framework is an extension of Weka, an open-source toolkit for data mining and knowledge discovery, and uses Web Service technologies to access Grid resources and distribute the computation. We present the implementation of the framework and show, through several applications, how it supports the design of knowledge discovery workflows and their execution on a Grid.




Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Lackovic, M., Talia, D., Trunfio, P. (2009). A Framework for Composing Knowledge Discovery Workflows in Grids. In: Abraham, A., Hassanien, AE., de Leon F. de Carvalho, A.P., Snášel, V. (eds) Foundations of Computational Intelligence Volume 6. Studies in Computational Intelligence, vol 206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01091-0_15

  • DOI: https://doi.org/10.1007/978-3-642-01091-0_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01090-3

  • Online ISBN: 978-3-642-01091-0

  • eBook Packages: Engineering, Engineering (R0)
