Summary
Grid computing platforms provide middleware and services for coordinating the use of data and computational resources available throughout the network. Grids are used to implement a wide range of distributed applications and systems, including frameworks for distributed data mining and knowledge discovery. This chapter presents a framework we developed to support knowledge discovery workflows in Grid environments by executing data mining and computational intelligence algorithms on a set of Grid nodes. Our framework is an extension of Weka, an open-source toolkit for data mining and knowledge discovery, and uses Web Service technologies to access Grid resources and distribute the computation. We present the implementation of the framework and show, through example applications, how it supports the design of knowledge discovery workflows and their execution on a Grid.
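The idea of running independent workflow tasks on a set of nodes can be illustrated with a minimal sketch. This is not the chapter's implementation: it uses no Weka or Web Service APIs, local threads stand in for Grid nodes, and the node names, the `majority_class_task` function and the round-robin assignment are all assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical node names; a real deployment would locate Grid nodes
# through the middleware rather than a hard-coded list.
NODES = ["node-a", "node-b", "node-c"]


def majority_class_task(node, labels):
    """Stand-in for a remote mining task: return the most frequent label."""
    label, count = Counter(labels).most_common(1)[0]
    return {"node": node, "model": label, "support": count}


def run_workflow(partitions):
    """Submit one task per data partition, assigning nodes round-robin."""
    with ThreadPoolExecutor(max_workers=len(NODES)) as pool:
        futures = [
            pool.submit(majority_class_task, NODES[i % len(NODES)], part)
            for i, part in enumerate(partitions)
        ]
        # Collect results in submission order once each task finishes.
        return [f.result() for f in futures]


if __name__ == "__main__":
    partitions = [["yes", "no", "yes"], ["no", "no"], ["yes"], ["no", "yes", "no"]]
    for result in run_workflow(partitions):
        print(result)
```

In this sketch, each partition is processed independently, so the tasks can run concurrently. A Grid-based framework would replace the thread pool with remote service invocations and would also have to move the data to the nodes.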
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Lackovic, M., Talia, D., Trunfio, P. (2009). A Framework for Composing Knowledge Discovery Workflows in Grids. In: Abraham, A., Hassanien, AE., de Leon F. de Carvalho, A.P., Snášel, V. (eds) Foundations of Computational Intelligence Volume 6. Studies in Computational Intelligence, vol 206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01091-0_15
DOI: https://doi.org/10.1007/978-3-642-01091-0_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01090-3
Online ISBN: 978-3-642-01091-0
eBook Packages: Engineering, Engineering (R0)