A Visual Environment for Designing and Running Data Mining Workflows in the Knowledge Grid

  • Chapter
Data Mining: Foundations and Intelligent Paradigms

Part of the book series: Intelligent Systems Reference Library (ISRL, volume 24)

Abstract

Data mining tasks are often composed of multiple stages that may be linked to each other to form various execution flows. Moreover, data mining tasks are often distributed, since they involve data and tools located across geographically distributed environments such as the Grid. It is therefore fundamental to exploit effective formalisms, such as workflows, to model data mining tasks that are both multi-staged and distributed. The goal of this work is to define a workflow formalism and to provide a visual software environment, named DIS3GNO, for designing and executing distributed data mining tasks over the Knowledge Grid, a service-oriented framework for distributed data mining on the Grid. DIS3GNO supports all the phases of a distributed data mining task, including composition, execution, and results visualization. The paper describes DIS3GNO, presents some relevant use cases implemented with it, and reports a performance evaluation of the system.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Cesario, E., Lackovic, M., Talia, D., Trunfio, P. (2012). A Visual Environment for Designing and Running Data Mining Workflows in the Knowledge Grid. In: Holmes, D.E., Jain, L.C. (eds) Data Mining: Foundations and Intelligent Paradigms. Intelligent Systems Reference Library, vol 24. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23241-1_4

  • DOI: https://doi.org/10.1007/978-3-642-23241-1_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23240-4

  • Online ISBN: 978-3-642-23241-1

  • eBook Packages: Engineering, Engineering (R0)
