The CloudMiner

Moving Data Mining into Computational Clouds

Grid and Cloud Database Management

Abstract

Business, scientific and engineering experiments, medical studies, and governments generate huge amounts of information. The problem is how to extract knowledge from all this information. Data mining provides the means for at least a partial solution to this problem. However, it would be too expensive for every one of these areas of human activity, and for every company, to design its own data mining solution, develop the software, and deploy it on private infrastructure. This chapter presents the CloudMiner, which offers a cloud of data mining services (Software as a Service) running on a cloud service provider's infrastructure. The architecture of the CloudMiner is shown and its main components are discussed: the MiningCloud, which contains all published data mining services; the BrokerCloud, to which mining service providers publish their services; the DataCloud, which contains the collected data; and the Access Point, which allows users to access the Service Broker to discover mining services and supports the selection and invocation of those services. The chapter finishes with a short presentation of two use cases.
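The publish, discover, select, and invoke flow named in the abstract can be illustrated with a minimal in-memory sketch. This is not the CloudMiner implementation: every class, method, and attribute name below (MiningService, ServiceBroker, AccessPoint, the cost attribute, the cheapest-service selection rule) is a hypothetical stand-in chosen only to show how the four roles might relate.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MiningService:
    """A published mining service (hypothetical shape, not the chapter's API)."""
    name: str
    task: str                              # e.g. "summary", "classification"
    cost: float                            # illustrative attribute used for selection
    run: Callable[[List[float]], object]   # the mining operation itself


@dataclass
class ServiceBroker:
    """Toy stand-in for the broker role: providers publish, users discover."""
    registry: Dict[str, MiningService] = field(default_factory=dict)

    def publish(self, service: MiningService) -> None:
        self.registry[service.name] = service

    def discover(self, task: str) -> List[MiningService]:
        return [s for s in self.registry.values() if s.task == task]


@dataclass
class AccessPoint:
    """Toy stand-in for the user-facing entry point."""
    broker: ServiceBroker
    data_cloud: Dict[str, List[float]]     # dataset name -> stored data

    def mine(self, task: str, dataset: str):
        candidates = self.broker.discover(task)
        if not candidates:
            raise LookupError(f"no service published for task {task!r}")
        # Selection rule is an assumption: pick the cheapest matching service.
        chosen = min(candidates, key=lambda s: s.cost)
        # Invocation: run the chosen service on data held in the data store.
        return chosen.name, chosen.run(self.data_cloud[dataset])


broker = ServiceBroker()
broker.publish(MiningService("mean-v1", "summary", 2.0, lambda xs: sum(xs) / len(xs)))
broker.publish(MiningService("mean-v2", "summary", 1.0, lambda xs: sum(xs) / len(xs)))

access = AccessPoint(broker, {"readings": [1.0, 2.0, 3.0]})
result = access.mine("summary", "readings")
```

In this sketch the user only talks to the access point; which service runs is decided by attributes recorded at publication time, which is the separation of concerns the abstract describes.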

Corresponding author

Correspondence to Andrzej Goscinski.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Goscinski, A., Janciak, I., Han, Y., Brezany, P. (2011). The CloudMiner. In: Fiore, S., Aloisio, G. (eds) Grid and Cloud Database Management. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20045-8_10

  • DOI: https://doi.org/10.1007/978-3-642-20045-8_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20044-1

  • Online ISBN: 978-3-642-20045-8

  • eBook Packages: Computer Science (R0)
