From Parallel Data Mining to Grid-Enabled Distributed Knowledge Discovery

Cesario, Eugenio; Talia, Domenico

doi:10.1007/978-3-540-72530-5_3

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4482)

Abstract

Data mining is often a compute-intensive and time-consuming process. For this reason, several data mining systems have been implemented on parallel computing platforms to achieve high performance in the analysis of large data sets. Moreover, when large data repositories are coupled with the geographical distribution of data, users, and systems, more sophisticated technologies are needed to implement high-performance distributed knowledge discovery in databases (KDD) systems. Recently, computational Grids have emerged as privileged platforms for distributed computing, and a growing number of Grid-based KDD systems have been designed. In this paper, we first outline different ways to exploit parallelism in the main data mining techniques and algorithms; we then discuss Grid-based KDD systems.

Author information

Authors and Affiliations

ICAR-CNR, Italy
Eugenio Cesario & Domenico Talia
DEIS-University of Calabria, Italy
Domenico Talia

Editor information

Editors and Affiliations

Dept. of Computer Science and Engineering, York University, M3J 1P3, Toronto, Ontario, Canada
Aijun An
Institute of Computing Sciences, Poznań University of Technology, ul. Piotrowo 2, 60–965, Poznań, Poland
Jerzy Stefanowski
Department of Applied Computer Science, University of Winnipeg, R3B 2E9, Winnipeg, Manitoba, Canada
Sheela Ramanna
Department of Computer Science, University of Regina, S4S 0A2, Regina, Saskatchewan, Canada
Cory J. Butz
Department of Electrical and Computer Engineering, University of Alberta, T6G 2V4, Edmonton, Alberta, Canada
Witold Pedrycz
Institute of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 40065, Chongqing, P.R. China
Guoyin Wang

About this paper

Cite this paper

Cesario, E., Talia, D. (2007). From Parallel Data Mining to Grid-Enabled Distributed Knowledge Discovery. In: An, A., Stefanowski, J., Ramanna, S., Butz, C.J., Pedrycz, W., Wang, G. (eds) Rough Sets, Fuzzy Sets, Data Mining and Granular Computing. RSFDGrC 2007. Lecture Notes in Computer Science, vol 4482. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72530-5_3

DOI: https://doi.org/10.1007/978-3-540-72530-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72529-9
Online ISBN: 978-3-540-72530-5
eBook Packages: Computer Science, Computer Science (R0)
