Multilevel Clustering on Very Large Scale of Web Data

  • Conference paper
Management Intelligent Systems

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 220)

Abstract

With the evolution of the WWW, the computing world has been flooded with data. The classical data mining approaches can still be used to search this data, but with diminished performance. In this paper, we present a new clustering approach based on the multilevel paradigm, called multilevel clustering, which reduces the computational complexity and execution time of data mining at very large scale. The developed algorithm has been implemented and run on three public benchmarks to test the effectiveness of the multilevel clustering approach. The numerical results have been compared to those of the simple k-means algorithm. As expected, multilevel clustering clearly outperforms basic k-means in both execution time and success rate, which reaches 100 % as the amount of data increases.
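The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general multilevel idea as it is commonly applied to k-means: coarsen the data, cluster the coarsest level, then refine level by level, warm-starting from the previous centroids. Coarsening by random subsampling, the halving ratio, and all names (`kmeans`, `multilevel_kmeans`, `levels`, `seed`) are assumptions made for illustration and are not taken from the paper.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd's k-means, started from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = assign(points, centroids)
    for _ in range(max_iter):
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(col) / len(members)
                                     for col in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid as is
        centroids = updated
        new_labels = assign(points, centroids)
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return centroids, labels


def multilevel_kmeans(points, k, levels=3, seed=0):
    """Hypothetical multilevel wrapper around k-means (illustrative only).

    Assumption: coarsening is random subsampling that halves the data at
    each level; the paper may well use a different coarsening scheme.
    Requires len(points) >= k.
    """
    rng = random.Random(seed)
    hierarchy = [list(points)]  # hierarchy[0] is the full data set
    for _ in range(levels - 1):
        prev = hierarchy[-1]
        size = max(k, len(prev) // 2)
        if size == len(prev):
            break
        hierarchy.append(rng.sample(prev, size))
    # Initial centroids come from the coarsest (smallest) level.
    centroids = rng.sample(hierarchy[-1], k)
    labels = []
    # Refinement: cluster each level from coarsest to finest, reusing
    # the centroids of the previous level as the starting point.
    for level in reversed(hierarchy):
        centroids, labels = kmeans(level, centroids)
    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    final_centroids, final_labels = multilevel_kmeans(data, k=2)
    print(final_centroids, final_labels)
```

Under these assumptions, most iterations run on the small coarse levels, so the full data set only needs a few refinement passes. Calling `multilevel_kmeans(data, k, levels=1)` degenerates to ordinary k-means with random initial centroids, which gives a simple baseline for comparison.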

Author information

Corresponding author

Correspondence to Amine Chemchem.

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Chemchem, A., Drias, H. (2013). Multilevel Clustering on Very Large Scale of Web Data. In: Casillas, J., Martínez-López, F., Vicari, R., De la Prieta, F. (eds) Management Intelligent Systems. Advances in Intelligent Systems and Computing, vol 220. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-00569-0_2

  • DOI: https://doi.org/10.1007/978-3-319-00569-0_2

  • Publisher Name: Springer, Heidelberg

  • Print ISBN: 978-3-319-00568-3

  • Online ISBN: 978-3-319-00569-0

  • eBook Packages: Engineering, Engineering (R0)
