
A New Method Based on Fuzzy C-Means Algorithm for Search Results Clustering

  • Fei Wang
  • Yueming Lu
  • Fangwei Zhang
  • Songlin Sun
Part of the Communications in Computer and Information Science book series (CCIS, volume 320)

Abstract

The standard fuzzy c-means (FCM) clustering algorithm can cluster web document samples only when the number of clusters c is known in advance, which is rarely possible in practice. This paper proposes a new FCM-based method for search results clustering. The method combines FCM with the Affinity Propagation (AP) algorithm to find the optimal c for a set of search results. The new method is shown to achieve better accuracy than the traditional method in search results clustering.
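The abstract does not say how FCM and AP are combined, so the following is only a minimal sketch of one plausible reading: run AP first, take the number of exemplars it finds as c, then run FCM with that c. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the negative squared Euclidean similarity, the median preference, the damping factor, and the fuzzifier m = 2. Real search results would first have to be turned into feature vectors (for example term weights), which is left out here.

```python
import numpy as np

def affinity_propagation(S, damping=0.5, max_iter=200):
    """Plain AP message passing (Frey and Dueck, 2007) on a similarity matrix S.
    The diagonal of S holds the preferences. Returns the exemplar indices."""
    n = S.shape[0]
    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)
    idx = np.arange(n)
    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        AS = A + S
        first = AS.argmax(axis=1)
        first_val = AS[idx, first]
        AS[idx, first] = -np.inf
        second_val = AS.max(axis=1)
        R_new = S - first_val[:, None]
        R_new[idx, first] = S[idx, first] - second_val
        R = damping * R + (1 - damping) * R_new
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k) for i' not in {i,k})
        # a(k,k) = sum of positive r(i',k) for i' != k
        Rp = np.maximum(R, 0)
        Rp[idx, idx] = R[idx, idx]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[idx, idx].copy()
        A_new = np.minimum(A_new, 0)
        A_new[idx, idx] = diag
        A = damping * A + (1 - damping) * A_new
    return np.flatnonzero((A + R).diagonal() > 0)

def fuzzy_c_means(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Standard FCM: alternate the center and membership updates."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)  # each row of memberships sums to 1
    for _ in range(max_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # avoid division by zero at a center
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def cluster_with_estimated_c(X):
    """Hypothetical combination: AP estimates c, FCM does the clustering."""
    S = -((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    off_diagonal = S[~np.eye(len(X), dtype=bool)]
    np.fill_diagonal(S, np.median(off_diagonal))  # assumed preference choice
    exemplars = affinity_propagation(S)
    c = max(1, len(exemplars))  # fall back to one cluster if AP finds none
    centers, U = fuzzy_c_means(X, c)
    return c, centers, U
```

Calling `cluster_with_estimated_c(X)` on an array of shape (n_samples, n_features) returns the estimated c, the cluster centers, and the membership matrix. The number of exemplars AP finds depends strongly on the preference value, so the median used here is a starting point to tune rather than a recommendation.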

Keywords

clustering algorithm; search engine; results clustering; FCM; similarity measure



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Fei Wang (1, 2)
  • Yueming Lu (1, 2)
  • Fangwei Zhang (3)
  • Songlin Sun (1, 2)
  1. School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing, China
  2. Key Laboratory of Trustworthy Distributed Computing and Service (BUPT), Ministry of Education, Beijing, China
  3. School of Humanities, Beijing University of Posts and Telecommunications, Beijing, China
