Association Rule Centric Clustering of Web Search Results

  • Hima Bindu Kommanti
  • Chillarige Raghavendra Rao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7080)

Abstract

Ambiguous queries return an overabundance of results, a problem that calls for soft computing strategies and that search results clustering can address. This paper presents a novel approach to web search results clustering based on association rules, using the Snowball technique. Association rule mining is applied to terms extracted from the title and snippet of each search result. The detailed algorithm and experimental results on data sets of ambiguous queries are presented.
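
As a rough illustration of mining association rules over terms from titles and snippets, the following Python sketch treats each search result as a transaction of terms and derives pairwise rules with support and confidence. It is not the authors' algorithm: the tokenizer, the stopword list, the thresholds, and the function names `terms` and `mine_rules` are all assumptions made for this example, and the step that turns rules into clusters (the Snowball technique) is not described in the abstract and is not modeled here.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "for", "to", "is", "on"}


def terms(text):
    """Return the set of lowercase alphabetic terms in text, minus stopwords."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def mine_rules(results, min_support=2, min_confidence=0.6):
    """Mine pairwise rules lhs -> rhs from (title, snippet) pairs.

    Support is the number of results containing both terms; confidence is
    that support divided by the number of results containing lhs.
    Returns a list of (lhs, rhs, support, confidence) tuples.
    """
    # One transaction (set of terms) per search result.
    transactions = [terms(title + " " + snippet) for title, snippet in results]
    item_counts = Counter(t for tx in transactions for t in tx)
    pair_counts = Counter(
        pair for tx in transactions for pair in combinations(sorted(tx), 2)
    )
    rules = []
    for (a, b), support in pair_counts.items():
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = support / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


if __name__ == "__main__":
    # Made-up results for the ambiguous query "jaguar" (illustrative data only).
    sample = [
        ("Jaguar car dealer", "Luxury car prices"),
        ("Jaguar big cat", "The jaguar is a big cat"),
        ("Big cat facts", "Jaguar habitat"),
    ]
    for lhs, rhs, support, confidence in sorted(mine_rules(sample)):
        print(f"{lhs} -> {rhs}  support={support}  confidence={confidence:.2f}")
```

In this sketch, results that satisfy the same high-confidence rule would be natural candidates for a shared cluster; a real system would likely also drop the query terms themselves, since they occur in nearly every result.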

Keywords

Search results clustering, web mining, association rules, Snowball technique


Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Hima Bindu Kommanti (1)
  • Chillarige Raghavendra Rao (1)
  1. Department of Computer and Information Sciences, University of Hyderabad, Hyderabad, India