Query Clustering in the Web Context

  • Chapter
Clustering and Information Retrieval

Part of the book series: Network Theory and Applications (NETA, volume 11)

Abstract

Query clustering is a class of techniques that group semantically related, rather than merely syntactically related, user queries in a query repository accumulated through interactions between users and the system. While there is a large body of previous work on document clustering, query clustering is a relatively new topic. The recent development of query clustering techniques has been driven by the requirements of modern web searching. Below we briefly analyze several motivations and applications of query clustering: FAQ detection, index-term selection and query reformulation.
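To make the distinction between syntactic and semantic relatedness concrete, the following is a minimal sketch and not the chapter's method. It assumes, purely for illustration, that semantic relatedness can be approximated by the overlap of documents users clicked after issuing each query; the abstract does not state this. The log contents, the function names (`keyword_sim`, `click_sim`, `cluster`) and the thresholds are all hypothetical.

```python
# Illustrative sketch only: the click log, similarity measures and thresholds
# below are assumptions, not taken from the chapter.

def jaccard(a, b):
    """Jaccard overlap of two collections, 0.0 when both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical query log: query string -> IDs of documents users clicked.
LOG = {
    "atomic bomb": {"d1", "d2", "d3"},
    "manhattan project": {"d1", "d2", "d4"},
    "bomb disposal jobs": {"d9"},
}

def keyword_sim(q1, q2):
    """Syntactic similarity: overlap of the query words themselves."""
    return jaccard(q1.split(), q2.split())

def click_sim(q1, q2, log):
    """Assumed proxy for semantic similarity: overlap of clicked documents."""
    return jaccard(log[q1], log[q2])

def cluster(queries, sim, threshold):
    """Group queries so that any pair with sim >= threshold shares a cluster
    (single-link grouping implemented with a small union-find)."""
    parent = {q: q for q in queries}

    def find(q):
        while parent[q] != q:
            parent[q] = parent[parent[q]]  # path compression
            q = parent[q]
        return q

    qs = list(queries)
    for i, a in enumerate(qs):
        for b in qs[i + 1:]:
            if sim(a, b) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for q in qs:
        groups.setdefault(find(q), []).append(q)
    return sorted(sorted(g) for g in groups.values())

by_words = cluster(LOG, keyword_sim, threshold=0.25)
by_clicks = cluster(LOG, lambda a, b: click_sim(a, b, LOG), threshold=0.5)
print(by_words)
print(by_clicks)
```

In this toy log, word overlap pairs "atomic bomb" with "bomb disposal jobs" because they share a word, while click overlap pairs "atomic bomb" with "manhattan project", which share no words but lead users to the same documents.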

Copyright information

© 2004 Kluwer Academic Publishers

About this chapter

Cite this chapter

Wen, JR., Zhang, HJ. (2004). Query Clustering in the Web Context. In: Clustering and Information Retrieval. Network Theory and Applications, vol 11. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-0227-8_7

  • DOI: https://doi.org/10.1007/978-1-4613-0227-8_7

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-7949-2

  • Online ISBN: 978-1-4613-0227-8

  • eBook Packages: Springer Book Archive