Query Clustering in the Web Context

Wen, Ji-Rong; Zhang, Hong-Jiang

doi:10.1007/978-1-4613-0227-8_7

Part of the book series: Network Theory and Applications (NETA, volume 11)

Abstract

Query clustering is a class of techniques for grouping semantically related, rather than merely syntactically related, user queries in a query repository accumulated through interactions between users and the system. While there is a large body of previous work on document clustering, query clustering is a relatively new topic. The recent development of query clustering techniques has been driven by the requirements of modern web searching. Below we briefly analyze several motivations and applications of query clustering: FAQ detection, index-term selection and query reformulation.

Author information

Authors and Affiliations

Beijing Sigma Center, Microsoft Research Asia, 3F, No. 49, Zhichun Road, Haidian District, Beijing, China, 100080
Ji-Rong Wen & Hong-Jiang Zhang

About this chapter

Cite this chapter

Wen, JR., Zhang, HJ. (2004). Query Clustering in the Web Context. In: Clustering and Information Retrieval. Network Theory and Applications, vol 11. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-0227-8_7

DOI: https://doi.org/10.1007/978-1-4613-0227-8_7
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-7949-2
Online ISBN: 978-1-4613-0227-8
eBook Packages: Springer Book Archive