Word Clouds of Multiple Search Results

Kaptein, Rianne; Kamps, Jaap

doi:10.1007/978-3-642-21353-3_7

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6653)

Included in the following conference series:

Information Retrieval Facility Conference

Abstract

Search engine result pages (SERPs) are known as the most expensive real estate on the planet. Most queries yield millions of organic search results, yet searchers seldom look beyond the first handful of results. To make things worse, different searchers with different query intents may issue the exact same query. An alternative to showing individual web pages summarized by snippets is to represent whole groups of results. In this paper we investigate whether we can use word clouds to summarize groups of documents, e.g. to give a preview of the next SERP, or of clusters of topically related documents. We experiment with three word cloud generation methods (full-text, query-biased and anchor-text-based clouds) and evaluate them in a user study. Our findings are as follows. First, biasing the cloud towards the query does not help test persons better distinguish the relevance and topic of the search results, but test persons prefer query-biased clouds because differences between the clouds are emphasized. Second, anchor text clouds are to be preferred over full-text clouds, since anchor text contains fewer noisy words than the full text of documents. Third, we obtain moderately positive results on the relation between the selected word clouds and the underlying search results: there is exact correspondence in 70% of the subtopic matching judgments and in 60% of the relevance assessment judgments. Our initial experiments open up new possibilities to have SERPs reflect a far larger number of results by using word clouds to summarize groups of search results.

Author information

Authors and Affiliations

Archives and Information Studies, University of Amsterdam, The Netherlands
Rianne Kaptein & Jaap Kamps
ISLA, Informatics Institute, University of Amsterdam, The Netherlands
Jaap Kamps

Editor information

Editors and Affiliations

Information Retrieval Facility, Donau City Str. 1, 1220, Vienna, Austria
Allan Hanbury
Vienna University of Technology, Favoritenstr. 9-11/188, 1040, Vienna, Austria
Andreas Rauber
CWI, Science Park 123, 1098 XG, Amsterdam, The Netherlands
Arjen P. de Vries

About this paper

Cite this paper

Kaptein, R., Kamps, J. (2011). Word Clouds of Multiple Search Results. In: Hanbury, A., Rauber, A., de Vries, A.P. (eds) Multidisciplinary Information Retrieval. IRFC 2011. Lecture Notes in Computer Science, vol 6653. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21353-3_7

DOI: https://doi.org/10.1007/978-3-642-21353-3_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21352-6
Online ISBN: 978-3-642-21353-3
eBook Packages: Computer Science, Computer Science (R0)
