Abstract
Search engine result pages (SERPs) are known as the most expensive real estate on the planet. Most queries yield millions of organic search results, yet searchers seldom look beyond the first handful of results. To make things worse, different searchers with different query intents may issue the exact same query. An alternative to showing individual web pages summarized by snippets is to represent whole groups of results. In this paper we investigate whether word clouds can be used to summarize groups of documents, e.g. to give a preview of the next SERP, or of clusters of topically related documents. We experiment with three word cloud generation methods (full-text, query-biased, and anchor-text-based clouds) and evaluate them in a user study. Our findings are as follows. First, biasing the cloud towards the query does not help test persons better distinguish the relevance and topic of the search results, but test persons prefer query-biased clouds because differences between the clouds are emphasized. Second, anchor text clouds are to be preferred over full-text clouds, since anchor text contains fewer noisy words than the full text of documents. Third, we obtain moderately positive results on the relation between the selected word clouds and the underlying search results: there is exact correspondence in 70% of the subtopic matching judgments and in 60% of the relevance assessment judgments. Our initial experiments open up new possibilities to have SERPs reflect a far larger number of results by using word clouds to summarize groups of search results.
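As a rough illustration of the difference between a full-text cloud and a query-biased cloud, the following minimal Python sketch weights terms over a group of documents. It is not the method used in the paper: the stopword list, the raw-frequency weighting, the fixed token window around query terms, and all function names are assumptions made for this example. An anchor text cloud could, under the same assumptions, apply the same counting to the anchor text of incoming links instead of the document text.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "for", "on"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def full_text_cloud(docs, size=5):
    """Weight terms by raw frequency over all documents in the group."""
    counts = Counter(
        t for d in docs for t in tokenize(d) if t not in STOPWORDS
    )
    return counts.most_common(size)


def query_biased_cloud(docs, query, size=5, window=3):
    """Count only terms that occur within `window` tokens of a query term.

    The query terms themselves are left out, so the cloud shows what
    surrounds the query rather than the query itself.
    """
    query_terms = set(tokenize(query))
    counts = Counter()
    for d in docs:
        tokens = tokenize(d)
        for i, t in enumerate(tokens):
            if t in query_terms:
                lo, hi = max(0, i - window), i + window + 1
                counts.update(
                    w for w in tokens[lo:hi]
                    if w not in STOPWORDS and w not in query_terms
                )
    return counts.most_common(size)


if __name__ == "__main__":
    group = [
        "Jaguar cars are fast luxury cars",
        "A jaguar cat hunts in the rainforest",
    ]
    print(full_text_cloud(group))
    print(query_biased_cloud(group, "jaguar", window=2))
```

In this sketch the query-biased variant drops terms that never appear near the query, which is one simple way to emphasize differences between groups of results; other weighting schemes would be equally plausible.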
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kaptein, R., Kamps, J. (2011). Word Clouds of Multiple Search Results. In: Hanbury, A., Rauber, A., de Vries, A.P. (eds) Multidisciplinary Information Retrieval. IRFC 2011. Lecture Notes in Computer Science, vol 6653. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21353-3_7
DOI: https://doi.org/10.1007/978-3-642-21353-3_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21352-6
Online ISBN: 978-3-642-21353-3
eBook Packages: Computer Science, Computer Science (R0)