Word Clouds of Multiple Search Results

  • Conference paper
Multidisciplinary Information Retrieval (IRFC 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6653)

Abstract

Search engine result pages (SERPs) are known as the most expensive real estate on the planet. Most queries yield millions of organic search results, yet searchers seldom look beyond the first handful of results. To make things worse, different searchers with different query intents may issue the exact same query. An alternative to showing individual web pages summarized by snippets is to represent whole groups of results. In this paper we investigate whether word clouds can summarize groups of documents, e.g. to give a preview of the next SERP or of clusters of topically related documents. We experiment with three word cloud generation methods (full-text, query-biased, and anchor-text-based clouds) and evaluate them in a user study. Our findings are as follows. First, biasing the cloud towards the query does not help test persons better distinguish the relevance and topic of the search results, but test persons prefer query-biased clouds because differences between the clouds are emphasized. Second, anchor text clouds are to be preferred over full-text clouds, since anchor text contains fewer noisy words than the full text of documents. Third, we obtain moderately positive results on the relation between the selected word clouds and the underlying search results: there is exact correspondence in 70% of the subtopic matching judgments and in 60% of the relevance assessment judgments. Our initial experiments open up new possibilities to have SERPs reflect a far larger number of results by using word clouds to summarize groups of search results.
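
The abstract names the three cloud types but does not say how terms are weighted, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes plain term-frequency counting, a tiny hand-picked stopword list, and a fixed token window around query terms for the query-biased variant; the names `cloud_terms`, `tokenize`, `window`, and `top_k` are hypothetical. An anchor-text-style cloud could be approximated by passing the anchor texts of incoming links as `documents` instead of page text, which is likewise an assumption.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list, not the one used in the paper.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "for", "on", "us"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cloud_terms(documents, query=None, window=3, top_k=5):
    """Return the top_k (term, count) pairs for a group of documents.

    query=None: every non-stopword token is counted (full-text style cloud).
    query given: only tokens within `window` positions of a query term are
    counted, and the query terms themselves are left out (query-biased style).
    """
    query_terms = set(tokenize(query)) if query else set()
    counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        if query_terms:
            # Collect the positions that fall inside a window around any hit.
            keep = set()
            for i, tok in enumerate(tokens):
                if tok in query_terms:
                    keep.update(range(max(0, i - window),
                                      min(len(tokens), i + window + 1)))
            tokens = [tokens[i] for i in sorted(keep)]
        counts.update(t for t in tokens
                      if t not in STOPWORDS and t not in query_terms)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


if __name__ == "__main__":
    group = [
        "Jaguar cars are fast. The jaguar car brand is British.",
        "The jaguar is a big cat. Contact us for pricing.",
    ]
    print(cloud_terms(group))                            # full-text style
    print(cloud_terms(group, query="jaguar", window=2))  # query-biased style
```

In this sketch the query-biased variant drops words that never occur near a query term (such as page furniture like "contact" or "pricing"), which is one simple way to emphasize differences between clouds built for the same query.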

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kaptein, R., Kamps, J. (2011). Word Clouds of Multiple Search Results. In: Hanbury, A., Rauber, A., de Vries, A.P. (eds) Multidisciplinary Information Retrieval. IRFC 2011. Lecture Notes in Computer Science, vol 6653. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21353-3_7

  • DOI: https://doi.org/10.1007/978-3-642-21353-3_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21352-6

  • Online ISBN: 978-3-642-21353-3

  • eBook Packages: Computer Science, Computer Science (R0)
