Corpus Clouds - Facilitating Text Analysis by Means of Visualizations

  • Conference paper
Human Language Technology. Challenges for Computer Science and Linguistics (LTC 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6562)

Abstract

Large text corpora are a primary language resource for the human-driven analysis of linguistic phenomena. With the ever-increasing amount of data, it is vital to find ways to help people understand the data, and visualization techniques provide one way to do that. Corpus Clouds is a program that provides visualizations of different types of frequency information, dynamically derived from a corpus via a standard query system and integrated with a standard keyword-in-context (KWIC) display. We apply established principles from information visualization to provide dynamic, interactive representations of the query results. We discuss the selected design principles and alternatives to the implementation, and give a preview of what other types of information connected to corpora can be visualized in similar ways. Corpus Clouds can thus be seen as an answer to the call by Collins et al. [1] to design new visualization tools for linguistic data in a principled way.
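
As a rough illustration of the two views the abstract mentions, the following sketch builds a KWIC listing and maps word frequencies to cloud font sizes. It is not the authors' implementation: the function names (`tokenize`, `kwic`, `cloud_sizes`), the whitespace-level tokenizer, the logarithmic scaling, and the point-size range are all assumptions made for the example, and a plain in-memory token list stands in for a real corpus query system.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplifying assumption)."""
    return re.findall(r"\w+", text.lower())


def kwic(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for every hit of keyword."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


def cloud_sizes(counts, min_pt=10, max_pt=40):
    """Map each word's frequency to a font size on a logarithmic scale.

    The log scale is an illustrative choice: word frequencies are heavily
    skewed, so linear scaling would make most words the minimum size.
    """
    if not counts:
        return {}
    lo = math.log(min(counts.values()))
    hi = math.log(max(counts.values()))
    span = hi - lo
    sizes = {}
    for word, count in counts.items():
        if span == 0:
            # All words are equally frequent, so they all get the largest size.
            sizes[word] = max_pt
        else:
            fraction = (math.log(count) - lo) / span
            sizes[word] = round(min_pt + fraction * (max_pt - min_pt))
    return sizes


if __name__ == "__main__":
    sample = tokenize("The cat sat on the mat and the dog sat too")
    for left, hit, right in kwic(sample, "sat", width=2):
        print(f"{left:>12} [{hit}] {right}")
    print(cloud_sizes(Counter(sample)))
```

In an interactive tool, the frequency counts would be recomputed from the current query result, so that the cloud and the KWIC lines always describe the same set of hits.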

References

  1. Collins, C., Penn, G., Carpendale, S.: Interactive Visualization for Computational Linguistics. In: ACL 2008: HLT Tutorials (2008), http://www.cs.utoronto.ca/~ccollins/acl2008-vis.pdf

  2. Card, S.K., Mackinlay, J., Shneiderman, B.: Readings in Information Visualization: Using Vision to Think. Academic Press, San Diego (1999)

  3. Ware, C.: Information Visualization: Perception for Design, 2nd edn. Elsevier, Inc., San Francisco (2004)

  4. Collins, C.: A Critical Review of Information Visualizations for Natural Language. PhD qualifying exam paper, University of Toronto (2005), http://www.cs.utoronto.ca/~ccollins/publications/docs/depthPaper.pdf

  5. Wattenberg, M., Viégas, F.B.: The Word Tree, an Interactive Visual Concordance. IEEE Trans. on Visualization and Computer Graphics 14(6), 1221–1228 (2008)

  6. Hearst, M.A.: Tilebars: Visualization of Term Distribution Information in Full Text Information Access. In: CHI 1995, Denver, Colorado, pp. 56–66 (1995)

  7. Wattenberg, M.: Arc Diagrams: Visualizing Structure in Strings. In: IEEE Symposium on Information Visualization, pp. 110–116. IEEE Computer Society Press, Washington (2002)

  8. Widdows, D., Cederberg, S., Dorow, B.: Visualisation Techniques for Analysing Meaning. In: Sojka, P., Kopeček, I., Pala, K. (eds.) TSD 2002. LNCS (LNAI), vol. 2448, pp. 107–114. Springer, Heidelberg (2002)

  9. DeCamp, P., Frid-Jimenez, A., Guiness, J., Roy, D.: Gist Icons: Seeing Meaning in Large Bodies of Literature. In: IEEE Symposium on Information Visualization. IEEE Computer Society Press, Washington (2005)

  10. Collins, C.: Docuburst: Radial Space-filling Visualization of Document Content. Technical Report KMDI-TR-2007-1, Knowledge Media Design Institute, University of Toronto (2007)

  11. Rohrer, R.M., Sibert, J.L., Ebert, D.S.: The Shape of Shakespeare: Visualizing Text Using Implicit Surfaces. In: IEEE Symposium on Information Visualization, pp. 121–129. IEEE Computer Society Press, Washington (1998)

  12. TAPoR, http://portal.tapor.ca/

  13. MONK, http://www.monkproject.org/

  14. Shneiderman, B.: The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations. In: IEEE Symposium on Visual Languages, pp. 336–343. IEEE Computer Society Press, Washington (1996)

  15. Tufte, E.: Beautiful Evidence. Graphics Press, Cheshire (2006)

  16. Kennedy, G.: An Introduction to Corpus Linguistics. Longman, London (1998)

  17. Christ, O.: A Modular and Flexible Architecture for an Integrated Corpus Query System. In: 3rd Conference on Computational Lexicography and Text Research, Budapest, pp. 23–32 (1994)

  18. Scott, M.: Developing WordSmith. In: Scott, M., Pérez-Paredes, P., Sánchez-Hernández, P. (eds.) Software-aided Analysis of Language, special issue of International Journal of English Studies, vol. 8(1), pp. 153–172 (2008)

  19. Kilgarriff, A., Rychly, P., Smrz, P., Tugwell, D.: The Sketch Engine. In: EURALEX 2004, Lorient, pp. 105–116 (2004)

  20. Sokirko, A.: DDC – A Search Engine for Linguistically Annotated Corpora. In: Dialogue (2003)

  21. Lemnitzer, L., Zinsmeister, H.: Korpuslinguistik. Eine Einführung. Gunter Narr, Tübingen (2006)

  22. Müller, B.: Fast Faust (2000), http://www.esono.com/boris/projects/faust/

  23. Zipf, G.K.: Human Behavior and the Principle of Least-effort. Addison-Wesley, Cambridge (1949)

  24. Hearst, M.A., Rosner, D.: Tag Clouds: Data Analysis Tool or Social Signaller? In: 41st Annual Hawaii International Conference on System Sciences, p. 160. IEEE Computer Society, Washington (2008)

  25. Hassan-Montero, Y., Herrero-Solana, V.: Improving Tag-clouds as Visual Information Retrieval Interfaces. In: InSciT 2006, Mérida (2006)

  26. Kaser, O., Lemire, D.: Tag-Cloud Drawing: Algorithms for Cloud Visualization. In: WWW 2007 Workshop on Tagging and Metadata for Social Information Organization, Banff, Alberta (2007)

  27. Google Visualization API, http://code.google.com/apis/visualization/documentation/gallery.html

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Culy, C., Lyding, V. (2011). Corpus Clouds - Facilitating Text Analysis by Means of Visualizations. In: Vetulani, Z. (ed.) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2009. Lecture Notes in Computer Science, vol 6562. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20095-3_32

  • DOI: https://doi.org/10.1007/978-3-642-20095-3_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20094-6

  • Online ISBN: 978-3-642-20095-3

  • eBook Packages: Computer Science, Computer Science (R0)
