Which Words Do You Remember? Temporal Properties of Language Use in Digital Archives

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7489)

Abstract

Knowing how terms behave in written texts can help us tailor models, algorithms, and resources to improve access to digital libraries and to answer information needs in archives that span long periods. In this paper we investigate the behavior of written English in blogs in comparison to traditional texts from the New York Times, The Times Archive, and the British National Corpus. We show that user-generated content, like spoken content, differs in its characteristics from ‘professionally’ written text and exhibits more dynamic behavior.

This work is partly funded by the European Commission under ARCOMEM (ICT 270239).

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tahmasebi, N., Gossen, G., Risse, T. (2012). Which Words Do You Remember? Temporal Properties of Language Use in Digital Archives. In: Zaphiris, P., Buchanan, G., Rasmussen, E., Loizides, F. (eds) Theory and Practice of Digital Libraries. TPDL 2012. Lecture Notes in Computer Science, vol 7489. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33290-6_4

  • DOI: https://doi.org/10.1007/978-3-642-33290-6_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33289-0

  • Online ISBN: 978-3-642-33290-6

  • eBook Packages: Computer Science, Computer Science (R0)
