Why Don’t Romanians Have a Five O’clock Tea, Nor Halloween, But Have a Kind of Valentines Day?

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2008)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4919)

Abstract

Temporal information has recently received increasing attention in NLP applications. Building on general temporal theories, annotation schemes, and standards, this paper presents the steps taken to obtain a parallel English-Romanian corpus with temporal information marked in both languages. The automatic import of the TimeML markup from English to Romanian has a success rate of 96.53%. The paper analyzes the main situations encountered during the automatic import: perfect transfer, impossible transfer, transfer requiring amendments, and transfer affected by language-specific phenomena. This corpus study makes it possible to decide how import techniques can be used in the temporal domain.

Editor information

Alexander Gelbukh

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Forăscu, C. (2008). Why Don’t Romanians Have a Five O’clock Tea, Nor Halloween, But Have a Kind of Valentines Day?. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2008. Lecture Notes in Computer Science, vol 4919. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78135-6_7

  • DOI: https://doi.org/10.1007/978-3-540-78135-6_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78134-9

  • Online ISBN: 978-3-540-78135-6

  • eBook Packages: Computer Science, Computer Science (R0)