It-TimeML and the Ita-TimeBank: Language Specific Adaptations for Temporal Annotation

Handbook of Linguistic Annotation

Abstract

This chapter presents the language-specific adaptation of the TimeML annotation scheme to Italian and the creation of the Ita-TimeBank, a language resource composed of two corpora manually annotated with temporal and event information. Particular attention is given to the methodology followed in developing the corpora: the annotation guidelines document the actual choices made during annotation and address language-specific issues while maintaining adherence to the specifications. The guidelines are supplemented with decision-tree-like instructions and tests that are grounded in linguistic analysis but theory-independent. The results obtained show the reliability both of the adaptation of the annotation specifications to Italian and of the methodology used to create the resources.

Notes

  1. TempEval 2007 [45]: http://www.timeml.org/tempeval/; TempEval 2010 [46]: http://www.timeml.org/tempeval2/; TempEval 2013 [42]: http://www.cs.york.ac.uk/semeval-2013/task1/.

  2. http://www.intratext.com/bsi/listapolirematiche/indalfa.htm.

  3. Using the full set of grammatical tense forms and viewpoints, the table would contain 64 combinations.

  4. firmato [signed] AFTER chiesto [asked].

  5. The corpora are named after the research institutes where they were initially developed: the “Center for the Evaluation of Language and Communication Technologies” (CELCT) and the “Istituto di Linguistica Computazionale “A. Zampolli” - CNR Pisa” (ILC), respectively.

  6. http://www.livememories.org.

  7. http://www.newsreader-project.eu/.

  8. Please note that in the CELCT Corpus the number of annotated temporal expressions is calculated on a total of 180,000 tokens (i.e. 525 files), while the number of events, signals and links is calculated on more than 90,000 tokens (i.e. 283 files).

  9. The DTD document is available at https://sites.google.com/site/ittimeml/documents.

  10. https://sites.google.com/site/eventievalita2014/.

  11. The CELCT score is computed on the basis of the Dice coefficient. We do not report it here because it is not directly comparable with the kappa score.

  12. http://www.evalita.it/2014.
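
Note 11 states that a Dice-based score and a kappa score are not directly comparable. As a rough illustration of why, the sketch below computes both measures on the same toy pair of annotations. The token labels, the variable names and the helper functions `dice` and `cohen_kappa` are hypothetical and are not taken from the chapter or from the corpora; Cohen's kappa is used here only as one common kappa variant.

```python
from collections import Counter


def dice(items_a, items_b):
    """Dice coefficient between the sets of items marked by two annotators."""
    a, b = set(items_a), set(items_b)
    if not a and not b:
        return 1.0  # both annotators marked nothing: treat as full agreement
    return 2 * len(a & b) / (len(a) + len(b))


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same sequence of items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same items")
    n = len(labels_a)
    # Observed agreement: share of items given the same label.
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: only one label was ever used
    return (observed - expected) / (1 - expected)


# Toy data (hypothetical): ten tokens, each labelled EVENT or O (no event).
ann_a = ["O", "EVENT", "O", "O", "EVENT", "O", "O", "EVENT", "O", "O"]
ann_b = ["O", "EVENT", "O", "O", "EVENT", "O", "O", "O", "EVENT", "O"]

# Dice looks only at the tokens that were marked as events.
events_a = {i for i, label in enumerate(ann_a) if label == "EVENT"}
events_b = {i for i, label in enumerate(ann_b) if label == "EVENT"}

dice_score = dice(events_a, events_b)
kappa_score = cohen_kappa(ann_a, ann_b)
print(f"Dice:  {dice_score:.3f}")
print(f"Kappa: {kappa_score:.3f}")
```

On this toy input the two measures give different values for the same annotations: Dice considers only the marked items and applies no chance correction, whereas kappa is computed over all items and discounts the agreement expected by chance.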

References

  1. AA.VV.: Funzioni delle frasi subordinative. In: L. Renzi, G. Salvi, A. Cardinaletti (eds.) Grande Grammatica Italiana di Consultazione. I sintagmi verbale, aggettivale e avverbiale. La Subordinazione, vol. II, pp. 633–853. Il Mulino (2001)

  2. Angeli, G., Uszkoreit, J.: Language-independent discriminative parsing of temporal expressions. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 83–92. Association for Computational Linguistics, Sofia (2013)

  3. Artstein, R., Poesio, M.: Inter-coder agreement for computational linguistics. Comput. Linguist. 34(4), 555–596 (2008)

  4. Bach, E.: The algebra of events. Linguist. Philos. 9, 5–16 (1986)

  5. Bartalesi Lenzi, V., Moretti, G., Sprugnoli, R.: CAT: the CELCT annotation tool. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation, pp. 333–338 (2012)

  6. Bertinetto, P.: Tempo, Aspetto e Azione nel verbo Italiano. Il sistema dell’indicativo. Accademia della Crusca, Firenze (1986)

  7. Bertinetto, P.: Le strutture tempo-aspettuali dell’italiano e dell’inglese a confronto. In: Mocciaro, A.G., Soravia, G. (eds.) L’Europa linguistica: contatti, contrasti, e affinità di lingue, pp. 49–68. SLI, Atti XXI Congresso Internazionale di Studi, Bulzoni (1992)

  8. Bertinetto, P.: Il verbo. In: Renzi, L., Salvi, G., Cardinaletti, A. (eds.) Grande Grammatica Italiana di Consultazione. I sintagmi verbale, aggettivale e avverbiale. La Subordinazione, vol. II, pp. 13–162. Il Mulino (2001)

  9. Bertinetto, P.M.: Sulle proprietà tempo-aspettuali dell’infinito in italiano. In: Atti del 35 Congresso Internazionale della Società di Linguistica Italiana (2001)

  10. Bittar, A.: Annotation of events and temporal expressions in French texts. In: Proceedings of the Joint conference of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, pp. 48–51 (2009)

  11. Bittar, A., Amsili, P., Denis, P., Danlos, L.: French TimeBank: an ISO-TimeML annotated reference corpus. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 130–134. Association for Computational Linguistics, Portland (2011)

  12. Caselli, T.: Time, events and temporal relations: an empirical model for temporal processing of Italian texts. Ph.D. thesis, Dept. of Linguistics, University of Pisa (2009)

  13. Caselli, T., Lenzi, V.B., Sprugnoli, R., Pianta, E., Prodanof, I.: Annotating events, temporal expressions and relations in Italian: The It-TimeML experience for the Ita-TimeBank. In: Proceedings of the Fifth Linguistic Annotation Workshop, pp. 143–151 (2011)

  14. Caselli, T., Llorens, H., Navarro-Colorado, B., Saquete, E.: Data-driven approach using semantics for recognizing and classifying TimeML events in Italian. In: Proceedings of the International Conference Recent Advances in Natural Language Processing 2011, pp. 533–538. RANLP 2011 Organising Committee, Hissar (2011)

  15. Caselli, T., Sprugnoli, R.: It-TimeML - TimeML Annotation Guidelines for Italian, v. 1.4. Technical report, VU Amsterdam and Fondazione Bruno Kessler (2015)

  16. Caselli, T., Sprugnoli, R., Speranza, M., Monachini, M.: Eventi. EValuation of events and temporal INformation at Evalita 2014. In: Bosco, C., Dell’Orletta, F., Montemagni, S., Simi, M. (eds.) Evaluation of Natural Language and Speech Tools for Italian, pp. 27–34. Pisa University Press, Pisa (2014)

  17. Costa, F., Branco, A.: TimeBankPT: a TimeML annotated corpus of Portuguese. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation, pp. 3727–3734 (2012)

  18. Eshaghzadeh Torbati, M., Ghassem-sani, G., Mirroshandel, S.A., Yaghoobzadeh, Y., Karimi Hosseini, N.: Temporal relation classification in Persian and English contexts. In: Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013, pp. 261–269. INCOMA Ltd. Shoumen (2013)

  19. Forascu, C.: Why don’t Romanians have a five O’clock tea, nor Halloween, but have a kind of valentines day? In: 9th International Computational Linguistics and Intelligent Text Processing Conference (CICLing 2008). LNCS, vol. 4919, pp. 73–84. Springer (2008)

  20. Group, T.W.: TimeML Annotation Guidelines Version 1.3. Brandeis University, Boston (2008)

  21. Im, S., You, H., Jang, H., Nam, S., Shin, H.: KTimeML: specification of temporal and event expressions in Korean text. In: Proceedings of the 7th Workshop on Asian Language Resources, pp. 115–122. Association for Computational Linguistics (2009)

  22. ISO, S.W.G.: ISO DIS 24617–1: 2008 Language resource management - Semantic annotation framework - Part 1: Time and events. ISO Central Secretariat, Geneva (2008)

  23. Linguistic Data Consortium: ACE (Automatic Content Extraction) English Annotation Guidelines for Entities, Version 6.6 2008.06.13 (2008)

  24. Magnini, B., Pianta, E., Girardi, C., Negri, M., Romano, L., Speranza, M., Bartalesi Lenzi, V., Sprugnoli, R.: I-CAB: The Italian content annotation bank. In: Proceedings of the Fifth International Conference on Language Resources and Evaluation, pp. 963–968 (2006)

  25. Manfredi, G., Strötgen, J., Zell, J., Gertz, M.: HeidelTime at EVENTI: tuning Italian resources and addressing TimeML empty tags. In: Proceedings of the 4th Evaluation Campaign of Natural Language Processing and Speech tools for Italian (EVALITA 2014), pp. 39–43. Pisa University Press, Pisa (2014)

  26. Mani, I.: Chronoscopes: a theory of underspecified temporal representation. In: Schilder, F., Katz, G., Pustejovsky, J. (eds.) Annotating, Extracting and Reasoning about Time and Events. LNAI, pp. 127–139. Springer, Berlin (2007)

  27. Marinelli, R., Biagini, L., Bindi, R., Goggi, S., Monachini, M., Orsolini, P., Picchi, E., Rossi, S., Calzolari, N., Zampolli, A.: The Italian PAROLE corpus: an overview. In: Computational Linguistics in Pisa, XVI-XVII, IEPI., I, pp. 401–421 (2003)

  28. Miller, T.A., Bethard, S., Dligach, D., Pradhan, S., Lin, C., Savova, G.K.: Discovering temporal narrative containers in clinical text. In: Proceedings of the Workshop on Biomedical Natural Language Processing, pp. 18–26 (2013)

  29. Mirza, P., Minard, A.L.: FBK-HLT-time: a complete Italian temporal processing system for EVENTI-EVALITA 2014. In: Proceedings of the 4th Evaluation Campaign of Natural Language Processing and Speech tools for Italian (EVALITA 2014), pp. 44–49. Pisa University Press, Pisa (2014)

  30. Montemagni, S., Barsotti, F., Battista, M., Calzolari, N., Corazzari, O., Lenci, A., Pirrelli, V., Zampolli, A., Fanciulli, F., Massetani, M., Raffaelli, R., Basili, R., Pazienza, M.T., Saracino, D., Zanzotto, F., Mana, N., Pianesi, F., Delmonte, R.: The syntactic-semantic treebank of Italian. An overview. In: Computational Linguistics in Pisa, special Issue XVIII-XIX, pp. 461–493 (2003)

  31. Prasad, R., Dinesh, N., Lee, A., Miltsakaki, E., Robaldo, L., Joshi, A., Webber, B.: The Penn Discourse TreeBank 2.0. In: Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08). European Language Resources Association (ELRA), Marrakech (2008)

  32. Pustejovsky, J., Castaño, J.M., Ingria, R., Saurí, R., Gaizauskas, R., Setzer, A., Katz, G.: TimeML: Robust specification of event and temporal expressions in text. In: Proceedings of the Fifth International Workshop on Computational Semantics (2003)

  33. Pustejovsky, J., Littman, J., Saurí, R., Verhagen, M.: TimeBank 1.2 Documentation (2006)

  34. Pustejovsky, J., Stubbs, A.: Increasing informativeness in temporal annotation. In: Proceedings of the Fifth Linguistic Annotation Workshop, pp. 152–160. Association for Computational Linguistics (2011)

  35. Pustejovsky, J., Stubbs, A.: Natural Language Annotation for Machine Learning. O’Reilly Media Inc., Sebastopol (2012)

  36. Robaldo, L., Caselli, T., Grella, M.: Rule-based creation of TimeML documents from dependency trees. In: Pirrone, R., Sorbello, F. (eds.) AI* IA 2011: Artificial Intelligence Around Man and Beyond, pp. 389–394. Springer, Heidelberg (2011)

  37. Saurí, R.: Annotating Temporal Relations in Catalan and Spanish TimeML Annotation Guidelines (2010)

  38. Saurí, R., Pustejovsky, J.: Annotating Events in Catalan - TimeML Annotation Guidelines (Version TempEval-2010) (2009)

  39. Saurí, R., Pustejovsky, J.: Annotating Time Expressions in Catalan - TimeML Annotation Guidelines (Version TempEval-2010) (2010)

  40. Smith, C.S.: The Parameter of Aspect. Kluwer Academic Publishers, Dordrecht (1997)

  41. Stubbs, A.: MAE and MAI: lightweight annotation and adjudication tools. In: Proceedings of the Fifth Linguistic Annotation Workshop, pp. 129–133. Association for Computational Linguistics (2011)

  42. UzZaman, N., Llorens, H., Derczynski, L., Allen, J., Verhagen, M., Pustejovsky, J.: Semeval-2013 task 1: Tempeval-3: evaluating time expressions, events, and temporal relations. In: Second Joint Conference on Lexical and Computational Semantics (*SEM). Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013), pp. 1–9. Association for Computational Linguistics, Atlanta (2013)

  43. Vanelli, L.: La concordanza dei tempi. In: Renzi, L., Salvi, G., Cardinaletti, A. (eds.) Grande Grammatica Italiana di Consultazione. I sintagmi verbale, aggettivale e avverbiale. La Subordinazione, vol. II, pp. 611–632. Il Mulino (2001)

  44. Verhagen, M.: The Brandeis annotation tool. In: Proceedings of the Seventh International Conference on Language Resources and Evaluation, pp. 3638–3643. European Languages Resources Association (ELRA), Valletta (2010). ACL Anthology Identifier: L10-1513

  45. Verhagen, M., Gaizauskas, R., Schilder, F., Hepple, M., Katz, G., Pustejovsky, J.: Semeval-2007 task 15: tempeval temporal relation identification. In: Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007), pp. 75–80 (2007)

  46. Verhagen, M., Saurí, R., Caselli, T., Pustejovsky, J.: Semeval-2010 task 13: Tempeval-2. In: Proceedings of the 5th International Workshop on Semantic Evaluation, pp. 57–62. ACL, Uppsala (2010)

  47. Yaghoobzadeh, Y., Ghassem-Sani, G., Mirroshandel, S.A., Eshaghzadeh, M.: ISO-TimeML event extraction in Persian text. In: Proceedings of the 24th International Conference on Computational Linguistics, pp. 2931–2944 (2012)

Acknowledgements

The development of the ILC corpus has been supported by two grants from the Istituto di Linguistica Computazionale - CNR of Pisa, “Disegno di Standard e Costruzione di Risorse Linguistico Computazionali”, IC-P02-ILC-CNR and “Risorse e Tecnologie Linguistiche: modelli, metodi di sviluppo, applicazioni, disegno di strategie internazionali”, IC.P02.005. Assistance provided by Irina Prodanof and Nicoletta Calzolari was greatly appreciated.

The development of the CELCT corpus has been supported by the LiveMemories project (Active Digital Memories of Collective Life), funded by the Autonomous Province of Trento under the Major Projects 2006 research program, and by the European Union’s 7th Framework Programme via the NewsReader Project (ICT-316404). Emanuele Pianta and Valentina Bartalesi Lenzi made invaluable contributions to the creation of the CELCT Corpus. Special thanks go to Giovanni Moretti and Alessandro Marchetti, who collaborated with us in processing and annotating the CELCT corpus.

Corresponding author

Correspondence to Tommaso Caselli.

Copyright information

© 2017 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Caselli, T., Sprugnoli, R. (2017). It-TimeML and the Ita-TimeBank: Language Specific Adaptations for Temporal Annotation. In: Ide, N., Pustejovsky, J. (eds) Handbook of Linguistic Annotation. Springer, Dordrecht. https://doi.org/10.1007/978-94-024-0881-2_36

  • DOI: https://doi.org/10.1007/978-94-024-0881-2_36

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-024-0879-9

  • Online ISBN: 978-94-024-0881-2

  • eBook Packages: Social Sciences (R0)
