Morphology Within the Multi-layered Annotation Scenario of the Prague Dependency Treebank

  • Conference paper
Systems and Frameworks for Computational Morphology (SFCM 2015)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 537)

Abstract

Morphological annotation constitutes a separate layer in the multi-layered annotation scenario of the Prague Dependency Treebank. At this layer, morphological categories expressed by a word form are captured in a positional part-of-speech tag. According to the Praguian approach based on the relation between form and function, functions (meanings) of morphological categories are represented as well, namely as grammateme attributes at the deep-syntactic (tectogrammatical) layer of the treebank.

In the present paper, we first describe the role of morphology in the Prague Dependency Treebank, and then outline several recent topics based on Praguian morphology: named entity recognition in Czech, formeme attributes encoding morpho-syntactic information in a dependency-based machine translation system, and the development of a lexical database of derivational relations based partially on information provided by the morphological analyser.
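As a rough illustration of what a positional part-of-speech tag looks like in practice, the sketch below splits a 15-position tag into named categories. The position names and the helper `decode_tag` are assumptions made for this example (they follow the commonly documented layout of the PDT tagset, not any code from the paper), and the sample tag is only an illustrative noun tag.

```python
# Names of the 15 tag positions; an assumption of this sketch that should be
# checked against the tagset documentation before any real use.
POSITIONS = (
    "pos", "subpos", "gender", "number", "case",
    "possgender", "possnumber", "person", "tense", "grade",
    "negation", "voice", "reserve1", "reserve2", "variant",
)


def decode_tag(tag: str) -> dict[str, str]:
    """Map each filled position of a positional tag to its category name."""
    if len(tag) != len(POSITIONS):
        raise ValueError(f"expected {len(POSITIONS)} positions, got {len(tag)}")
    # A hyphen marks a category that does not apply to the word form.
    return {name: value for name, value in zip(POSITIONS, tag) if value != "-"}


# Hypothetical sample: a feminine singular nominative noun, affirmative.
print(decode_tag("NNFS1-----A----"))
```

Keeping the position names in a single tuple means that an extended tagset (such as the 16-position variant mentioned in the notes) could be handled by swapping in a longer tuple rather than changing the function.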

Notes

  1. The issues of morphological synthesis, generation, etc. go beyond the scope of the paper; see Hajič [11] for a comprehensive description of the computational approach to Czech morphology, including formal definitions.

  2. The semi-supervised version of Morče was published under the Compost project (http://ufal.mff.cuni.cz/legacy/compost/cz/). An implementation of the averaged perceptron algorithm was also released in the Featurama project (http://sourceforge.net/projects/featurama/).

  3. http://korpus.cz/.

  4. http://ufal.mff.cuni.cz/rest/CAC/tOrig.html.

  5. http://ufal.mff.cuni.cz/pdt1/Morphology_and_Tagging/Doc/compact_tags.pdf.

  6. The tag of the verb form is composed according to the pattern for present indicative forms: VPnpa (i.e., verb – indicative present – number – person – negation).

  7. http://ufal.mff.cuni.cz/pdt1/Morphology_and_Tagging/Doc/hmptagqr.pdf.

  8. An extended version with 16 positions was used in corpora of the Czech National Corpus. The 16th position is associated with the category of aspect, which, when the 15-position tag is used, is encoded in the technical lemma suffix described below.

  9. Generally speaking, there are typical nominal categories, such as case and gender, which do not combine with verbal categories, such as person, tense, mood, and voice. However, for instance, some Czech verb forms (past participle, transgressive) are marked for gender.

  10. With pluralia tantum nouns and other words with an incomplete or deficient paradigm, other forms are used instead of the canonical one; for instance, the plurale tantum kalhoty ‘trousers’ is assigned the nominative plural form as a lemma.

  11. The present paper draws a terminological distinction between a level as a concept of the theoretical framework of FGD and a layer as a part of the annotation scenario of PDT.

  12. An opposite perspective, i.e. the text as a surface string which is assigned a deeper analysis, is justifiable as well; however, we stick to the perspective of the text as a basis on top of which analyses are built.

  13. There are considerable similarities in dealing with morphology between FGD (and PDT) and the Meaning-Text Theory (MTT). Since MTT distinguishes even more levels than FGD, the morphological level in FGD corresponds mainly to the deep-morphological representation in MTT but shares several features with the surface-syntactic representation of this framework [34]. The function of morphological categories is then a part of the deep-syntactic representation in MTT (the attributes are called grammemes in MTT and grammatemes in FGD); see Žabokrtský [74] for a more detailed comparison of these frameworks.

  14. A preliminary test version of the treebank (PDT 0.5), containing 450 thousand tokens in 26 thousand sentences, was compiled for the Summer Workshop on Language Engineering at Johns Hopkins University in Baltimore in 1998.

  15. Syntactically annotated PDT data of the individual versions are publicly accessible for searching via the PML Tree Query environment (https://lindat.mff.cuni.cz/services/pmltq/; [38]).

  16. https://bitbucket.org/jhana/feat-morph/wiki/Home.

  17. These derivations are subtypes of lexical derivation according to Kuryłowicz [30].

  18. Negated verb forms are analysed differently at the tectogrammatical layer, namely, they are decomposed into two nodes; cf. the verbal node with the lemma lze and the node with the artificial lemma #Neg representing the negation in Fig. 2.

  19. They belong to syntactic derivation as defined by Kuryłowicz [30].

  20. http://nl.ijs.si/sdt/.

  21. http://nlp.perseus.tufts.edu/syntax/treebank/.

  22. Stanford Universal Dependencies, the Interset interlingua (mentioned in Sect. 2.2), and Google universal POS tags [41] served as a basis for the annotation scheme of the Universal Dependencies treebank project, the current version of which (Universal Dependencies 1.1; [1]) contains dependency-annotated data for 18 languages including Czech.

  23. See http://ufal.mff.cuni.cz/treex.

  24. http://qtleap.eu/.

  25. A limited derivational analysis is also carried out by the ajka analyser (see Sect. 2.1).

  26. In Czech linguistics, derivation is separated from inflectional morphology, being described as the core part of word-formation, which is kept apart from the grammatical module; only inflectional morphology and syntax are supposed to constitute the grammatical structure of Czech.

  27. http://ufal.mff.cuni.cz/derinet.

  28. http://ufal.mff.cuni.cz/derinet/viewer.

  29. For instance, one of the changes occurring during derivation of the adjective sněžný ‘snowy’ from the noun sníh ‘snow’ is present in the inflectional paradigm of the noun (sníh.nom.sg – sněhu.gen.sg).

  30. One of the current mistakes is documented in the tree in Fig. 3: the noun nestandardnost ‘non-standardness’ should be captured as derived either from the noun standardnost ‘standardness’, or from the adjective nestandardní ‘non-standard’ (which is not included in the network, though).

  31. For instance, the suffix -ka is used both in diminutives and female nouns (e.g. skříň ‘cupboard’ > skříňka ‘small cupboard’, učitel ‘teacher’ > učitelka ‘female teacher’), and, on the other hand, several meanings are expressed by formally different affixes in Czech (e.g. female nouns are derived by the suffixes -ka, -yně, -ice, -ovna and several others).

References

  1. Agić, Ž., Aranzabe, M.J., Atutxa, A., Bosco, C., Choi, J., de Marneffe, M.-C., Dozat, T., Farkas, R., Foster, J., Ginter, F., Goenaga, I., Gojenola, K., Goldberg, Y., Hajič, J., Johannsen, A.T., Kanerva, J., Kuokkala, J., Laippala, V., Lenci, A., Lindén, K., Ljubešić, N., Lynn, T., Manning, C., Martínez, H.A., McDonald, R., Missilä, A., Montemagni, S., Nivre, J., Nurmi, H., Osenova, P., Petrov, S., Piitulainen, J., Plank, B., Prokopidis, P., Pyysalo, S., Seeker, W., Seraji, M., Silveira, N., Simi, M., Simov, K., Smith, A., Tsarfaty, R., Vincze, V., Zeman, D.: Universal Dependencies 1.1. LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague (2015). http://hdl.handle.net/11234/LRT-1478

  2. Baayen, R.H., Piepenbrock, R., Gulikers, L.: The CELEX lexical database (release 2), Data/software. Linguistic Data Consortium, Philadelphia (1995)

  3. Bejček, E., Hajič, J., Panevová, J., Mírovský, J., Spoustová, J., Štěpánek, J., Straňák, P., Šidák, P., Vimmrová, P., Šťastná, E., Ševčíková, M., Smejkalová, L., Homola, P., Popelka, J., Lopatková, M., Hrabalová, L., Klyueva, N., Žabokrtský, Z.: Prague Dependency Treebank 2.5. LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague (2011). http://hdl.handle.net/11858/00-097C-0000-0006-DB11-8

  4. Bejček, E., Hajičová, E., Hajič, J., Jínová, P., Kettnerová, V., Kolářová, V., Mikulová, M., Mírovský, J., Nedoluzhko, A., Panevová, J., Poláková, L., Ševčíková, M., Štěpánek, J., Zikánová, Š.: Prague Dependency Treebank 3.0. LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague (2013). http://hdl.handle.net/11858/00-097C-0000-0023-1AAF-3

  5. Böhmová, A., Hajič, J., Hajičová, E., Hladká, B.: The Prague dependency treebank: a three-level annotation scenario. In: Abeillé, A. (ed.) Treebanks: Building and Using Syntactically Annotated Corpora, pp. 103–128. Kluwer Academic Publishers, Dordrecht (2003)

  6. Collins, M.: Discriminative training methods for hidden Markov models: theory and experiments with perceptron algorithms. In: Proceedings of the ACL 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002), vol. 10, pp. 1–8. Association for Computational Linguistics, Philadelphia (2002)

  7. Dušek, O., Žabokrtský, Z., Popel, M., Majliš, M., Novák, M., Mareček, D.: Formemes in English-Czech deep syntactic MT. In: Proceedings of the Seventh ACL Workshop on Statistical Machine Translation, pp. 267–274. Association for Computational Linguistics, Montréal (2012)

  8. Feldman, A., Hana, J.: A Resource-Light Approach to Morpho-Syntactic Tagging. Rodopi, Amsterdam (2010)

  9. Fleischman, M., Hovy, E.: Fine-grained classification of named entities. In: Proceedings of the 19th International Conference on Computational Linguistics (COLING), vol. I, pp. 267–273. Association for Computational Linguistics, Taipei (2002)

  10. Giesbrecht, E., Evert, S.: Part-of-speech tagging - a solved task? an evaluation of POS taggers for the Web as corpus. In: Proceedings of the 5th Web as Corpus Workshop (WAC5), San Sebastian, pp. 27–35 (2009)

  11. Hajič, J.: Disambiguation of Rich Inflection: Computational Morphology of Czech. Karolinum, Prague (2004)

  12. Hajič, J., Hajičová, E., Panevová, J., Sgall, P., Cinková, S., Fučíková, E., Mikulová, M., Pajas, P., Popelka, J., Semecký, J., Šindlerová, J., Štěpánek, J., Toman, J., Urešová, Z., Žabokrtský, Z.: Prague Czech-English Dependency Treebank 2.0. LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague (2012). http://hdl.handle.net/11858/00-097C-0000-0015-8DAF-4

  13. Hajič, J., Hladká, B.: Probabilistic and rule-based tagger of an inflective language - a comparison. In: Proceedings of the 5th Conference on Applied Natural Language Processing, pp. 111–118. Association for Computational Linguistics, Washington, DC (1997)

  14. Hajič, J., Hlaváčová, J.: MorfFlex CZ. LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague (2013). http://hdl.handle.net/11858/00-097C-0000-0015-A780-9

  15. Hajič, J., Krbec, P., Oliva, K., Květoň, P., Petkevič, V.: Serial combination of rules and statistics: a case study in Czech tagging. In: Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics (ACL 2001), pp. 260–267. Association for Computational Linguistics, Toulouse (2001)

  16. Hajič, J., Panevová, J., Hajičová, E., Sgall, P., Pajas, P., Štěpánek, J., Havelka, J., Mikulová, M., Žabokrtský, Z., Ševčíková-Razímová, M., Urešová, Z.: Prague Dependency Treebank 2.0. Data/software. Linguistic Data Consortium, Philadelphia (2006)

  17. Hajič, J., Smrž, O., Zemánek, P., Pajas, P., Šnaidauf, J., Beška, E., Kracmar, J., Hassanová, K.: Prague Arabic Dependency Treebank 1.0. LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague (2009). http://hdl.handle.net/11858/00-097C-0000-0001-4872-3

  18. Hajič, J., Vidová Hladká, B.: Czech language processing - PoS tagging. In: Proceedings of the 1st International Conference on Language Resources and Evaluation (LREC 1998), pp. 931–936. ELRA, Granada (1998)

  19. Hajič, J., Vidová Hladká, B., Panevová, J., Hajičová, E., Sgall, P., Pajas, P.: Prague Dependency Treebank 1.0. Data/software. Linguistic Data Consortium, Philadelphia (2001)

  20. Hana, J., Feldman, A.: Resource-light approaches to computational morphology. Part 1: monolingual approaches. Lang. Linguist. Compass 6, 622–634 (2012)

  21. Hana, J., Zeman, D., Hajič, J., Hanová, H., Hladká, B., Jeřábek, E.: Manual for Morphological Annotation, Revision for the Prague Dependency Treebank 2.0. Technical report no. 2005/TR-2005-27, FAL MFF UK, Prague (2005)

  22. Hathout, N., Namer, F.: Démonette, a French derivational morpho-semantic network. Linguist. Issues Lang. Technol. 11, 125–168 (2014)

  23. Hladká, B.: Software Tools for Large Czech Corpora Annotation. Master thesis. MFF UK, Prague (1994)

  24. Hladká, B., Králík, J.: Proměny Českého akademického korpusu. Slovo a Slovesnost 67, 179–194 (2006)

  25. Komárek, M., Kořenský, J., Petr, J., Veselková, J., et al.: Mluvnice češtiny 2. Tvarosloví. Academia, Prague (1986)

  26. Konkol, M., Konopík, M.: Maximum entropy named entity recognition for Czech language. In: Habernal, I., Matoušek, V. (eds.) TSD 2011. LNCS, vol. 6836, pp. 203–210. Springer, Heidelberg (2011)

  27. Konkol, M., Konopík, M.: CRF-based Czech named entity recognizer and consolidation of Czech NER research. In: Habernal, I., Matoušek, V. (eds.) TSD 2013. LNCS, vol. 8082, pp. 153–160. Springer, Heidelberg (2013)

  28. Kravalová, J., Žabokrtský, Z.: Czech named entity corpus and SVM-based recognizer. In: Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009), pp. 194–201. Association for Computational Linguistics, Suntec (2009)

  29. Krbec, P.: Language Modelling for Speech Recognition of Czech. Ph.D. thesis. MFF UK, Prague (2005)

  30. Kuryłowicz, J.: Dérivation lexicale et dérivation syntaxique. Bull. de la Société de Linguistique de Paris 37, 79–92 (1936)

  31. Květoň, P.: Rule-based Morphological Disambiguation. Ph.D. thesis. MFF UK, Prague (2006)

  32. Marcus, M., Santorini, B., Marcinkiewicz, M.A.: Building A Large Annotated Corpus of English: The Penn Treebank. Technical reports (CIS), Paper 237 (1993). http://repository.upenn.edu/cis_reports/237/

  33. de Marneffe, M.-C., Dozat, T., Silveira, N., Haverinen, K., Ginter, F., Nivre, J., Manning, C.: Universal Stanford dependencies: a cross-linguistic typology. In: Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), pp. 4585–4592. ELRA, Reykjavík (2014)

  34. Mel’čuk, I.A.: Dependency Syntax: Theory and Practice. State University of New York Press, New York (1988)

  35. Mikulová, M., Bémová, A., Hajič, J., Hajičová, E., Havelka, J., Kolářová, V., Kučová, L., Lopatková, M., Pajas, P., Panevová, J., Razímová, M., Sgall, P., Štěpánek, J., Urešová, Z., Veselá, K., Žabokrtský, Z.: Annotation on the tectogrammatical level in the Prague Dependency Treebank. Annotation manual. Technical report no. 2006/30, ÚFAL MFF UK, Prague (2006)

  36. Oliva, K., Květoň, P., Ondruška, R.: The computational complexity of rule-based part-of-speech tagging. In: Matoušek, V., Mautner, P. (eds.) TSD 2003. LNCS (LNAI), vol. 2807, pp. 82–89. Springer, Heidelberg (2003)

  37. Oliva, K., Hnátková, M., Petkevič, V., Květoň, P.: The linguistic basis of a rule-based tagger of Czech. In: Sojka, P., Kopeček, I., Pala, K. (eds.) TSD 2000. LNCS (LNAI), vol. 1902, pp. 3–8. Springer, Heidelberg (2000)

  38. Pajas, P., Štěpánek, J., Sedlák, M.: PML Tree Query. LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague (2009). http://hdl.handle.net/11858/00-097C-0000-0022-C7F6-3

  39. Petkevič, V.: Reliable morphological disambiguation of Czech: rule-based approach is necessary. In: Šimková, M. (ed.) Insight into the Slovak and Czech Corpus Linguistics, pp. 26–44. Veda, Bratislava (2006)

  40. Petkevič, V.: Problémy automatické morfologické disambiguace češtiny. Naše řeč 97, 194–207 (2014)

  41. Petrov, S., Das, D., McDonald, R.: A universal part-of-speech tagset. In: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012), pp. 2089–2096. ELRA, Istanbul (2012)

  42. Razímová, M., Žabokrtský, Z.: Annotation of grammatemes in the Prague Dependency Treebank 2.0. In: Proceedings of the LREC Workshop on Annotation Science, pp. 12–19. ELRA, Genova (2006)

  43. Rosa, R., Mašek, J., Mareček, D., Popel, M., Zeman, D., Žabokrtský, Z.: HamleDT 2.0: thirty dependency treebanks stanfordized. In: Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), pp. 2334–2341. ELRA, Reykjavík (2014)

  44. Sedláček, R.: Morfologický analyzátor češtiny. Master thesis. FI MU, Brno (1999)

  45. Sedláček, R., Smrž, P.: A new Czech morphological analyser ajka. In: Matoušek, V., Mautner, P., Mouček, R., Tauser, K. (eds.) TSD 2001. LNCS (LNAI), vol. 2166, pp. 100–107. Springer, Heidelberg (2001)

  46. Sedlák, M.: Treex::Web. LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague (2014). http://hdl.handle.net/11858/00-097C-0000-0023-44AF-C

  47. Sekine, S.: Sekine’s Extended Named Entity Hierarchy (2003). http://nlp.cs.nyu.edu/ene/

  48. Sgall, P.: Generativní Popis Jazyka a Česká Deklinace. Academia, Prague (1967)

  49. Sgall, P., Hajičová, E., Panevová, J.: The Meaning of the Sentence in its Semantic and Pragmatic Aspects. Reidel Publishing Company, Dordrecht (1986)

  50. Spoustová, D.: Kombinované statisticko-pravidlové metody značkování češtiny. Ph.D. thesis. MFF UK, Prague (2007)

  51. Spoustová, D., Hajič, J., Raab, J., Spousta, M.: Semi-supervised training for the averaged perceptron POS tagger. In: Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009), pp. 763–771. Association for Computational Linguistics, Athens (2009)

  52. Spoustová, D., Hajič, J., Votrubec, J., Krbec, P., Květoň, P.: The best of two worlds: cooperation of statistical and rule-based taggers for Czech. In: Proceedings of the Workshop on Balto-Slavonic Natural Language Processing 2007, pp. 67–74. Association for Computational Linguistics, Prague (2007)

  53. Straka, M., Straková, J.: MorphoDiTa: Morphological Dictionary and Tagger. LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague (2014). http://hdl.handle.net/11858/00-097C-0000-0023-43CD-0

  54. Straka, M., Straková, J.: NameTag. LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague (2014). http://hdl.handle.net/11858/00-097C-0000-0023-43CE-E

  55. Straková, J., Straka, M., Hajič, J.: A new state-of-the-art Czech named entity recognizer. In: Habernal, I. (ed.) TSD 2013. LNCS, vol. 8082, pp. 68–75. Springer, Heidelberg (2013)

  56. Straková, J., Straka, M., Hajič, J.: Open-source tools for morphology, lemmatization, POS tagging and named entity recognition. In: Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014): System Demonstrations, pp. 13–18. Association for Computational Linguistics, Baltimore (2014)

  57. Straková, J., Straka, M., Ševčíková, M., Žabokrtský, Z.: Czech Named Entity Corpus. In: Ide, N., Pustejovsky, J. (eds.) Handbook of Linguistic Annotation. Springer, Heidelberg (in press)

  58. Ševčíková, M., Panevová, J., Smejkalová, L.: Specificity of the number of nouns in Czech and its annotation in Prague Dependency Treebank. Prague Bull. Math. Linguist. 96, 27–47 (2011)

  59. Ševčíková, M., Žabokrtský, Z.: Word-formation network for Czech. In: Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), pp. 1087–1093. ELRA, Reykjavík (2014)

  60. Ševčíková, M., Žabokrtský, Z., Krůza, O.: Named entities in Czech: annotating data and developing NE tagger. In: Matoušek, V., Mautner, P. (eds.) TSD 2007. LNCS (LNAI), vol. 4629, pp. 188–195. Springer, Heidelberg (2007)

  61. Ševčíková, M., Žabokrtský, Z., Straková, J., Straka, M.: Czech Named Entity Corpus 1.1. LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague (2014). http://hdl.handle.net/11858/00-097C-0000-0023-1B04-C

  62. Ševčíková Razímová, M., Žabokrtský, Z.: Systematic parameterized description of pro-forms in the Prague Dependency Treebank 2.0. In: Proceedings of the Fifth International Workshop on Treebanks and Linguistic Theories (TLT 2006), pp. 175–186. Institute of Formal and Applied Linguistics, Prague (2006)

  63. Šnajder, J.: DerivBase.Hr: a high-coverage derivational morphology resource for Croatian. In: Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), pp. 3371–3377. ELRA, Reykjavík (2014)

  64. Štěpánek, J.: Post-annotation checking of Prague Dependency Treebank 2.0 data. In: Sojka, P., Kopeček, I., Pala, K. (eds.) TSD 2006. LNCS (LNAI), vol. 4188, pp. 277–284. Springer, Heidelberg (2006)

  65. Štěpánek, J.: Závislostní zachycení větné struktury v anotovaném syntaktickém korpusu (nástroje pro zajištění konzistence dat). Ph.D. thesis. MFF UK, Prague (2006)

  66. Vidová Hladká, B., Hajič, J., Hana, J., Hlaváčová, J., Mírovský, J., Raab, J.: Czech Academic Corpus 2.0. Data/software. Linguistic Data Consortium, Philadelphia (2008)

  67. Vidová Hladká, B., Hana, J., Hajič, J., Hlaváčová, J., Mírovský, J., Votrubec, J.: Czech Academic Corpus 1.0. Data/software. Karolinum, Prague (2007)

  68. Votrubec, J.: Volba vhodné sady rysů pro morfologické značkování češtiny. Master thesis. MFF UK, Prague (2005)

  69. Zeller, B., Šnajder, J., Padó, S.: DerivBase: inducing and evaluating a derivational morphology resource for German. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013), pp. 1201–1211. Association for Computational Linguistics, Sofia (2013)

  70. Zeman, D.: Lingua: Interset 2.026. LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague (2014). http://hdl.handle.net/11234/1-1465

  71. Zeman, D.: Reusable tagset conversion using tagset drivers. In: Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008), pp. 213–218. ELRA, Marrakech (2008)

  72. Zeman, D., Mareček, D., Mašek, J., Popel, M., Ramasamy, L., Rosa, R., Štěpánek, J., Žabokrtský, Z.: HamleDT 2.0. LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague (2014). http://hdl.handle.net/11858/00-097C-0000-0023-9551-4

  73. Zeman, D., Mareček, D., Popel, M., Ramasamy, L., Štěpánek, J., Žabokrtský, Z., Hajič, J.: HamleDT: to parse or not to parse? In: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012), pp. 2735–2741. ELRA, Istanbul (2012)

  74. Žabokrtský, Z.: Resemblances between meaning-text theory and functional generative description. In: Proceedings of the 2nd International Conference of Meaning-Text Theory, pp. 549–557. Slavic Culture Languages Publishers House, Moskva (2005)

  75. Žabokrtský, Z., Ptáček, J., Pajas, P.: TectoMT: highly modular MT system with tectogrammatics used as transfer layer. In: Proceedings of the Third ACL Workshop on Statistical Machine Translation, pp. 167–170. Association for Computational Linguistics, Columbus (2008)

Acknowledgements

The research reported on in the paper has been supported by the LINDAT-Clarin project of the Ministry of Education of the Czech Republic (LM2010013).

Author information

Corresponding author

Correspondence to Magda Ševčíková.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Ševčíková, M. (2015). Morphology Within the Multi-layered Annotation Scenario of the Prague Dependency Treebank. In: Mahlow, C., Piotrowski, M. (eds) Systems and Frameworks for Computational Morphology. SFCM 2015. Communications in Computer and Information Science, vol 537. Springer, Cham. https://doi.org/10.1007/978-3-319-23980-4_1

  • DOI: https://doi.org/10.1007/978-3-319-23980-4_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23978-1

  • Online ISBN: 978-3-319-23980-4

  • eBook Packages: Computer Science (R0)
