Abstract
Syntactic parsing is the process of taking an input sentence and producing an appropriate syntactic structure for it. It is a crucial stage because it provides a bridge from core NLP tasks to the semantic layer, and it has been shown to improve the performance of many higher-level NLP applications such as machine translation, sentiment analysis, and question answering. Statistical dependency parsing, with its high coverage and easy-to-use output, has become very popular in recent years for many languages, including Turkish. In this chapter, we describe the issues in developing and evaluating a dependency parser for Turkish, which poses interesting challenges due to its agglutinative morphology and free constituent order. Our approach adapts a language-independent, data-driven statistical parsing system to Turkish.
Notes
1. Please note that arrows in these representations point from dependents to heads and that we do not include punctuation in dependency relations.
2. We do not, however, necessarily suggest that the morphological sub-lexical representation that we use for Turkish later in this paper is applicable to these languages.
3. In Turkish, such sentences are called “inverted sentences” and are mostly used in spoken language but rarely in written form.
4. +A3sg: third person singular agreement; +P2pl: second person plural possessive agreement; +Loc: locative case.
5. Bozşahin (2002) uses morphemes as sub-lexical constituents in a CCG framework. Since the lexicon was organized in terms of morphemes, each with its own CCG functor, the grammar had to account for both the morphotactics and the syntax at the same time.
6. Experiments have also been performed using memory-based learning (Daelemans and van den Bosch 2005), but they were found to give lower parsing accuracy.
7. A recent study by Sulubacak and Eryiğit (2013) extends this representation and assigns different lemma and surface form information for each IG.
8. The token indexes within the actual token sequence are represented by their positions relative to the stack and queue elements. In this representation, σ0 + 1 refers directly to the right neighbor of σ0 within the actual sequence. Similarly, σ0 − 1 refers to the left neighbor.
9. Actually, there are two parsers (Bick and Attardi in Table 7.2) in this group that try to use parts of the inflectional features under special circumstances.
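The addressing scheme described in note 8 can be illustrated with a minimal sketch. The function name `resolve`, its parameters, and the placeholder tokens are hypothetical and are not taken from the parsing system used in the chapter; the sketch only assumes that the stack stores positions in the original token sequence, with the top of the stack as σ0.

```python
from typing import List, Optional


def resolve(tokens: List[str], stack: List[int],
            offset: int = 0, depth: int = 0) -> Optional[str]:
    """Return the token addressed by sigma_depth + offset, or None.

    `stack` holds positions in `tokens` (an assumption of this sketch);
    stack[-1] is sigma_0. The offset is applied in the original token
    sequence, not within the stack.
    """
    if depth >= len(stack):
        return None
    position = stack[-1 - depth] + offset
    if 0 <= position < len(tokens):
        return tokens[position]
    return None


# Placeholder tokens; positions 1 and 2 are no longer on the stack.
tokens = ["t0", "t1", "t2", "t3", "t4"]
stack = [0, 3]

print(resolve(tokens, stack))             # sigma_0
print(resolve(tokens, stack, offset=1))   # sigma_0 + 1: right neighbor in the sequence
print(resolve(tokens, stack, offset=-1))  # sigma_0 - 1: left neighbor in the sequence
print(resolve(tokens, stack, depth=1))    # sigma_1: next element on the stack
```

In this example σ0 − 1 and σ1 address different tokens, which is the distinction the note draws between neighbors in the actual sequence and neighbors on the stack.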
References
Ambati BR, Reddy S, Kilgarriff A (2012) Word sketches for Turkish. In: Proceedings of LREC, Istanbul, pp 2945–2950
Arısoy E, Saraçlar M, Roark B, Shafran I (2012) Discriminative language modeling with linguistic and statistically derived features. IEEE Trans Audio Speech Lang Process 20(2):540–550
Attardi G (2006) Experiments with a multilanguage non-projective dependency parser. In: Proceedings of CONLL, New York, NY, pp 166–170
Bick E (2006) Lingpars, a linguistically inspired, language-independent machine learner for dependency treebanks. In: Proceedings of CONLL, New York, NY, pp 171–175
Black E, Jelinek F, Lafferty JD, Magerman DM, Mercer RL, Roukos S (1992) Towards history-based grammars: using richer models for probabilistic parsing. In: Proceedings of the DARPA speech and natural language workshop, New York, NY, pp 31–37
Bozşahin C (2002) The combinatory morphemic lexicon. Comput Linguist 28(2):145–186
Buchholz S, Marsi E (2006) CoNLL-X shared task on multilingual dependency parsing. In: Proceedings of CONLL, New York, NY, pp 149–164
Canisius S, Bogers T, van den Bosch A, Geertzen J, Sang ETK (2006) Dependency parsing by inference over high-recall dependency predictions. In: Proceedings of CONLL, New York, NY, pp 176–180
Carreras X, Surdeanu M, Marquez L (2006) Projective dependency parsing with perceptron. In: Proceedings of CONLL, New York, NY, pp 181–185
Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2(3):1–27
Chang MW, Do Q, Roth D (2006) A pipeline model for bottom-up dependency parsing. In: Proceedings of CONLL, New York, NY, pp 186–190
Cheng Y, Asahara M, Matsumoto Y (2006) Multi-lingual dependency parsing at NAIST. In: Proceedings of CONLL, New York, NY, pp 191–195
Chung H, Rim HC (2004) Unlexicalized dependency parser for variable word order languages based on local contextual pattern. In: Proceedings of CICLING, Seoul, pp 109–120
Collins M (1996) A new statistical parser based on bigram lexical dependencies. In: Proceedings of ACL, Santa Cruz, CA, pp 184–191
Collins M (1999) Head-driven statistical models for natural language parsing. PhD thesis, University of Pennsylvania, Philadelphia, PA
Corston-Oliver S, Aue A (2006) Dependency parsing with reference to Slovene, Spanish and Swedish. In: Proceedings of CONLL, New York, NY, pp 196–200
Daelemans W, van den Bosch A (2005) Memory-based language processing. Cambridge University Press, Cambridge
Derici C, Çelik K, Özgür A, Güngör T, Kutbay E, Aydın Y, Kartal G (2014) Rule-based focus extraction in Turkish question answering systems. In: Proceedings of IEEE signal processing and communications applications conference, Trabzon, pp 1604–1607
Dreyer M, Smith DA, Smith NA (2006) Vine parsing and minimum risk reranking for speed and precision. In: Proceedings of CONLL, New York, NY, pp 201–205
Eisner J (1996) Three new probabilistic models for dependency parsing: an exploration. In: Proceedings of COLING, Copenhagen, pp 340–345
Erguvanlı EE (1979) The function of word order in Turkish grammar. PhD thesis, UCLA, Los Angeles, CA
Eryiğit G (2007) Dependency parsing of Turkish. PhD thesis, Istanbul Technical University, Istanbul
Eryiğit G (2014) ITU Turkish NLP web service. In: Proceedings of EACL, Gothenburg, pp 1–4
Eryiğit G, Oflazer K (2006) Statistical dependency parsing of Turkish. In: Proceedings of EACL, Trento, pp 89–96
Eryiğit G, Nivre J, Oflazer K (2006) The incremental use of morphological information and lexicalization in data-driven dependency parsing. In: Proceedings of the international conference on the computer processing of oriental languages, Singapore, pp 498–507
Eryiğit G, Nivre J, Oflazer K (2008) Dependency parsing of Turkish. Comput Linguist 34(3):357–389
Eryiğit G, Çetin FS, Yanık M, Temel T, Çiçekli İ (2013) Turksent: a sentiment annotation tool for social media. In: Proceedings of the linguistic annotation workshop and interoperability with discourse, Sofia, pp 131–134
Goldberg Y, Nivre J (2012) A dynamic oracle for arc-eager dependency parsing. In: Proceedings of COLING, Mumbai, pp 959–976
Hadımlı K, Yöndem MT (2011) Information retrieval from Turkish radiology reports without medical knowledge. In: Proceedings of the international conference on flexible query answering systems, Ghent, pp 210–220
Hadımlı K, Yöndem MT (2012) Two alternate methods for information retrieval from Turkish radiology reports. In: Proceedings of ISCIS, London, pp 527–532
Hakkani-Tür DZ, Oflazer K, Tür G (2002) Statistical morphological disambiguation for agglutinative languages. Comput Hum 36(4):381–410
Hoffman B (1994) Generating context appropriate word orders in Turkish. In: Proceedings of the international workshop on natural language generation, Kennebunkport, ME, pp 117–126
Hudson RA (1990) English word grammar, vol 108. Basil Blackwell, Oxford
Johansson R, Nugues P (2006) Investigating multilingual dependency parsing. In: Proceedings of CONLL, New York, NY, pp 206–210
Koo T, Collins M (2010) Efficient third-order dependency parsers. In: Proceedings of ACL, Uppsala, pp 1–11
Koo T, Rush AM, Collins M, Jaakkola T, Sontag D (2010) Dual decomposition for parsing with non-projective head automata. In: Proceedings of EMNLP, Cambridge, MA, pp 1288–1298
Kudo T, Matsumoto Y (2002) Japanese dependency analysis using cascaded chunking. In: Proceedings of CONLL, Taipei, pp 63–69
Liu T, Ma J, Zhu H, Li S (2006) Dependency parsing based on dynamic local optimization. In: Proceedings of CONLL, New York, NY, pp 211–215
Magerman DM (1995) Statistical decision-tree models for parsing. In: Proceedings of ACL, Cambridge, MA, pp 276–283
Marcus M (1980) A theory of syntactic recognition for natural language. MIT Press, Cambridge, MA
Martins AF, Smith NA, Xing EP (2009) Concise integer linear programming formulations for dependency parsing. In: Proceedings of the ACL-IJCNLP, Singapore, pp 342–350
McDonald R, Pereira F (2006) Online learning of approximate dependency parsing algorithms. In: Proceedings of EACL, Trento, pp 81–88
McDonald R, Crammer K, Pereira F (2005) Online large-margin training of dependency parsers. In: Proceedings of ACL, Ann Arbor, MI, pp 91–98
McDonald R, Lerman K, Pereira F (2006) Multilingual dependency analysis with a two-stage discriminative parser. In: Proceedings of CONLL, New York, NY, pp 216–220
Megyesi B, Dahlqvist B (2007) The Swedish-Turkish parallel corpus and tools for its creation. In: Proceedings of the nordic conference on computational linguistics, Tartu, pp 136–143
Megyesi B, Dahlqvist B, Pettersson E, Nivre J (2008) Swedish-Turkish parallel treebank. In: Proceedings of LREC, Marrakesh, pp 470–473
Mel’čuk IA (1988) Dependency syntax: theory and practice. SUNY Press, Albany, NY
Meral HM, Sankur B, Özsoy AS, Güngör T, Sevinç E (2009) Natural language watermarking via morphosyntactic alterations. Comput Speech Lang 23(1):107–125
Nivre J (2003) An efficient algorithm for projective dependency parsing. In: Proceedings of IWPT, Nancy, pp 149–160
Nivre J (2004) Incrementality in deterministic dependency parsing. In: Proceedings of the workshop on incremental parsing: bringing engineering and cognition together, Barcelona, pp 50–57
Nivre J (2006) Inductive dependency parsing. Springer, Dordrecht
Nivre J, Scholz M (2004) Deterministic dependency parsing of English text. In: Proceedings of COLING, Geneva, pp 64–70
Nivre J, Hall J, Nilsson J (2004) Memory-based dependency parsing. In: Proceedings of CONLL, Boston, MA, pp 49–56
Nivre J, Hall J, Nilsson J, Eryiğit G, Marinov S (2006) Labeled pseudo-projective dependency parsing with support vector machines. In: Proceedings of CONLL, New York, NY, pp 221–225
Nivre J, Hall J, Nilsson J, Chanev A, Eryiğit G, Kübler S, Marinov S, Marsi E (2007) MaltParser: a language-independent system for data-driven dependency parsing. Nat Lang Eng 13(2):95–135
Oflazer K (2003) Dependency parsing with an extended finite-state approach. Comput Linguist 29(4):515–544
Oflazer K (2014) Turkish and its challenges for language processing. Lang Resour Eval 48(4):639–653
Oflazer K, Say B, Hakkani-Tür DZ, Tür G (2003) Building a Turkish treebank. In: Treebanks: building and using parsed corpora. Kluwer Academic Publishers, Berlin
Ratnaparkhi A (1997) A linear observed time statistical parser based on maximum entropy models. In: Proceedings of EMNLP, Providence, RI, pp 1–10
Riedel S, Çakıcı R, Meza-Ruiz I (2006) Multi-lingual dependency parsing with incremental integer linear programming. In: Proceedings of CONLL, New York, NY, pp 226–230
Sagae K, Lavie A (2005) A classifier-based parser with linear run-time complexity. In: Proceedings of IWPT, Vancouver, pp 125–132
Saygın AP (2010) A computational analysis of interaction patterns in the acquisition of Turkish. Res Lang Comput 8(4):239–253
Schiehlen M, Spranger K (2006) Language independent probabilistic context-free parsing bolstered by machine learning. In: Proceedings of CONLL, New York, NY, pp 231–235
Sekine S, Uchimoto K, Isahara H (2000) Backward beam search algorithm for dependency analysis of Japanese. In: Proceedings of COLING, Saarbrücken, pp 754–760
Sgall P, Hajičová E, Panevová J (1986) The meaning of the sentence in its semantic and pragmatic aspects. Springer, Dordrecht
Shieber SM (1983) Sentence disambiguation by a shift-reduce parsing technique. In: Proceedings of ACL, Cambridge, MA, pp 113–118
Shimizu N (2006) Maximum spanning tree algorithm for non-projective labeled dependency parsing. In: Proceedings of CONLL, New York, NY, pp 236–240
Sulubacak U, Eryiğit G (2013) Representation of morphosyntactic units and coordination structures in the Turkish dependency treebank. In: Proceedings of the workshop on statistical parsing of morphologically rich languages, Seattle, WA, pp 129–134
Tesnière L (1959) Éléments de syntaxe structurale. Librairie C. Klincksieck, Paris
Vapnik VN (1995) The nature of statistical learning theory. Springer, New York, NY
Wu YC, Lee YS, Yang JC (2006) The exploration of deterministic and efficient dependency parsing. In: Proceedings of CONLL, New York, NY, pp 241–245
Yamada H, Matsumoto Y (2003) Statistical dependency analysis with support vector machines. In: Proceedings of IWPT, Nancy, pp 195–206
Yıldırım E, Çetin FS, Eryiğit G, Temel T (2014) The impact of NLP on Turkish sentiment analysis. In: Proceedings of the international conference on Turkic language processing, Istanbul
Yuret D (2006) Dependency parsing as a classification problem. In: Proceedings of CONLL, New York, NY, pp 246–250
Zhang Y, Clark S (2008) A tale of two parsers: investigating and combining graph-based and transition-based dependency parsing using beam-search. In: Proceedings of EMNLP, Honolulu, HI, pp 562–571
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Eryiğit, G., Nivre, J., Oflazer, K. (2018). Dependency Parsing of Turkish. In: Oflazer, K., Saraçlar, M. (eds) Turkish Natural Language Processing. Theory and Applications of Natural Language Processing. Springer, Cham. https://doi.org/10.1007/978-3-319-90165-7_7
DOI: https://doi.org/10.1007/978-3-319-90165-7_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-90163-3
Online ISBN: 978-3-319-90165-7
eBook Packages: Computer Science (R0)