Construction Grammar Based Annotation Framework for Parsing Tamil

Muralidaran, Vigneshwaran; Misra Sharma, Dipti

doi:10.1007/978-3-319-75477-2_27

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9623)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

Syntactic parsing in NLP is the task of working out the grammatical structure of sentences. Purely formal approaches to parsing, such as phrase structure grammar and dependency grammar, have been successfully employed for a variety of languages. While phrase-structure-based constituent analysis is possible for fixed word order languages such as English, dependency analysis between grammatical units has been suitable for many free word order languages. These approaches rely on identifying linguistic units by their formal syntactic properties and establishing the relationships between such units in the form of a tree. Instead, we characterize every morphosyntactic unit as a mapping between form and function, along the lines of Construction Grammar, and treat parsing as the identification of dependency relations between such conceptual units. Our approach to parser annotation yields an average MaltParser labeled attachment score (LAS) of 82.21% on a gold-annotated Tamil corpus of 935 sentences in a five-fold validation experiment.

Notes

1. The Hindi, Telugu and Bangla accuracies reported were obtained with the Computational Paninian Grammar framework [7]. The Tamil accuracy reported is based on the Universal Treebank results of Straka et al. [15].
2. Indian Language Machine Translation Project funded by DIT, Government of India.
3. The gold annotation was carried out by AU-KBC Research Centre, Chennai.
4. http://www.maltparser.org/download.html.

References

1. Goldberg, A.E.: Construction Grammar. Wiley Online Library (2002)
2. Fried, M., Östman, J.O.: Construction grammar. In: Construction Grammar in a Cross-Language Perspective (2011)
3. Langacker, R.W.: Cognitive Grammar: A Basic Introduction. Oxford University Press, Oxford (2008)
4. Shieber, S.M.: Evidence against the context-freeness of natural language. In: Savitch, W.J., Bach, E., Marsh, W., Safran-Naveh, G. (eds.) The Formal Complexity of Natural Language. Studies in Linguistics and Philosophy, vol. 33, pp. 320–334. Springer, Heidelberg (1985). https://doi.org/10.1007/978-94-009-3401-6_12
5. Melčuk, I.A.: Dependency Syntax: Theory and Practice. SUNY Press, Albany (1988)
6. Bharati, A., Chaitanya, V., Sangal, R., Ramakrishnamacharyulu, K.: Natural Language Processing: A Paninian Perspective. Prentice-Hall of India, New Delhi (1995)
7. Bharati, A., Sangal, R.: Parsing free word order languages in the Paninian framework. In: Proceedings of the 31st Annual Meeting on Association for Computational Linguistics, pp. 105–111. Association for Computational Linguistics (1993)
8. Bharati, A., Gupta, M., Yadav, V., Gali, K., Sharma, D.M.: Simple parser for Indian languages in a dependency framework. In: Proceedings of the Third Linguistic Annotation Workshop, pp. 162–165. Association for Computational Linguistics (2009)
9. Mannem, P.: Bidirectional dependency parser for Hindi, Telugu and Bangla. In: Proceedings of NLP Tools Contest: Indian Language Dependency Parsing, ICON 2009, India (2009)
10. Nivre, J.: Parsing Indian languages with MaltParser. In: Proceedings of the NLP Tools Contest: Indian Language Dependency Parsing, ICON 2009, pp. 12–18 (2009)
11. Ambati, B.R., Gadde, P., Jindal, K.: Experiments in Indian language dependency parsing. In: Proceedings of the NLP Tools Contest: Indian Language Dependency Parsing, ICON 2009, pp. 32–37 (2009)
12. Antony, P., Warrier, N.J., Soman, K.: Penn treebank-based syntactic parsers for South Dravidian languages using a machine learning approach. Int. J. Comput. Appl. 7, 14–21 (2010)
13. Selvam, M., Natarajan, A., Thangarajan, R.: Structural parsing of natural language text in Tamil using phrase structure hybrid language model. Int. J. Comput. Inf. Syst. Sci. Eng. 2008, 2–4 (2008)
14. Ramasamy, L., Žabokrtský, Z.: Tamil dependency parsing: results using rule based and corpus based approaches. In: Gelbukh, A.F. (ed.) CICLing 2011. LNCS, vol. 6608, pp. 82–95. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-19400-9_7
15. Straka, M., Hajic, J., Straková, J., Hajic Jr., J.: Parsing universal dependency treebanks using neural networks and search-based oracle. In: International Workshop on Treebanks and Linguistic Theories (TLT 2014), p. 208 (2014)
16. Kumari, B.V.S., Rao, R.R.: Hindi dependency parsing using a combined model of Malt and MST. In: 24th International Conference on Computational Linguistics, p. 171. Citeseer (2012)
17. Kesidi, S.R., Kosaraju, P., Vijay, M., Husain, S.: A constraint based hybrid dependency parser for Telugu. Int. J. Comput. Linguist. Appl. 2, 53 (2011)
18. Seddah, D., Tsarfaty, R., Kübler, S., Candito, M., Choi, J., Farkas, R., Foster, J., Goenaga, I., Gojenola, K., Goldberg, Y., et al.: Overview of the SPMRL 2013 shared task: cross-framework evaluation of parsing morphologically rich languages. Association for Computational Linguistics (2013)
19. Amritavalli, R., Jayaseelan, K.: Finiteness and negation in Dravidian. In: The Oxford Handbook of Comparative Syntax, pp. 178–220 (2005)
20. Amritavalli, R.: Separating tense and finiteness: anchoring in Dravidian. Nat. Lang. Linguist. Theory 32, 283–306 (2014)
21. McFadden, T., Sundaresan, S.: Finiteness in south Asian languages: an introduction. Nat. Lang. Linguist. Theory 32, 1–27 (2014)
22. Jayaseelan, K.A.: The serial verb construction in Malayalam. In: Dayal, V., Mahajan, A. (eds.) Clause Structure in South Asian Languages. Studies in Natural Language and Linguistic Theory, vol. 61, pp. 67–91. Springer, Heidelberg (2004). https://doi.org/10.1007/978-1-4020-2719-2_3
23. Jayaseelan, K.: Coordination, relativization and finiteness in Dravidian. Nat. Lang. Linguist. Theory 32, 191–211 (2014)
24. Herring, S.C.: Aspect as a discourse category in Tamil. In: Annual Meeting of the Berkeley Linguistics Society, vol. 14 (2011)
25. Karmakar, S., Kasturirangan, R.: Cognitive processes underlying the meaning of complex predicates and serial verbs from the perspective of individuating and ordering situations in bānlā. In: Proceedings of the First International Conference on Intelligent Interactive Technologies and Multimedia, pp. 81–87. ACM (2010)
26. Bharati, A., Husain, D.S.S., Bai, L., Begam, R., Sangal, R.: Anncorra: Treebanks for Indian languages, guidelines for annotating Hindi treebank (version-2.0) (2009)
27. Ambati, B.R., Husain, S., Nivre, J., Sangal, R.: On the role of morphosyntactic features in Hindi dependency parsing. In: Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages, pp. 94–102. Association for Computational Linguistics (2010)
28. Szabolcsi, A.: What do quantifier particles do? Linguist. Philos. 38, 159–204 (2015)
29. Bharati, A., Sangal, R., Sharma, D.M., Bai, L.: Anncorra: Annotating corpora guidelines for POS and chunk annotation for Indian languages. LTRC-TR31 (2006)

Author information

Authors and Affiliations

International Institute of Information Technology, Gachibowli, Hyderabad, 500032, Telangana, India
Vigneshwaran Muralidaran & Dipti Misra Sharma

Corresponding author

Correspondence to Vigneshwaran Muralidaran.

Editor information

Editors and Affiliations

CIC, Instituto Politécnico Nacional, Mexico City, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Muralidaran, V., Misra Sharma, D. (2018). Construction Grammar Based Annotation Framework for Parsing Tamil. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9623. Springer, Cham. https://doi.org/10.1007/978-3-319-75477-2_27

DOI: https://doi.org/10.1007/978-3-319-75477-2_27
Published: 21 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75476-5
Online ISBN: 978-3-319-75477-2
eBook Packages: Computer Science, Computer Science (R0)
