An Unsupervised Approach for Linking Automatically Extracted and Manually Crafted LTAGs

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2011)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6608)

Abstract

Although the lack of semantic representation in automatically extracted LTAGs is an obstacle to using this formalism, these grammars have received more attention than before thanks to the advent of powerful statistical parsers trained on them. In contrast to this class of grammars, there are widely used manually crafted LTAGs that are enriched with semantic representation but lack efficient parsers. The semantic representation available in the latter grammars, together with the statistical capabilities of the former, encouraged us to construct a link between them.

Here, focusing on the automatically extracted LTAG used by MICA [4] and the manually crafted English LTAG, namely the XTAG grammar [32], we propose a statistical approach based on an HMM that maps each sequence of elementary trees of the former onto a sequence of elementary trees of the latter. To keep the HMM training algorithm from converging to a local optimum, an EM-based learning process for initializing the HMM parameters is also proposed. Experimental results show that the mapping method provides a satisfactory way to cover the deficiencies that arise in one grammar with the available capabilities of the other.
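
To make the mapping idea concrete, the following is a minimal sketch of Viterbi decoding over a toy HMM. It is not the authors' implementation. It assumes, for illustration only, that XTAG elementary trees act as hidden states and MICA elementary trees as observations; the tree labels, the probability tables, and the function names (to_log, viterbi) are all hypothetical, and the parameters are set by hand rather than learned with the EM-based initialization and HMM training described above.

```python
import math


def to_log(table):
    """Convert a (possibly nested) dict of probabilities to log-probabilities."""
    return {k: to_log(v) if isinstance(v, dict) else math.log(v)
            for k, v in table.items()}


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most probable hidden-state sequence for the observations."""
    # best[t][s] is the log-probability of the best path ending in s at step t.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t][s] is the predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical toy model: hidden states are XTAG-style tree names,
# observations are MICA-style tree identifiers. All numbers are invented.
STATES = ["betaDnx", "alphaNXN", "alphanx0Vnx1"]
START = to_log({"betaDnx": 0.6, "alphaNXN": 0.3, "alphanx0Vnx1": 0.1})
TRANS = to_log({
    "betaDnx": {"betaDnx": 0.1, "alphaNXN": 0.8, "alphanx0Vnx1": 0.1},
    "alphaNXN": {"betaDnx": 0.2, "alphaNXN": 0.2, "alphanx0Vnx1": 0.6},
    "alphanx0Vnx1": {"betaDnx": 0.5, "alphaNXN": 0.4, "alphanx0Vnx1": 0.1},
})
EMIT = to_log({
    "betaDnx": {"t1": 0.8, "t3": 0.1, "t27": 0.1},
    "alphaNXN": {"t1": 0.1, "t3": 0.8, "t27": 0.1},
    "alphanx0Vnx1": {"t1": 0.1, "t3": 0.1, "t27": 0.8},
})

mica_sequence = ["t1", "t3", "t27", "t1", "t3"]
decoded = viterbi(mica_sequence, STATES, START, TRANS, EMIT)
print(list(zip(mica_sequence, decoded)))
```

Working in log space avoids numerical underflow on longer sentences. In a realistic setting the three tables would be estimated from data, which is where the initialization step matters, since EM-style training only guarantees convergence to a local optimum.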

References

  1. Baldi, P., Chauvin, Y.: Smooth On-Line Learning Algorithms for Hidden Markov Models. Neural Computation 6(2), 307–318 (1994)

  2. Bangalore, S., Joshi, A.: Supertagging: An approach to almost parsing. Computational Linguistics 25(2), 237–266 (1999)

  3. Bangalore, S., Haffner, P., Emami, G.: Factoring global inference by enriching local representations. Technical report, AT&T Labs – Research (2005)

  4. Bangalore, S., Boullier, P., Nasr, A., Rambow, O., Sagot, B.: MICA: A probabilistic dependency parser based on Tree Insertion Grammar. In: North American Chapter of the Association for Computational Linguistics, NAACL (2009)

  5. Baum, L.E.: An inequality and associated maximization technique in statistical estimation for probabilistic functions of Markov processes. In: Shisha, O. (ed.) Inequalities III: Proceedings of the Third Symposium on Inequalities, University of California, Los Angeles, pp. 1–8. Academic Press, London (1972)

  6. Boullier, P., Deschamp, P.: Le système SYNTAX™ – manuel d’utilisation et de mise en oeuvre sous UNIX™ (1972), http://syntax.gforge.inria.fr/syntax3.8-manual.pdf

  7. Brill, E.: Transformation-based error-driven learning and natural language processing: A case study in part-of-speech tagging. Computational Linguistics 21(4), 543–566 (1995)

  8. Chen, J., Bangalore, S., Vijay-Shanker, K.: New models for improving supertag disambiguation. In: Proc. EACL 1999, Bergen, pp. 188–195 (1999)

  9. Chen, J.: Towards Efficient Statistical Parsing Using Lexicalized Grammatical Information. Ph.D. thesis, University of Delaware (2001)

  10. Dang, H., Kipper, K., Palmer, M.: Integrating Compositional Semantics into a Verb Lexicon. In: Proceedings of the Eighteenth International Conference on Computational Linguistics, COLING 2000 (2000)

  11. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society 39(1), 1–38 (1977)

  12. Faili, H., Ghassem-Sani, G.: An application of Lexicalized Grammars in English-Persian Translation. In: Proceedings of 16th European Conference on Artificial Intelligence (ECAI), pp. 596–600 (2004)

  13. Faili, H.: From partial toward full parsing. In: Recent Advances in Natural Language Processing (RANLP), pp. 71–75 (2009)

  14. Faili, H., Basirat, A.: Augmenting the Automated Extracted Tree Adjoining Grammars by Semantic Representation. In: The 6th IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE 2010), pp. 584–590 (2010)

  15. Habash, N., Rambow, O.: Extracting a Tree Adjoining Grammar from the Penn Arabic Treebank. In: Proceedings of Traitement Automatique du Langage Naturel (TALN 2004), Fez, Morocco (2004)

  16. Hemphill, C.T., Godfrey, J.J., Doddington, G.R.: The ATIS spoken language systems pilot corpus. In: DARPA Speech and Natural Language Workshop, Hidden Valley (1990)

  17. Joshi, A.K., Levy, L., Takahashi, M.: Tree Adjunct Grammars. Journal of Computer and System Sciences 10(1), 136–163 (1975)

  18. Joshi, A.K.: How much context-sensitivity is necessary for characterizing structural descriptions? In: Natural Language Processing: Theoretical, Computational, and Psychological Perspectives, pp. 206–250. Cambridge University Press, New York (1985)

  19. Kipper, K., Dang, H., Palmer, M.: Class-based Construction of Verb Lexicon. In: Proceedings of Seventeenth National Conference on Artificial Intelligence, AAAI 2000 (2000)

  20. Makino, T., Yoshida, M., Torisawa, K., Tsujii, J.: LiLFeS – Toward a Practical HPSG Parser. In: Proceedings of COLING-ACL 1998 (1998)

  21. Marcus, M., Santorini, B., Marcinkiewicz, M.: Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics 19(2) (1993)

  22. Murphy, K.: A Hidden Markov Model (HMM) Toolbox for Matlab (1998), http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html

  23. Neumann, G.: Automatic extraction of stochastic lexicalized tree grammars from tree banks. In: Fourth International Workshop on TAG and Related Frameworks, TAG+4 (1998)

  24. Park, J.: Extraction of tree adjoining grammar from a tree bank for Korean. In: Proceedings of the COLING/ACL 2006 Student Research Workshop, pp. 73–78 (2006)

  25. Rabiner, L.R.: A tutorial on Hidden Markov Models and selected applications in speech recognition. Proceedings of the IEEE 77(2), 257–286 (1989)

  26. Ryant, N., Kipper, K.: Assigning XTAG trees to VerbNet. In: Seventh International Workshop on Tree Adjoining Grammar and Related Formalisms (TAG+7), pp. 194–198 (May 2004)

  27. Schabes, Y., Waters, R.C.: Tree Insertion Grammar. Computational Linguistics 21(4) (1995)

  28. Shen, L., Joshi, A.K.: Incremental LTAG parsing. In: HLT-EMNLP 2005 (2005)

  29. Van Noord, G.: Head-corner parsing for TAG. Computational Intelligence 10(4), 525–534 (1994)

  30. Xia, F.: Automatic grammar generation from two different perspectives. Ph.D. thesis, University of Pennsylvania (2001)

  31. Xia, F., Palmer, M.: Evaluating the Coverage of LTAGs on Annotated Corpora. In: Proceedings of the Workshop on Using Evaluation within HLT Programs: Results and Trends, at the Second International Conference on Language Resources and Evaluation, pp. 1–6 (2001)

  32. XTAG-group: A Lexicalized Tree Adjoining Grammar for English. Technical Report IRCS 98-18, Institute for Research in Cognitive Science, University of Pennsylvania, pp. 5–10 (1998)

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Faili, H., Basirat, A. (2011). An Unsupervised Approach for Linking Automatically Extracted and Manually Crafted LTAGs. In: Gelbukh, A.F. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2011. Lecture Notes in Computer Science, vol 6608. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19400-9_6

  • DOI: https://doi.org/10.1007/978-3-642-19400-9_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19399-6

  • Online ISBN: 978-3-642-19400-9

  • eBook Packages: Computer Science (R0)
