Interactive Parsing

  • Alejandro Héctor Toselli
  • Enrique Vidal
  • Francisco Casacuberta

Abstract

This chapter introduces the Interactive Parsing (IP) framework for obtaining the correct syntactic parse tree of a given sentence. This formal framework enables the construction of interactive systems for tree annotation. These interactive systems can help human annotators create error-free parse trees with little effort, compared with manual post-editing of the trees provided by an automatic parser.

In principle, the interaction protocol defined in the IP framework differs from the left-to-right interaction protocol used throughout this book. Specifically, the IP protocol imposes no fixed order: the user can edit any part of the parse tree, in any order. However, to calculate the next best tree efficiently in the IP framework, Sect. 9.4 introduces a left-to-right depth-first tree review order. This order also brings computational advantages to the search for the most probable tree in interactive bottom-up parsing algorithms. The use of Confidence Measures in IP is also presented as an efficient technique for detecting erroneous parse trees. Confidence Measures can be computed efficiently in the IP framework and can help detect erroneous constituents more quickly, since they provide discriminant information throughout the IP process.
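
As a rough illustration of what a left-to-right depth-first review order could look like, the following Python sketch lists the constituents of a small parse tree in the order a reviewer would visit them: each parent before its children, and children from left to right. The tree encoding (nested `(label, children)` pairs with words as plain strings) and the names `leaves`, `review_order` and `example` are assumptions made for this illustration only; they are not taken from the chapter, and the sketch does not reproduce the chapter's algorithm for computing the next best tree.

```python
from typing import Iterator, List, Tuple, Union

# Hypothetical tree encoding: a word is a str, a constituent is (label, children).
Tree = Union[str, Tuple[str, list]]


def leaves(tree: Tree) -> List[str]:
    """Return the words covered by a tree, from left to right."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    return [word for child in children for word in leaves(child)]


def review_order(tree: Tree, start: int = 0) -> Iterator[Tuple[str, int, int]]:
    """Yield (label, start, end) for every constituent, depth-first and
    left to right; `end` is exclusive and counts word positions."""
    if isinstance(tree, str):
        return  # single words are not constituents to review
    label, children = tree
    yield (label, start, start + len(leaves(tree)))
    position = start
    for child in children:
        yield from review_order(child, position)
        position += len(leaves(child))


# Toy sentence: "the dog barks"
example: Tree = ("S", [("NP", ["the", "dog"]), ("VP", ["barks"])])

for constituent in review_order(example):
    print(constituent)
```

With such an order, every constituent visited before the current one can be treated as already validated by the user, which is the kind of constraint a parser could exploit when it recomputes the rest of the tree.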

Keywords

Machine Translation; Parse Tree; Confidence Measure; Statistical Machine Translation; Interaction Protocol


Copyright information

© Springer-Verlag London Limited 2011
