Complete Search Space Exploration for SITG Inside Probability

  • Guillem Gascó
  • Joan-Andreu Sánchez
  • José-Miguel Benedí
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6218)

Abstract

Stochastic Inversion Transduction Grammars (SITGs) are a powerful formalism in Machine Translation that allows a string pair to be parsed with efficient Dynamic Programming algorithms. The parsing algorithms defined previously cannot explore the complete search space. In this work, we propose modifications to these algorithms so that they consider the whole search space, and we formally prove the correctness of the new algorithm. Experiments show important improvements in the probabilistic estimation of the models when the new algorithm is used.
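
To make the notion of an inside probability concrete, the following is a minimal sketch of a memoized inside computation for a toy SITG. It is not the algorithm proposed in the paper, whose details the abstract does not give. The function name `inside_probability`, the dictionary-based grammar encoding, and the toy rule probabilities are all assumptions made for illustration. The sketch sums over every split in which both children cover at least one word, with `None` standing for the empty string.

```python
from functools import lru_cache

def inside_probability(src, tgt, lexical, straight, inverted, start):
    """Inside probability that `start` derives the pair (src, tgt).

    Hypothetical encoding (an assumption, not from the paper):
      lexical[(A, x, y)]  = p(A -> x/y), with x or y set to None for the empty string
      straight[(A, B, C)] = p(A -> [B C])  (same order in both languages)
      inverted[(A, B, C)] = p(A -> <B C>)  (target order reversed)
    """
    src, tgt = tuple(src), tuple(tgt)

    @lru_cache(maxsize=None)
    def beta(a, s, t, u, v):
        n_src, n_tgt = t - s, v - u
        size = n_src + n_tgt
        if size == 0:
            return 0.0  # no rule derives the empty pair
        total = 0.0
        # Lexical rules cover at most one word on each side.
        if n_src <= 1 and n_tgt <= 1:
            x = src[s] if n_src else None
            y = tgt[u] if n_tgt else None
            total += lexical.get((a, x, y), 0.0)
        # Binary rules: try every split point (S, U).
        for S in range(s, t + 1):
            for U in range(u, v + 1):
                # Straight: children are (s,S,u,U) and (S,t,U,v).
                first = (S - s) + (U - u)
                if 0 < first < size:  # both children non-empty
                    for (head, b, c), p in straight.items():
                        if head == a:
                            total += p * beta(b, s, S, u, U) * beta(c, S, t, U, v)
                # Inverted: children are (s,S,U,v) and (S,t,u,U).
                first = (S - s) + (v - U)
                if 0 < first < size:
                    for (head, b, c), p in inverted.items():
                        if head == a:
                            total += p * beta(b, s, S, U, v) * beta(c, S, t, u, U)
        return total

    return beta(start, 0, len(src), 0, len(tgt))

# Toy single-nonterminal grammar; the probabilities are invented and sum to 1.
LEXICAL = {("A", "a", "x"): 0.3, ("A", "a", None): 0.2, ("A", None, "x"): 0.2}
STRAIGHT = {("A", "A", "A"): 0.2}
INVERTED = {("A", "A", "A"): 0.1}

if __name__ == "__main__":
    print(inside_probability(["a"], ["x"], LEXICAL, STRAIGHT, INVERTED, "A"))
```

For the one-word pair a/x, this toy grammar has five derivations: the direct rule A → a/x, and four that combine A → a/ε with A → ε/x under a straight or inverted rule. A computation that used only the direct lexical rule for such a span would return 0.3, whereas summing over all five gives 0.324.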

Keywords

Machine Translation; Original Algorithm; Parse Tree; Statistical Machine Translation; Vocabulary Size

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Guillem Gascó (1)
  • Joan-Andreu Sánchez (1)
  • José-Miguel Benedí (1)
  1. Institut Tecnològic d’Informàtica, Universitat Politècnica de València, València, Spain
