Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4285)

Abstract

We present a phrase-based statistical machine translation (SMT) approach in which the word-order problem is solved by syntactic transformation in a preprocessing phase; no reordering takes place in the decoding phase. We describe a syntactic transformation model based on a probabilistic context-free grammar. The model is trained using a bilingual corpus and a broad-coverage parser of the source language. This approach is applicable to language pairs in which the target language is poor in resources. We considered translation from English to Vietnamese and from English to French. Our experiments showed significant BLEU-score improvements over Pharaoh, a state-of-the-art phrase-based SMT system.
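To make the preprocessing idea concrete, the following is a minimal sketch of reordering a source-side parse tree before it is passed to a decoder. It is not the model described in the paper: the tree encoding, the `REORDER_RULES` table, and its probabilities are hypothetical and hand-written for illustration, whereas the paper's model is trained from a bilingual corpus. The sketch only shows the general shape of the technique: each context-free rule has candidate permutations of its children, and the most probable permutation is applied at every node.

```python
# A tree is a (label, body) pair: body is a word (str) for a leaf,
# or a list of child trees for an internal node.

# Hypothetical reordering table, NOT learned from data.
# Key: (parent label, tuple of child labels).
# Value: candidate child permutations with made-up probabilities.
REORDER_RULES = {
    ("NP", ("DT", "JJ", "NN")): [((0, 2, 1), 0.8), ((0, 1, 2), 0.2)],
}


def reorder(tree):
    """Recursively apply the most probable permutation at each node."""
    label, body = tree
    if isinstance(body, str):  # leaf: nothing to reorder
        return tree
    children = [reorder(child) for child in body]
    key = (label, tuple(child[0] for child in children))
    candidates = REORDER_RULES.get(key)
    if candidates:
        best_perm, _prob = max(candidates, key=lambda pc: pc[1])
        children = [children[i] for i in best_perm]
    return (label, children)


def leaves(tree):
    """Return the words of the tree in left-to-right order."""
    label, body = tree
    if isinstance(body, str):
        return [body]
    return [word for child in body for word in leaves(child)]


source = (
    "S",
    [
        ("NP", [("PRP", "I")]),
        ("VP", [
            ("VBD", "bought"),
            ("NP", [("DT", "a"), ("JJ", "red"), ("NN", "book")]),
        ]),
    ],
)

reordered = leaves(reorder(source))
print(" ".join(reordered))
```

In this toy case the adjective is moved after the noun, which matches Vietnamese noun-phrase order; the reordered English sentence would then be translated monotonically, with no reordering in the decoder.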

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nguyen, T.P., Shimazu, A. (2006). A Syntactic Transformation Model for Statistical Machine Translation. In: Matsumoto, Y., Sproat, R.W., Wong, KF., Zhang, M. (eds) Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead. ICCPOL 2006. Lecture Notes in Computer Science, vol 4285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11940098_7

  • DOI: https://doi.org/10.1007/11940098_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49667-0

  • Online ISBN: 978-3-540-49668-7

  • eBook Packages: Computer Science, Computer Science (R0)
