Adaptive Bilingual Sentence Alignment

  • Conference paper
Machine Translation: From Research to Real Users (AMTA 2002)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2499)

Abstract

We present a new approach, based on adaptive learning, to the problem of aligning English and Chinese sentences in a bilingual corpus. Length information alone produces surprisingly good results for aligning French and English sentences, with success rates well over 95%, but it does not fare as well for the alignment of English and Chinese sentences. The crux of the problem lies in the greater variability of lengths and match types of the matched sentences. We propose to cope with such variability via a two-pass scheme under which model parameters can be learned from the data at hand. Experiments show that under this approach bilingual English-Chinese texts can be aligned effectively across diverse domains, genres and translation directions, with accuracy rates approaching 99%.
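
The abstract does not spell out the model, so the following is only a minimal sketch of what a two-pass, length-based aligner could look like, not the authors' implementation. It assumes a Gale-and-Church-style dynamic program over sentence lengths, a Gaussian penalty on the deviation from an expected target/source length ratio, and a fixed set of match types. All names (`align`, `estimate`, `adaptive_align`), the default mean and variance, the prior probabilities, and the smoothing and variance floor are illustrative assumptions. The first pass aligns with generic defaults; the second pass re-aligns with the length ratio, variance and match-type priors re-estimated from the first-pass output.

```python
import math

# Illustrative defaults (assumptions, not values from the paper).
DEFAULT_MEAN, DEFAULT_VAR = 1.0, 6.8
DEFAULT_PRIORS = {(1, 1): 0.89, (1, 0): 0.005, (0, 1): 0.005,
                  (2, 1): 0.045, (1, 2): 0.045, (2, 2): 0.01}

def match_cost(ls, lt, mean, var, prior):
    """Negative log prior plus a Gaussian penalty on the length mismatch."""
    z = (lt - ls * mean) / math.sqrt(var * max(ls, lt, 1))
    return -math.log(prior) + 0.5 * z * z

def align(src, tgt, mean=DEFAULT_MEAN, var=DEFAULT_VAR, priors=DEFAULT_PRIORS):
    """Align two lists of sentence lengths; return beads as index ranges."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            # Try every match type (di source sentences to dj target sentences).
            for (di, dj), prior in priors.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                c = cost[i][j] + match_cost(sum(src[i:ni]), sum(tgt[j:nj]),
                                            mean, var, prior)
                if c < cost[ni][nj]:
                    cost[ni][nj], back[ni][nj] = c, (i, j)
    # Trace back from the end of both texts to recover the cheapest path.
    beads, i, j = [], n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append(((pi, i), (pj, j)))
        i, j = pi, pj
    return beads[::-1]

def estimate(beads, src, tgt, var_floor=1.0):
    """Re-estimate length ratio, variance and match-type priors from beads."""
    pairs = [(src[a], tgt[c]) for (a, b), (c, d) in beads
             if (b - a, d - c) == (1, 1)]
    total_src = sum(ls for ls, _ in pairs)
    if total_src == 0:  # no usable 1-1 beads: keep the generic defaults
        return DEFAULT_MEAN, DEFAULT_VAR, dict(DEFAULT_PRIORS)
    mean = sum(lt for _, lt in pairs) / total_src
    var = sum((lt - ls * mean) ** 2 / max(ls, lt, 1)
              for ls, lt in pairs) / len(pairs)
    counts = {t: 1 for t in DEFAULT_PRIORS}  # add-one smoothing
    for (a, b), (c, d) in beads:
        counts[(b - a, d - c)] += 1
    total = sum(counts.values())
    return mean, max(var, var_floor), {t: k / total for t, k in counts.items()}

def adaptive_align(src, tgt):
    """Pass 1 uses generic defaults; pass 2 uses parameters learned from pass 1."""
    first = align(src, tgt)
    mean, var, priors = estimate(first, src, tgt)
    return align(src, tgt, mean, var, priors)
```

Calling `adaptive_align(src_lengths, tgt_lengths)` with two lists of sentence lengths (in any consistent unit, such as characters) returns a list of beads, each a pair of half-open index ranges into the source and target sentence lists. Learning the parameters from the text being aligned, rather than fixing them in advance, is what lets a sketch like this adjust to a language pair whose length ratio differs from the defaults.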

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chuang, T.C., You, G.N., Chang, J.S. (2002). Adaptive Bilingual Sentence Alignment. In: Richardson, S.D. (eds) Machine Translation: From Research to Real Users. AMTA 2002. Lecture Notes in Computer Science, vol 2499. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45820-4_3

  • DOI: https://doi.org/10.1007/3-540-45820-4_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44282-0

  • Online ISBN: 978-3-540-45820-3

  • eBook Packages: Springer Book Archive
