Abstract
This chapter addresses several questions about multilingual texts, understood here as texts available in more than two languages. In particular, it asks whether mapping out multilingual translation equivalence has any real use. The view proposed is that multiple versions of a text can (and should) be seen as additional sources of information that can be exploited to produce better bilingual alignments. A general multilingual alignment technique is presented whose computational complexity, for a given number of texts, is the same as that of bilingual alignment. Experimental results show how this method improves the accuracy of bilingual alignments on a trilingual corpus (the Gospel According to John, in English, French and Spanish).
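The idea that a third version of a text carries information about a bilingual alignment can be illustrated with a small sketch. This is not the technique presented in the chapter. It is only a hypothetical consistency check, assuming that sentence alignments are represented as sets of (index, index) pairs; the function names `compose` and `split_by_agreement` are illustrative. A direct English-French link is treated as corroborated when the same link is implied by going through the Spanish text.

```python
from collections import defaultdict


def compose(ac, cb):
    """Compose alignments A-C and C-B into the A-B links they imply.

    Each alignment is a set of (i, j) sentence-index pairs (an assumed
    representation, not the chapter's).
    """
    c_to_b = defaultdict(set)
    for c, b in cb:
        c_to_b[c].add(b)
    # An A-B link is implied whenever A and B share an aligned C sentence.
    return {(a, b) for a, c in ac for b in c_to_b.get(c, ())}


def split_by_agreement(ab, ac, cb):
    """Compare a direct A-B alignment with the one implied via text C.

    Returns three sets: links found by both routes, links found only by
    the direct alignment, and links found only through the third text.
    """
    implied = compose(ac, cb)
    return ab & implied, ab - implied, implied - ab


# Toy data: indices are sentence positions in each text.
en_fr = {(0, 0), (1, 1), (2, 3)}   # direct alignment, one doubtful link
en_es = {(0, 0), (1, 1), (2, 2)}
es_fr = {(0, 0), (1, 1), (2, 2)}

confirmed, direct_only, via_third_only = split_by_agreement(en_fr, en_es, es_fr)
print(sorted(confirmed))        # [(0, 0), (1, 1)]
print(sorted(direct_only))      # [(2, 3)]
print(sorted(via_third_only))   # [(2, 2)]
```

In this toy case the link (2, 3) is not supported by the Spanish text, which instead suggests (2, 2). A real system would have to weigh such disagreements rather than simply trust one route over the other.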
This research was funded by the Canadian Department of Foreign Affairs and International Trade (http://www.dfait-maeci.gc.ca/), via the Agence de la Francophonie (http://www.francophonie.org/).
Copyright information
© 2000 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Simard, M. (2000). Multilingual text alignment. In: Véronis, J. (ed.) Parallel Text Processing. Text, Speech and Language Technology, vol 13. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2535-4_3
DOI: https://doi.org/10.1007/978-94-017-2535-4_3
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-5555-2
Online ISBN: 978-94-017-2535-4
eBook Packages: Springer Book Archive