Term alignment in use

Gaussier, Éric; Hull, David; Aït-Mokhtar, Salah

doi:10.1007/978-94-017-2535-4_13


Part of the book series: Text, Speech and Language Technology (TLTB, volume 13)


Abstract

This chapter describes how parallel text extraction algorithms can be used for machine-aided translation, focusing on two applications: semi-automatic construction of bilingual terminology lexicons, and translation memory. Automatic word alignment and terminology extraction algorithms can be combined to substantially speed up lexicon construction. Given a highly accurate partial alignment of term constituents, a terminologist need only recognize and correct minor errors in the recognition of term boundaries. The next generation of translation memory systems will certainly use statistical alignment algorithms and shallow parsing technology to improve on the coverage of current systems by allowing for linguistic abstraction and partial sentence matching. Abstracting away from lexical units to part-of-speech, number, term, or noun phrase classes will allow these systems to mix and match components.
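To make the idea of abstraction in translation memory concrete, the following is a minimal sketch and not the authors' system. It abstracts only one class, numbers, and assumes that numbers are copied verbatim and in the same order between source and target. The names `TranslationMemory`, `abstract_numbers`, and the `<NUM>` placeholder are hypothetical, introduced here for illustration.

```python
import re

# Toy pattern for numbers such as "4", "25" or "2.5" (an assumption of this sketch).
NUM = re.compile(r"\d+(?:[.,]\d+)?")
PLACEHOLDER = "<NUM>"


def abstract_numbers(sentence):
    """Return the sentence with numbers replaced by a class placeholder,
    together with the list of numbers that were replaced."""
    values = NUM.findall(sentence)
    skeleton = NUM.sub(PLACEHOLDER, sentence)
    return skeleton, values


class TranslationMemory:
    """Hypothetical translation memory with exact and number-abstracted lookup."""

    def __init__(self):
        self._exact = {}      # source sentence -> target sentence
        self._abstract = {}   # source skeleton -> target skeleton

    def add(self, source, target):
        self._exact[source] = target
        src_skel, src_vals = abstract_numbers(source)
        tgt_skel, tgt_vals = abstract_numbers(target)
        # Store an abstracted entry only when the numbers correspond
        # one-to-one and in the same order (simplifying assumption).
        if src_vals and src_vals == tgt_vals:
            self._abstract[src_skel] = tgt_skel

    def lookup(self, sentence):
        # Prefer an exact match, as a conventional translation memory would.
        if sentence in self._exact:
            return self._exact[sentence]
        skeleton, values = abstract_numbers(sentence)
        target_skeleton = self._abstract.get(skeleton)
        if target_skeleton is None:
            return None
        # Fill the target placeholders with the numbers of the new sentence.
        remaining = iter(values)
        return re.sub(re.escape(PLACEHOLDER), lambda _: next(remaining), target_skeleton)


if __name__ == "__main__":
    tm = TranslationMemory()
    tm.add("Tighten the 4 bolts to 25 Nm.", "Serrez les 4 boulons à 25 Nm.")
    print(tm.lookup("Tighten the 6 bolts to 30 Nm."))
```

A sentence that differs from a stored segment only in its numbers is matched through the skeleton, which an exact-match memory would miss. Extending the same scheme to part-of-speech, term, or noun phrase classes would require the alignment and shallow parsing components the abstract mentions, which this sketch does not attempt.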



Author information

Authors and Affiliations

Xerox Research Centre Europe, France
Éric Gaussier, David Hull & Salah Aït-Mokhtar


Editor information

Editors and Affiliations

Université de Provence and CNRS, 29, Avenue Robert Schuman, 13100, Aix-en-Provence, France
Jean Véronis


About this chapter

Cite this chapter

Gaussier, É., Hull, D., Aït-Mokhtar, S. (2000). Term alignment in use. In: Véronis, J. (ed.) Parallel Text Processing. Text, Speech and Language Technology, vol 13. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2535-4_13


DOI: https://doi.org/10.1007/978-94-017-2535-4_13
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-5555-2
Online ISBN: 978-94-017-2535-4
eBook Packages: Springer Book Archive
