Abstract
This chapter describes how parallel text extraction algorithms can be used for machine-aided translation, focusing on two applications: the semi-automatic construction of bilingual terminology lexicons, and translation memory. Automatic word alignment and terminology extraction algorithms can be combined to substantially speed up the lexicon construction process. Given a highly accurate partial alignment of term constituents, a terminologist need only recognize and correct minor errors in the detection of term boundaries. The next generation of translation memory systems will certainly use statistical alignment algorithms and shallow parsing technology to improve on the coverage of current systems by allowing for linguistic abstraction and partial sentence matching. Abstracting away from lexical units to part-of-speech, number, term, or noun phrase classes will allow these systems to mix and match components.
Copyright information
© 2000 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Gaussier, É., Hull, D., Aït-Mokhtar, S. (2000). Term alignment in use. In: Véronis, J. (eds) Parallel Text Processing. Text, Speech and Language Technology, vol 13. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2535-4_13
DOI: https://doi.org/10.1007/978-94-017-2535-4_13
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-5555-2
Online ISBN: 978-94-017-2535-4
eBook Packages: Springer Book Archive