Term alignment in use

Machine-aided human translation

  • Chapter
Parallel Text Processing

Part of the book series: Text, Speech and Language Technology (TLTB, volume 13)

Abstract

This chapter describes how parallel text extraction algorithms can be used for machine-aided translation, focusing on two applications: semi-automatic construction of bilingual terminology lexicons, and translation memory. Automatic word alignment and terminology extraction algorithms can be combined to substantially speed up lexicon construction. Given a highly accurate partial alignment of term constituents, a terminologist need only identify and correct minor errors in term boundary recognition. The next generation of translation memory systems will certainly use statistical alignment algorithms and shallow parsing technology to improve on the coverage of current systems by allowing for linguistic abstraction and partial sentence matching. Abstracting away from lexical units to part-of-speech, number, term, or noun phrase classes will allow these systems to mix and match components.
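As a rough illustration of the abstraction idea, the sketch below replaces numbers with a class token so that a stored translation can be reused for a sentence that differs only in its numeric values. It is a toy under stated assumptions, not the chapter's method: the function names (`abstract_numbers`, `build_memory`, `translate`), the `<NUM>` placeholder, and the example sentences are all hypothetical, and it assumes numbers occur in the same order and count on both sides of a stored pair.

```python
import re

# Matches integers and simple decimals such as "4", "3.5" or "3,5".
NUM = re.compile(r"\d+(?:[.,]\d+)?")


def abstract_numbers(sentence):
    """Replace each number with a <NUM> class token.

    Returns the abstracted template and the list of numbers removed.
    """
    values = NUM.findall(sentence)
    template = NUM.sub("<NUM>", sentence)
    return template, values


def build_memory(pairs):
    """Index stored (source, target) pairs by their abstracted source template."""
    memory = {}
    for source, target in pairs:
        src_template, _ = abstract_numbers(source)
        tgt_template, _ = abstract_numbers(target)
        memory[src_template] = tgt_template
    return memory


def translate(sentence, memory):
    """Look up a sentence by its template and refill the number slots.

    Returns None when no stored template matches, or when the slot count
    differs (the same-order, same-count assumption does not hold).
    """
    template, values = abstract_numbers(sentence)
    tgt_template = memory.get(template)
    if tgt_template is None or tgt_template.count("<NUM>") != len(values):
        return None
    result = tgt_template
    for value in values:
        # Fill slots left to right, one number per slot.
        result = result.replace("<NUM>", value, 1)
    return result


memory = build_memory([("Tighten the 4 screws.", "Serrez les 4 vis.")])
print(translate("Tighten the 6 screws.", memory))
```

An exact-match memory would miss "Tighten the 6 screws." entirely; matching at the level of the number class recovers it. Extending the same lookup to part-of-speech, term, or noun phrase classes would require a tagger or shallow parser, which this sketch does not attempt.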

Copyright information

© 2000 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Gaussier, É., Hull, D., Aït-Mokhtar, S. (2000). Term alignment in use. In: Véronis, J. (eds) Parallel Text Processing. Text, Speech and Language Technology, vol 13. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2535-4_13

  • DOI: https://doi.org/10.1007/978-94-017-2535-4_13

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-90-481-5555-2

  • Online ISBN: 978-94-017-2535-4

  • eBook Packages: Springer Book Archive
