From the Rosetta stone to the information society

A survey of parallel text processing

Part of the book series: Text, Speech and Language Technology (TLTB, volume 13)

Abstract

This introductory chapter provides a survey of the processing and use of parallel texts, i.e., texts accompanied by their translation. Throughout the chapter, the various authors’ contributions to the book are considered and related to the state of the art in the field. Three themes are addressed, corresponding to the three parts of the book: (i) techniques and methodology for the alignment of parallel texts at various levels such as sentences, clauses or words; (ii) applications of parallel texts in fields such as translation, lexicography, and information retrieval; and (iii) available corpus resources and evaluation of alignment methods.

Copyright information

© 2000 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Véronis, J. (2000). From the Rosetta stone to the information society. In: Véronis, J. (ed.) Parallel Text Processing. Text, Speech and Language Technology, vol 13. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2535-4_1

  • DOI: https://doi.org/10.1007/978-94-017-2535-4_1

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-90-481-5555-2

  • Online ISBN: 978-94-017-2535-4

  • eBook Packages: Springer Book Archive
