Abstract

We discuss ways in which EuroWordNet (EWN) can be used in multilingual information retrieval activities, focusing on two approaches to Cross-Language Text Retrieval that use the EWN database as a large-scale multilingual semantic resource. The first approach indexes documents and queries in terms of the EuroWordNet Inter-Lingual-Index, thus turning term weighting and query/document matching into language-independent tasks. The second describes how the information in the EWN database could be integrated with a corpus-based technique, thus allowing retrieval of domain-specific terms that may not be present in our multilingual database. Our objective is to show the potential of EuroWordNet as a promising alternative to existing approaches to Cross-Language Text Retrieval.
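The first approach can be illustrated with a minimal sketch. It assumes a toy lexicon that maps (language, word) pairs to Inter-Lingual-Index identifiers; the identifiers, the lexicon entries and the function names (`to_ili_vector`, `cosine`, `rank`) are hypothetical and are not taken from the EWN database. Ambiguous words spread their weight evenly over their candidate ILI records, since no word sense disambiguation is applied here, and plain cosine similarity stands in for whatever weighting scheme a real system would use.

```python
import math
from collections import Counter

# Hypothetical toy lexicon: (language, word) -> candidate ILI record ids.
# All ids and entries are invented for illustration; they are not real EWN data.
TOY_ILI = {
    ("en", "car"): ["ili-car"],
    ("es", "coche"): ["ili-car"],
    ("en", "bank"): ["ili-bank-finance", "ili-bank-river"],
    ("es", "banco"): ["ili-bank-finance", "ili-bench"],
    ("en", "money"): ["ili-money"],
    ("es", "dinero"): ["ili-money"],
    ("en", "water"): ["ili-water"],
    ("es", "agua"): ["ili-water"],
}


def to_ili_vector(tokens, lang):
    """Map tokens to a bag of ILI ids; ambiguous words split their weight evenly."""
    vec = Counter()
    for tok in tokens:
        ids = TOY_ILI.get((lang, tok.lower()), [])  # unknown words are dropped
        for ili in ids:
            vec[ili] += 1.0 / len(ids)
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse ILI vectors."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_tokens, query_lang, docs):
    """Rank documents (id -> (lang, tokens)) against a query in any covered language."""
    q = to_ili_vector(query_tokens, query_lang)
    scored = [
        (cosine(q, to_ili_vector(tokens, lang)), doc_id)
        for doc_id, (lang, tokens) in docs.items()
    ]
    return sorted(scored, reverse=True)


docs = {
    "d1": ("es", ["banco", "dinero"]),
    "d2": ("es", ["agua", "coche"]),
}
results = rank(["bank", "money"], "en", docs)
```

Because both the English query and the Spanish documents are reduced to the same ILI vocabulary, matching never needs to know which language a term came from. The sketch also shows why sense ambiguity matters: "bank" and "banco" share only one of their candidate records, so the match is partial rather than exact.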

Abbreviations

CLTR: Cross-Language Text Retrieval
EWN: EuroWordNet
ILI: Inter-Lingual-Index
IR: Information Retrieval
NLP: Natural Language Processing
POS: Part of Speech
WSD: Word Sense Disambiguation

Copyright information

© 1998 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Gonzalo, J., Verdejo, F., Peters, C., Calzolari, N. (1998). Applying EuroWordNet to Cross-Language Text Retrieval. In: Vossen, P. (eds) EuroWordNet: A multilingual database with lexical semantic networks. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-1491-4_5

  • DOI: https://doi.org/10.1007/978-94-017-1491-4_5

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-90-481-5120-2

  • Online ISBN: 978-94-017-1491-4

  • eBook Packages: Springer Book Archive
