SemEQUAL: Multilingual Semantic Matching in Relational Systems

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3453)

Abstract

In an increasingly multilingual world, it is critical that information management tools organically support the simultaneous use of multiple natural languages. A prerequisite for achieving this goal efficiently is that the underlying database engines provide seamless matching of text data across languages. We propose SemEQUAL, a new SQL functionality for semantic matching of multilingual attribute data. Our current implementation defines matches based on the standard WordNet linguistic ontologies. A performance evaluation of SemEQUAL, implemented using standard SQL:1999 features on a suite of commercial database systems, indicates unacceptably slow response times. However, by tuning the schema and index choices to match typical linguistic features, we show that performance can be improved to a level commensurate with online user interaction.
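To make the idea of an ontology-based match concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a value matches a concept when its synset lies in the transitive closure of the concept's synset in a WordNet-style hierarchy, and that the closure is computed with a SQL:1999 recursive query. The tables `synset_word` and `hypernym`, the function `sem_equal`, the index choices, and the toy data are all hypothetical, and SQLite stands in for the commercial database systems mentioned in the abstract.

```python
import sqlite3


def build_toy_taxonomy():
    """Create an in-memory database with a tiny, made-up multilingual taxonomy."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical schema: words grouped into language-independent synsets.
        CREATE TABLE synset_word (synset_id INTEGER, lang TEXT, word TEXT);
        -- Hypothetical is-a links between synsets (child is a kind of parent).
        CREATE TABLE hypernym (child_id INTEGER, parent_id INTEGER);
        -- Indexes chosen for the two lookups the query performs.
        CREATE INDEX idx_word ON synset_word (lang, word);
        CREATE INDEX idx_parent ON hypernym (parent_id);
    """)
    conn.executemany("INSERT INTO synset_word VALUES (?, ?, ?)", [
        (1, "en", "history"),
        (1, "fr", "histoire"),
        (2, "en", "autobiography"),
        (2, "fr", "autobiographie"),
        (3, "en", "biography"),
        (4, "en", "chemistry"),
    ])
    conn.executemany("INSERT INTO hypernym VALUES (?, ?)", [
        (3, 1),  # biography is-a history (toy data)
        (2, 3),  # autobiography is-a biography (toy data)
    ])
    return conn


# Recursive common table expression: start from the concept's synsets and
# repeatedly add their children, then test whether the value falls inside.
SEM_MATCH_SQL = """
WITH RECURSIVE closure(synset_id) AS (
    SELECT synset_id FROM synset_word WHERE lang = ? AND word = ?
    UNION
    SELECT h.child_id
    FROM hypernym h JOIN closure c ON h.parent_id = c.synset_id
)
SELECT 1
FROM synset_word w JOIN closure c ON w.synset_id = c.synset_id
WHERE w.lang = ? AND w.word = ?
LIMIT 1
"""


def sem_equal(conn, value, value_lang, concept, concept_lang):
    """Return True if `value` is the concept itself or one of its descendants."""
    params = (concept_lang, concept, value_lang, value)
    return conn.execute(SEM_MATCH_SQL, params).fetchone() is not None
```

With this toy data, `sem_equal(conn, "autobiographie", "fr", "history", "en")` holds because the French word shares a synset with a descendant of "history", while `sem_equal(conn, "chemistry", "en", "history", "en")` does not. Recomputing the closure for every comparison is the kind of cost that could plausibly lead to the slow response times the abstract reports for a plain SQL:1999 implementation.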

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kumaran, A., Haritsa, J.R. (2005). SemEQUAL: Multilingual Semantic Matching in Relational Systems. In: Zhou, L., Ooi, B.C., Meng, X. (eds) Database Systems for Advanced Applications. DASFAA 2005. Lecture Notes in Computer Science, vol 3453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11408079_20

  • DOI: https://doi.org/10.1007/11408079_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25334-1

  • Online ISBN: 978-3-540-32005-0

  • eBook Packages: Computer Science, Computer Science (R0)
