
Integrating Machine Readable Dictionary and Thesaurus for Conceptual Context Representation of Word Sense

  • Jen Nan Chen
  • Jason S. Chang
Chapter
Part of the Text, Speech and Language Technology book series (TLTB, volume 10)

Abstract

This paper describes a heuristic approach to the automatic acquisition of contextual representations of senses in a machine-readable dictionary (MRD). Including contextual information in an MRD-based lexical database offers several benefits. First, the representation can be used to merge closely related senses and construct a coarser sense division, so that unnecessarily fine sense distinctions can be avoided in word sense disambiguation (WSD). Second, the contextual information can serve as a knowledge base for developing a WSD system. Third, if the algorithms are run on several MRDs, the contextual representation provides a means of linking related senses across those MRDs to create an integrated lexical database. The algorithms rely primarily on information retrieval techniques to build, for each MRD sense, a list of the topics most relevant to its definition. We describe an implementation of the method that uses definition sentences from the Longman Dictionary of Contemporary English together with the topical word lists and topical cross-references of the Longman Lexicon of Contemporary English. To assess the performance of the proposed approach, we conducted a series of experiments and evaluations.
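
As a rough illustration of the retrieval idea in the abstract, the sketch below treats each topic's word list as a small document and ranks topics by weighted word overlap with a sense definition. It is not the authors' algorithm: the topic lists in `TOPICS`, the function names, and the weighting scheme (a log inverse-topic-frequency, with no stemming or stop-word handling) are all assumptions made for this example, and the word lists are invented rather than taken from the Longman Lexicon.

```python
import math
import re
from collections import Counter

# Hypothetical topic word lists, invented for illustration only.
# They are NOT taken from the Longman Lexicon of Contemporary English.
TOPICS = {
    "finance": ["money", "account", "loan", "deposit", "pay", "keep"],
    "geography": ["river", "land", "water", "lake", "shore", "along"],
    "medicine": ["blood", "organ", "store", "hospital", "doctor", "keep"],
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (no stemming)."""
    return re.findall(r"[a-z]+", text.lower())


def inverse_topic_frequency(topics):
    """Weight each word by how few topics list it (an IDF-style assumption)."""
    n_topics = len(topics)
    topic_count = Counter(w for words in topics.values() for w in set(words))
    return {w: math.log(n_topics / c) + 1.0 for w, c in topic_count.items()}


def rank_topics(definition, topics, top_k=2):
    """Return up to top_k (topic, score) pairs for one sense definition."""
    weights = inverse_topic_frequency(topics)
    counts = Counter(tokenize(definition))
    scores = {}
    for topic, words in topics.items():
        # Sum the weighted counts of topic words that occur in the definition.
        score = sum(counts[w] * weights[w] for w in set(words))
        if score > 0:
            scores[topic] = score
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


if __name__ == "__main__":
    senses = {
        "bank (1)": "land along the side of a river or lake",
        "bank (2)": "a place in which money is kept and paid out on demand",
    }
    for sense, definition in senses.items():
        print(sense, rank_topics(definition, TOPICS))
```

Under this toy setup, two senses whose definitions yield the same top-ranked topic could be treated as candidates for merging into a coarser sense, and the same ranking could be computed for definitions from a second dictionary to suggest cross-dictionary links. A realistic version would at least need morphological normalization, since here "kept" does not match "keep".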

Keywords

Word Sense; Word Sense Disambiguation; Computational Linguistics; Machine Readable Dictionary; Virtual Document
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer Science+Business Media Dordrecht 1999

Authors and Affiliations

  • Jen Nan Chen
    • 1
  • Jason S. Chang
    • 2
  1. Department of Information Management, Ming Chuan University, Taipei, Taiwan, ROC
  2. Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, ROC
