Disambiguating Noun Groupings with Respect to WordNet Senses

  • P. Resnik
Part of the Text, Speech and Language Technology book series (TLTB, volume 11)

Abstract

Word groupings useful for language processing tasks are increasingly available, as thesauri appear on-line, and as distributional word clustering techniques improve. However, for many tasks, one is interested in relationships among word senses, not words. This paper presents a method for automatic sense disambiguation of nouns appearing within sets of related nouns — the kind of data one finds in on-line thesauri, or as the output of distributional clustering algorithms. Disambiguation is performed with respect to WordNet senses, which are fairly fine-grained; however, the method also permits the assignment of higher-level WordNet categories rather than sense labels. The method is illustrated primarily by example, though results of a more rigorous evaluation are also presented.
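
The general idea of disambiguating nouns by looking at the group they appear in can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny hand-built IS-A taxonomy stands in for WordNet, node depth stands in for whatever similarity measure the paper actually uses, and the names `PARENT`, `SENSES`, `best_subsumer`, and `disambiguate` are hypothetical. It is not the paper's procedure, only one plausible way to let shared ancestors lend support to individual senses.

```python
from itertools import combinations, product

# Hypothetical toy IS-A taxonomy (child -> parent); NOT real WordNet data.
PARENT = {
    "entity": None,
    "person": "entity",
    "health_professional": "person",
    "scholar": "person",
    "woman": "person",
    "doctor.physician": "health_professional",
    "nurse.health": "health_professional",
    "surgeon.medical": "health_professional",
    "doctor.phd": "scholar",
    "nurse.nanny": "woman",
}

# Hypothetical sense inventory: word -> candidate senses in the taxonomy.
SENSES = {
    "doctor": ["doctor.physician", "doctor.phd"],
    "nurse": ["nurse.health", "nurse.nanny"],
    "surgeon": ["surgeon.medical"],
}


def ancestors(node):
    """Return the node followed by its ancestors, nearest first, up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def depth(node):
    """Distance from the root; used here as a stand-in similarity score (assumption)."""
    return len(ancestors(node)) - 1


def best_subsumer(word_a, word_b):
    """Deepest node that subsumes some sense of each word, plus its depth."""
    best, best_depth = None, -1
    for sense_a, sense_b in product(SENSES[word_a], SENSES[word_b]):
        ancestors_b = set(ancestors(sense_b))
        for node in ancestors(sense_a):  # nearest first, so the first hit is the lowest
            if node in ancestors_b:
                if depth(node) > best_depth:
                    best, best_depth = node, depth(node)
                break
    return best, best_depth


def disambiguate(words):
    """Score each sense of each word by the support it gets from the other words."""
    support = {w: {s: 0.0 for s in SENSES[w]} for w in words}
    norm = {w: 0.0 for w in words}
    for word_a, word_b in combinations(words, 2):
        subsumer, score = best_subsumer(word_a, word_b)
        for w in (word_a, word_b):
            norm[w] += score
            for sense in SENSES[w]:
                # A sense is credited only if it lies under the shared ancestor.
                if subsumer in ancestors(sense):
                    support[w][sense] += score
    return {
        w: {s: (v / norm[w] if norm[w] else 0.0) for s, v in support[w].items()}
        for w in words
    }


scores = disambiguate(["doctor", "nurse", "surgeon"])
best = {w: max(s, key=s.get) for w, s in scores.items()}
print(best)
```

In this toy group the medical senses of "doctor" and "nurse" receive all of the support, because they share the ancestor `health_professional` with each other and with "surgeon". Returning that shared ancestor instead of a leaf sense is one way to picture the assignment of a higher-level category rather than a fine-grained sense label.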

Keywords

Semantic Similarity; Test Instance; Semantic Network; Query Expansion; Word Sense
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Science+Business Media Dordrecht 1999
