Making Sense About Sense

  • Nancy Ide
  • Yorick Wilks
Part of the Text, Speech and Language Technology book series (TLTB, volume 33)

We suggest that the standard fine-grained division of senses and (larger) homographs made by a lexicographer for a human reader may not be an appropriate goal for the computational word sense disambiguation (WSD) task. We argue that the level of sense discrimination that natural language processing (NLP) needs corresponds roughly to homographs. We also discuss psycholinguistic evidence that some broad sense divisions that share an etymological derivation (i.e., are non-homographic) are as distinct for humans as homographic ones, and these may belong to the broad class of sense divisions we seek to identify here. We link this discussion to the observation that major NLP tasks such as machine translation (MT) and information retrieval (IR) seem not to need independent WSD modules of the sort produced in the research field, even though they are undoubtedly doing WSD by other means. Our conclusion is that WSD should continue to focus on these broad discriminations, at which it can do very well, thereby possibly offering the close-to-100% success that most NLP seemingly requires, with the possible exception of very fine questions of target word choice in MT. This proposal can be seen as reorienting WSD to what it can actually perform at the standard success levels, but we argue that this, rather than some more idealized vision of sense inherited from lexicography, is what humans and machines can reliably discriminate.

Keywords

Word Sense, Mental Lexicon, Parallel Corpus, Lexical Semantic, Lexical Resource

Copyright information

© Springer 2007

Authors and Affiliations

  • Nancy Ide (1)
  • Yorick Wilks (2)
  1. Department of Computer Science, Vassar College, Poughkeepsie, USA
  2. Department of Computer Science, University of Sheffield, UK
