Supervised Corpus-Based Methods for WSD

Màrquez, Lluís; Escudero, Gerard; Martínez, David; Rigau, German

doi:10.1007/978-1-4020-4809-8_7



Part of the book series: Text, Speech and Language Technology (TLTB, volume 33)

This chapter presents the supervised approach to word sense disambiguation (WSD), which consists of automatically inducing classification models or rules from annotated examples. We start by introducing the machine learning framework for classification and some important related concepts. We then review the main approaches in the literature, focusing on learning paradigms, corpora used, sense repositories, and feature representation. We also describe five statistical and machine learning algorithms in more detail and evaluate and compare them experimentally on the DSO corpus. The final part of the chapter briefly discusses the current challenges of the supervised learning approach to WSD.



Author information

Authors and Affiliations

Polytechnical University of Catalunya (UPC), C/ Jordi Girona Salgado 1-3, E-08034, Barcelona, Catalonia, Spain
Associate Professor Lluís Màrquez
Universitat Politècnica de Catalunya, Urgell 187, E-08036, Barcelona, Catalonia, Spain
Assistant Professor Gerard Escudero
Department of Computer Science, University of Sheffield, S1 4DP, Sheffield, UK
David Martínez
Department of Computer Science, University of the Basque Country, Manuel de Lardizabal 1, E-20018, Donostia, Basque Country, Spain
Associate Professor German Rigau


Editor information

Editors and Affiliations

Department of Computer Science, University of the Basque Country, Manuel de Lardizabal 1, E-20018, Donostia, Basque Country, Spain
Eneko Agirre
Sharp Laboratories of Europe Limited, Oxford Science Park, OX4 4GB, Oxford, UK
Philip Edmonds


About this chapter

Cite this chapter

Màrquez, L., Escudero, G., Martínez, D., Rigau, G. (2007). Supervised Corpus-Based Methods for WSD. In: Agirre, E., Edmonds, P. (eds) Word Sense Disambiguation. Text, Speech and Language Technology, vol 33. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-4809-8_7


DOI: https://doi.org/10.1007/978-1-4020-4809-8_7
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-4808-1
Online ISBN: 978-1-4020-4809-8
eBook Packages: Humanities, Social Sciences and Law; Social Sciences (R0)
