Learning Rules for Large-Vocabulary Word Sense Disambiguation: A Comparison of Various Classifiers

Paliouras, Georgios; Karkaletsis, Vangelis; Androutsopoulos, Ion; Spyropoulos, Constantine D.

doi:10.1007/3-540-45154-4_35

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1835)

Included in the following conference series:

International Conference on Natural Language Processing

Abstract

In this article we compare the performance of various machine learning algorithms on the task of constructing word-sense disambiguation rules from data. What distinguishes our work from most related work in the field is that we aim to disambiguate all content words in the text, rather than focusing on a small number of words. In an earlier study we showed that a decision tree induction algorithm performs well on this task. This study compares decision tree induction with other popular learning methods and discusses their advantages and disadvantages. Our results confirm the good performance of decision tree induction, which outperforms the other algorithms owing to its ability to order the features used for disambiguation according to their contribution to assigning the correct sense.
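
The abstract credits decision tree induction with ordering features by their contribution to assigning the correct sense. A minimal sketch of one way such an ordering can be computed is given below, using information gain, a criterion commonly used in decision tree induction. The abstract does not state which criterion the authors used, so the criterion, the feature names (`prev_word`, `pos`), the sense labels, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of sense labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Reduction in label entropy obtained by splitting on one feature."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[feature], []).append(label)
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Order feature names from most to least informative."""
    return sorted(
        rows[0].keys(),
        key=lambda f: information_gain(rows, labels, f),
        reverse=True,
    )


# Hypothetical toy data: four occurrences of the ambiguous word "bank".
rows = [
    {"prev_word": "river", "pos": "noun"},
    {"prev_word": "river", "pos": "noun"},
    {"prev_word": "savings", "pos": "noun"},
    {"prev_word": "savings", "pos": "noun"},
]
labels = ["shore", "shore", "finance", "finance"]

print(rank_features(rows, labels))
```

In this toy data `prev_word` separates the two senses perfectly while `pos` is constant and carries no information, so `prev_word` is ranked first. A tree learner applies this kind of ranking recursively at each node, which places the most discriminating context features near the root.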

Author information

Authors and Affiliations

Institute of Informatics & Telecommunications, NCSR “Demokritos”, Aghia Paraskevi Attikis, Athens, 15310, Greece
Georgios Paliouras, Vangelis Karkaletsis, Ion Androutsopoulos & Constantine D. Spyropoulos

Editor information

Editors and Affiliations

Computer Engineering Department and Computer Technology Institute, University of Patras, 26500, Patras, Greece
Dimitris N. Christodoulakis

About this paper

Cite this paper

Paliouras, G., Karkaletsis, V., Androutsopoulos, I., Spyropoulos, C.D. (2000). Learning Rules for Large-Vocabulary Word Sense Disambiguation: A Comparison of Various Classifiers. In: Christodoulakis, D.N. (eds) Natural Language Processing — NLP 2000. NLP 2000. Lecture Notes in Computer Science, vol 1835. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45154-4_35

DOI: https://doi.org/10.1007/3-540-45154-4_35
Published: 25 May 2000
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67605-8
Online ISBN: 978-3-540-45154-9
eBook Packages: Springer Book Archive
