Abstract
We present a novel approach to incorporating semantic information into natural language processing problems, in particular the document classification task. The approach builds on the intuition that the semantic relatedness of words is not a static property but depends on the particular task at hand. Semantic relatedness information is incorporated through feature transformations, which are based on a feature ontology and on the particular classification task and data. We demonstrate the approach on the problem of classifying MEDLINE-indexed documents using the MeSH ontology. The results suggest that the method can improve classification performance on most of the datasets.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ginter, F., Pyysalo, S., Boberg, J., Järvinen, J., Salakoski, T. (2004). Ontology-Based Feature Transformations: A Data-Driven Approach. In: Vicedo, J.L., Martínez-Barco, P., Muñoz, R., Saiz Noeda, M. (eds) Advances in Natural Language Processing. EsTAL 2004. Lecture Notes in Computer Science, vol 3230. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30228-5_25
DOI: https://doi.org/10.1007/978-3-540-30228-5_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23498-2
Online ISBN: 978-3-540-30228-5
eBook Packages: Springer Book Archive