A Semantic Kernel to Exploit Linguistic Knowledge

Basili, Roberto; Cammisa, Marco; Moschitti, Alessandro

doi:10.1007/11558590_30


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3673)

Abstract

Improving accuracy in Information Retrieval tasks via semantic information is a complex problem characterized by three main aspects: the document representation model, the similarity estimation metric and the inductive algorithm. In this paper an original kernel function sensitive to external semantic knowledge is defined as a document similarity model. This semantic kernel was tested over a text categorization task, under critical learning conditions (i.e. poor training data). The results of cross-validation experiments suggest that the proposed kernel function can be used as a general model of document similarity for IR tasks.
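The abstract does not give the form of the kernel, so the following is only a generic sketch of how a document kernel can be made sensitive to external term-level knowledge; it is not the authors' definition. It assumes documents are weighted bags of words and that an external resource supplies a symmetric term-similarity score. The names `TERM_SIM`, `term_similarity`, `semantic_kernel` and `normalized_kernel`, as well as the similarity value 0.9, are hypothetical and chosen for illustration.

```python
from math import sqrt

# Hypothetical term-similarity scores (illustrative value only); in practice
# these would be derived from an external semantic resource.
TERM_SIM = {
    ("car", "automobile"): 0.9,
}


def term_similarity(t1, t2):
    """Symmetric term similarity: 1.0 for identical terms, table lookup otherwise."""
    if t1 == t2:
        return 1.0
    return TERM_SIM.get((t1, t2), TERM_SIM.get((t2, t1), 0.0))


def semantic_kernel(doc1, doc2):
    """Sum of w1 * w2 * sim(t1, t2) over all term pairs of two weighted bags of words."""
    return sum(
        w1 * w2 * term_similarity(t1, t2)
        for t1, w1 in doc1.items()
        for t2, w2 in doc2.items()
    )


def normalized_kernel(doc1, doc2):
    """Cosine-style normalization, so a non-empty document scores 1.0 against itself."""
    denom = sqrt(semantic_kernel(doc1, doc1) * semantic_kernel(doc2, doc2))
    return semantic_kernel(doc1, doc2) / denom if denom else 0.0


# Toy documents that share no surface terms.
d1 = {"car": 1.0}
d2 = {"automobile": 1.0}
d3 = {"bank": 1.0}

print(semantic_kernel(d1, d2))  # non-zero, because the two terms are marked as related
print(semantic_kernel(d1, d3))  # zero, because no relation is recorded
```

A plain bag-of-words dot product would score `d1` against `d2` as zero because they share no terms; the term-similarity lookup is what lets related documents match. A function of this shape is a valid kernel only when the underlying term-similarity matrix is positive semi-definite, which holds for this toy table but would need to be checked for a real resource.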



Author information

Authors and Affiliations

Computer Science Department, University of Rome “Tor Vergata”, 00133, Roma, Italy
Roberto Basili, Marco Cammisa & Alessandro Moschitti


Editor information

Editors and Affiliations

Research Center on Complex Systems and Artificial Intelligence (CSAI), Department of Computer Science, Systems and Communication (DISCo), University of Milano–Bicocca, Viale Sarca 336, 20126 Milan, Italy
Stefania Bandini
CSAI - Complex Systems & Artificial Intelligence Research Centre, University of Milano–Bicocca
Sara Manzoni


About this paper

Cite this paper

Basili, R., Cammisa, M., Moschitti, A. (2005). A Semantic Kernel to Exploit Linguistic Knowledge. In: Bandini, S., Manzoni, S. (eds) AI*IA 2005: Advances in Artificial Intelligence. AI*IA 2005. Lecture Notes in Computer Science (LNAI), vol 3673. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11558590_30


DOI: https://doi.org/10.1007/11558590_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29041-4
Online ISBN: 978-3-540-31733-3
eBook Packages: Computer Science, Computer Science (R0)
