A Semantic Kernel to Exploit Linguistic Knowledge

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3673)

Abstract

Improving accuracy in Information Retrieval tasks via semantic information is a complex problem characterized by three main aspects: the document representation model, the similarity estimation metric and the inductive algorithm. In this paper an original kernel function sensitive to external semantic knowledge is defined as a document similarity model. This semantic kernel was tested over a text categorization task, under critical learning conditions (i.e. poor training data). The results of cross-validation experiments suggest that the proposed kernel function can be used as a general model of document similarity for IR tasks.
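
As a rough illustration of what a document similarity kernel "sensitive to external semantic knowledge" can look like, the sketch below sums a term-to-term similarity over all term pairs of two documents, scaled by the term weights. It is not the kernel defined in the paper, whose formulation is not given in the abstract. The names `TERM_SIMILARITY`, `term_similarity`, `semantic_kernel` and `bag_of_words_kernel`, and all similarity scores, are hypothetical. Such a function is a valid kernel only if the underlying term similarity matrix is positive semi-definite, which the sketch does not check.

```python
from itertools import product

# Hypothetical term-to-term similarity scores standing in for an external
# semantic resource (for example, a WordNet-based measure). Values are made up.
TERM_SIMILARITY = {
    ("car", "automobile"): 0.9,
    ("car", "vehicle"): 0.6,
    ("automobile", "vehicle"): 0.6,
}


def term_similarity(a, b):
    """Symmetric lookup: identical terms score 1.0, unknown pairs score 0.0."""
    if a == b:
        return 1.0
    return TERM_SIMILARITY.get((a, b), TERM_SIMILARITY.get((b, a), 0.0))


def semantic_kernel(doc_a, doc_b):
    """Sum of weight_a * weight_b * similarity over all term pairs.

    Each document is a dict mapping a term to its weight.
    """
    return sum(
        wa * wb * term_similarity(ta, tb)
        for (ta, wa), (tb, wb) in product(doc_a.items(), doc_b.items())
    )


def bag_of_words_kernel(doc_a, doc_b):
    """Plain dot product: only exactly matching terms contribute."""
    return sum(w * doc_b.get(t, 0.0) for t, w in doc_a.items())


doc_a = {"car": 1.0}
doc_b = {"automobile": 1.0}
print(bag_of_words_kernel(doc_a, doc_b))  # no shared terms
print(semantic_kernel(doc_a, doc_b))      # related terms still contribute
```

The comparison with the plain dot product shows the intended effect: two documents with no terms in common can still receive a non-zero similarity when their terms are related in the external resource, which is what matters when training data is scarce.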

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Basili, R., Cammisa, M., Moschitti, A. (2005). A Semantic Kernel to Exploit Linguistic Knowledge. In: Bandini, S., Manzoni, S. (eds) AI*IA 2005: Advances in Artificial Intelligence. AI*IA 2005. Lecture Notes in Computer Science, vol 3673. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11558590_30

  • DOI: https://doi.org/10.1007/11558590_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29041-4

  • Online ISBN: 978-3-540-31733-3

  • eBook Packages: Computer Science, Computer Science (R0)
