Improving User Modelling with Content-Based Techniques

  • Bernardo Magnini
  • Carlo Strapparava
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2109)


SiteIF is a personal agent for a bilingual news web site that learns the user’s interests from the requested pages.

In this paper we propose to use a content-based document representation as a starting point for building a model of the user’s interests. Documents the user browses are processed, and relevant senses (disambiguated over WORDNET) are extracted and combined to form a semantic network. A filtering procedure then dynamically predicts new documents of interest on the basis of the semantic network.
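As a rough illustration of the idea, and not the SiteIF implementation, the sketch below keeps a small network whose nodes are senses and whose arcs link senses found in the same document, then scores an unseen document against it. The class name `SenseNetwork`, the unit weight increments, the averaging in `score`, and the sense labels such as `bank#finance` are all assumptions made for this example; sense disambiguation is assumed to have been done beforehand.

```python
from collections import defaultdict
from itertools import combinations


class SenseNetwork:
    """Toy user model: nodes are sense ids, arcs link senses seen together."""

    def __init__(self):
        self.node_weight = defaultdict(float)  # sense id -> accumulated weight
        self.arc_weight = defaultdict(float)   # (sense a, sense b) -> weight

    def update(self, senses):
        """Reinforce the senses of one browsed document and their co-occurrences."""
        unique = sorted(set(senses))
        for sense in unique:
            self.node_weight[sense] += 1.0
        # Sorting first keeps each arc key in a single canonical order.
        for pair in combinations(unique, 2):
            self.arc_weight[pair] += 1.0

    def score(self, senses):
        """Matched node and arc weights, averaged over the document's senses."""
        unique = sorted(set(senses))
        if not unique:
            return 0.0
        total = sum(self.node_weight.get(s, 0.0) for s in unique)
        total += sum(self.arc_weight.get(p, 0.0) for p in combinations(unique, 2))
        return total / len(unique)


# Hypothetical sense labels; a real system would use WORDNET synset identifiers.
model = SenseNetwork()
model.update(["bank#finance", "loan#finance", "interest#finance"])
model.update(["bank#finance", "market#finance"])

on_topic = model.score(["bank#finance", "loan#finance"])
off_topic = model.score(["bank#river", "water#liquid"])
```

Here the document about financial banking scores higher than the one about river banks, even though both contain the word "bank", which is the kind of distinction a word-based model cannot make.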

There are two main advantages of a content-based approach: first, the model predictions, being based on senses rather than words, are more accurate; second, the model is language independent, allowing navigation in multilingual sites. We report the results of a comparative experiment that has been carried out to give a quantitative estimation of these improvements.
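The language-independence claim can be pictured with a minimal sketch, assuming a tiny hand-made lexicon in which words from two languages point to shared sense identifiers. The `LEXICON` table, its sense labels, and the choice of English and Italian are hypothetical and serve only as illustration; each word is given a single sense, so no disambiguation step is shown.

```python
# Hypothetical bilingual lexicon: (language, word) -> shared sense id.
LEXICON = {
    ("en", "bank"): "sense:financial_institution",
    ("it", "banca"): "sense:financial_institution",
    ("en", "loan"): "sense:loan",
    ("it", "prestito"): "sense:loan",
}


def to_senses(lang, words):
    """Map the words of one document to the set of sense ids they denote."""
    return {LEXICON[(lang, w)] for w in words if (lang, w) in LEXICON}


def jaccard(a, b):
    """Overlap between two collections, from 0.0 (disjoint) to 1.0 (identical)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


english_doc = ["bank", "loan"]
italian_doc = ["banca", "prestito"]

# Compared as words, the two documents share nothing.
word_overlap = jaccard(english_doc, italian_doc)

# Compared as senses, they describe the same content.
sense_overlap = jaccard(to_senses("en", english_doc), to_senses("it", italian_doc))
```

A user model built from senses can therefore match pages in either language, whereas one built from surface words would treat the two documents as unrelated.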


Content-Based User Modelling; Natural Language Processing; WORDNET





Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Bernardo Magnini (1)
  • Carlo Strapparava (1)
  1. ITC-irst, Istituto per la Ricerca Scientifica e Tecnologica, Trento, Italy
