A content-collaborative recommender that exploits WordNet-based user profiles for neighborhood formation

  • Original Paper
User Modeling and User-Adapted Interaction

Abstract

Collaborative and content-based filtering are the most widely adopted recommendation techniques to date. Traditional collaborative approaches compute a similarity value between the current user and every other user by taking into account their rating style, that is, the set of ratings given on the same items. Collaborative algorithms then compute recommendations for the current user from the ratings of the most similar users, commonly referred to as neighbors. The problem with this approach is that the similarity value can be computed only if users have rated items in common. The main contribution of this work is a possible solution to this limitation. We propose a new content-collaborative hybrid recommender that computes similarities between users from their content-based profiles, in which user preferences are stored, instead of comparing their rating styles. In more detail, user profiles are clustered to discover the neighbors of the current user. Content-based user profiles play a key role in the proposed hybrid recommender. Traditional keyword-based approaches to user profiling are unable to capture the semantics of user interests. A distinctive feature of our work is the integration of linguistic knowledge into the process of learning semantic user profiles, which, thanks to sense-based indexing, represent user interests more effectively than classical keyword-based profiles. Semantic profiles are obtained by integrating machine learning algorithms for text categorization, namely a naïve Bayes approach and a relevance feedback method, with a word sense disambiguation strategy based exclusively on the lexical knowledge stored in the WordNet lexical database. Experiments carried out on a content-based extension of the EachMovie dataset show that sense-based profiles are more accurate than keyword-based ones at classifying movies as interesting (or not) for the current user. An experimental session was also performed to evaluate the proposed hybrid recommender system. The results highlight the improvement in the predictive accuracy of collaborative recommendations obtained by selecting like-minded users according to user profiles.
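
The limitation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract states that profiles are clustered to find neighbors, whereas this sketch substitutes a plain cosine similarity between profiles to keep the example short. The function names, the toy ratings, and the sense identifiers used as profile keys (written in a WordNet-like style) are all hypothetical.

```python
from math import sqrt


def pearson_on_corated(ratings_a, ratings_b):
    """Pearson correlation over co-rated items; None when it is undefined."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        # Too few items rated by both users: the rating-based
        # similarity cannot be computed at all.
        return None
    xs = [ratings_a[item] for item in common]
    ys = [ratings_b[item] for item in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mean_x) ** 2 for x in xs)
               * sum((y - mean_y) ** 2 for y in ys))
    return num / den if den else None


def cosine(profile_a, profile_b):
    """Cosine similarity between two sparse feature-weight profiles."""
    dot = sum(w * profile_b.get(f, 0.0) for f, w in profile_a.items())
    norm_a = sqrt(sum(w * w for w in profile_a.values()))
    norm_b = sqrt(sum(w * w for w in profile_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical users who rated disjoint sets of movies.
alice_ratings = {"movie_1": 5, "movie_2": 4}
bob_ratings = {"movie_3": 5, "movie_4": 2}

# Hypothetical content-based profiles: weights over sense-like features
# (assumed here to come from some earlier profile-learning step).
alice_profile = {"thriller.n.01": 0.8, "detective.n.01": 0.6}
bob_profile = {"thriller.n.01": 0.6, "detective.n.01": 0.8}

rating_similarity = pearson_on_corated(alice_ratings, bob_ratings)
profile_similarity = cosine(alice_profile, bob_profile)
```

Here the two users share no rated movie, so the rating-based similarity is undefined (`None`), while the profile-based similarity is still computable (about 0.96 for these toy weights). That is the property the proposed hybrid relies on when it selects neighbors from profiles rather than from rating overlap.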

Author information

Corresponding author

Correspondence to Marco Degemmis.

About this article

Cite this article

Degemmis, M., Lops, P. & Semeraro, G. A content-collaborative recommender that exploits WordNet-based user profiles for neighborhood formation. User Model User-Adap Inter 17, 217–255 (2007). https://doi.org/10.1007/s11257-006-9023-4

  • DOI: https://doi.org/10.1007/s11257-006-9023-4
