Search personalization through query and page topical analysis

  • Original Paper
User Modeling and User-Adapted Interaction

Abstract

Thousands of users issue keyword queries to Web search engines to find information on a number of topics. Because users have diverse backgrounds and may expect different results for the same query, some search engines try to personalize their results to better match the overall interests of each individual user. This task involves two major challenges. First, the search engine must effectively identify the user’s interests and build a profile for every individual user. Second, once such a profile is available, the search engine must rank the results in a way that matches the interests of that user. In this article, we present our work towards a personalized Web search engine and discuss how we addressed each of these challenges. For the first challenge, since users are typically unwilling to provide information about their personal preferences, we attempt to determine such preferences by examining the click history of each user. In particular, we leverage a topical ontology to estimate a user’s topic preferences based on her past searches, i.e., previously issued queries and the pages visited for those queries. We then explore the semantic similarity between the user’s current query and the query-matching pages in order to identify the user’s current topic preference. For the second challenge, we have developed a ranking function that uses the learned past and current topic preferences to rank the search results so that they better match the preferences of a given user. Our experimental evaluation on the Google query stream of human subjects over a period of one month shows that user preferences can be learned accurately through the use of our topical ontology, and that our ranking function, which takes the learned user preferences into account, yields significant improvements in the quality of the search results.
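The two-step idea in the abstract (learn topic preferences from past clicks, then rank results using both past and current preferences) can be illustrated with a minimal sketch. This is not the authors' ranking function, which the abstract does not specify. The helper names `topic_distribution` and `rerank`, the linear blend, the weight `alpha`, and the toy topic labels are all assumptions made for illustration. The sketch also assumes each page has already been mapped to a single topic, and it uses the topic mix of the query-matching pages as a simple stand-in for the semantic similarity analysis the abstract describes.

```python
from collections import Counter


def topic_distribution(topic_labels):
    """Normalize a list of topic labels into a {topic: share} dict."""
    counts = Counter(topic_labels)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def rerank(results, past_pref, current_pref, alpha=0.5):
    """Order results by a blend of past and current topic preference.

    results: list of (url, topic) pairs in the engine's original order.
    alpha:   hypothetical weight on the long-term profile (assumption).
    """
    def score(item):
        _, topic = item
        past = past_pref.get(topic, 0.0)
        current = current_pref.get(topic, 0.0)
        return alpha * past + (1 - alpha) * current

    # sorted() is stable, so ties keep the engine's original order.
    return sorted(results, key=score, reverse=True)


# Toy click history: topics of pages the user visited for past queries.
clicked_topics = ["Sports", "Sports", "Travel", "Sports"]
past_pref = topic_distribution(clicked_topics)

# Toy result list for an ambiguous query such as "jaguar".
results = [
    ("a.example/jaguar-cars", "Autos"),
    ("b.example/jaguar-team", "Sports"),
    ("c.example/jaguar-zoo", "Animals"),
    ("d.example/jaguar-scores", "Sports"),
]
# Stand-in for the current topic preference: topic mix of matching pages.
current_pref = topic_distribution([topic for _, topic in results])

ranked = rerank(results, past_pref, current_pref)
for url, topic in ranked:
    print(url, topic)
```

In this toy run the two Sports pages move to the top because the click history is dominated by Sports, while the remaining pages keep their original relative order. A real system would replace the single-topic labels with ontology-based topic assignments and would tune or learn the blend rather than fix `alpha` by hand.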

Author information

Corresponding author

Correspondence to Sofia Stamou.

About this article

Cite this article

Stamou, S., Ntoulas, A. Search personalization through query and page topical analysis. User Model User-Adap Inter 19, 5–33 (2009). https://doi.org/10.1007/s11257-008-9056-y

  • DOI: https://doi.org/10.1007/s11257-008-9056-y
