An automatic methodology to evaluate personalized information retrieval systems

  • Original Paper
  • Published in: User Modeling and User-Adapted Interaction

Abstract

Due to the information overload we face nowadays, personalization services are becoming almost essential for finding relevant information tailored to each individual or to groups of people with common interests. It is therefore very important to be able to build efficient and robust personalization techniques as part of these services. Evaluation is a crucial stage in their development and improvement, and much more research is needed on this issue. We have proposed an automatic evaluation methodology for personalized information retrieval systems (ASPIRE), which combines the advantages of system-centred (repeatable, comparable and generalizable results) and user-centred (taking the user into account) evaluation approaches, and makes the evaluation process easy and fast. Its reliability and robustness have been assessed by means of a user-oriented evaluation. ASPIRE may be considered an interesting alternative to costly and difficult user studies, able to discriminate between different personalization techniques or between different parameter configurations of a given personalization method.

Notes

  1. A Kendall τ correlation always below 0.5.

  2. Focusing on XML information retrieval requires adapting some search engine components. For example, the retrievable elements are not only complete documents but also document components (called structural units), which may overlap. However, this does not pose any problem when using ASPIRE.

  3. Lucene is a popular open-source search library. It provides indexing and search technology and is used by many applications all over the world, ranging from mobile devices to sites such as Twitter, Apple and Wikipedia. This search engine is designed to work with plain (non-structured) documents (http://lucene.apache.org/).

  4. It should be noted that the user did not judge whether a given retrieved result was the best possible one, but only whether or not its content was relevant to the given query and profile (binary assessments).

  5. The source of the problem is the limitation of judging only the first 50 results retrieved by the IRS; this restriction was necessary because evaluating a larger number of results would require too much time and effort from the users.

  6. As Hard reranking only considers the list of results of the original non-personalized query, it does not introduce any relevance assessments that are not present in the original results list.

  7. The correlation values in this case are greater than those in Fig. 2 because here we correlate the averaged NDCG values for ASPIRE and the user study, rather than the underlying and more diverse 126 evaluation triplets of each of these combinations (a sketch of how such values can be computed follows these notes).
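
Notes 1 and 7 refer to Kendall τ correlations between system rankings and to NDCG values averaged over evaluation triplets. The following Python sketch, not the authors' code, illustrates how such quantities are typically computed with NumPy and SciPy; the configuration names and relevance judgments below are hypothetical.

```python
import numpy as np
from scipy.stats import kendalltau


def ndcg_at_k(relevances, k=10):
    """NDCG@k for a single ranked result list with binary relevance judgments."""
    rels = np.asarray(relevances, dtype=float)[:k]
    discounts = 1.0 / np.log2(np.arange(2, rels.size + 2))
    dcg = float(np.sum(rels * discounts))
    idcg = float(np.sum(np.sort(rels)[::-1] * discounts))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical binary judgments per evaluation triplet for three
# personalization configurations; the numbers are illustrative only.
aspire_judgments = {
    "config_a": [[1, 0, 1, 0], [1, 1, 0, 0], [0, 1, 1, 0]],
    "config_b": [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0]],
    "config_c": [[1, 1, 1, 0], [1, 1, 0, 1], [1, 0, 1, 1]],
}
user_judgments = {
    "config_a": [[1, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1]],
    "config_b": [[0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],
    "config_c": [[1, 1, 0, 1], [1, 0, 1, 1], [1, 1, 1, 0]],
}

# Average NDCG over the triplets of each configuration (cf. Note 7).
aspire_avg = {c: np.mean([ndcg_at_k(r) for r in runs])
              for c, runs in aspire_judgments.items()}
user_avg = {c: np.mean([ndcg_at_k(r) for r in runs])
            for c, runs in user_judgments.items()}

# Kendall tau between the two rankings of configurations (cf. Note 1).
configs = sorted(aspire_avg)
tau, p_value = kendalltau([aspire_avg[c] for c in configs],
                          [user_avg[c] for c in configs])
print(f"Kendall tau between ASPIRE and user-study rankings: {tau:.3f}")
```

With real data, each inner list would hold the binary assessments for the top results returned for one evaluation triplet, and the correlation would compare the ranking of configurations produced by ASPIRE with the one produced by the user study.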

Acknowledgments

This paper has been supported by the Spanish “Consejería de Innovación, Ciencia y Empresa de la Junta de Andalucía” and the “Ministerio de Ciencia e Innovación” under the Projects P09-TIC-4526 and TIN2011-28538-C02-02, respectively.

Author information

Corresponding author

Correspondence to Luis M. de Campos.

About this article

Cite this article

Vicente-López, E., de Campos, L.M., Fernández-Luna, J.M. et al. An automatic methodology to evaluate personalized information retrieval systems. User Model User-Adap Inter 25, 1–37 (2015). https://doi.org/10.1007/s11257-014-9148-9
