User Modeling and User-Adapted Interaction

Volume 22, Issue 4–5, pp 399–439

Evaluating the effectiveness of explanations for recommender systems

Methodological issues and empirical studies on the impact of personalization
Original Paper

Abstract

When recommender systems present items, these can be accompanied by explanatory information. Such explanations can serve seven aims: effectiveness, satisfaction, transparency, scrutability, trust, persuasiveness, and efficiency. These aims can be incompatible, so any evaluation needs to state which aim is being investigated and use appropriate metrics. This paper focuses particularly on effectiveness (helping users to make good decisions) and its trade-off with satisfaction. It provides an overview of existing work on evaluating effectiveness and the metrics used. It also highlights the limitations of the existing effectiveness metrics, in particular the effects of under- and overestimation and of the recommendation domain. In addition to this methodological contribution, the paper presents four empirical studies in two domains: movies and cameras. These studies investigate the impact of personalizing simple feature-based explanations on effectiveness and satisfaction. Both approximated and real effectiveness are investigated. Contrary to expectation, personalization was detrimental to effectiveness, though it may improve user satisfaction. The studies also highlighted the importance of considering opt-out rates and the underlying rating distribution when evaluating effectiveness.

Keywords

Recommender systems, Metrics, Item descriptions, Explanations, Empirical studies

Copyright information

© Springer Science+Business Media B.V. 2012

Authors and Affiliations

  1. Department of Computing Science, University of Aberdeen, Aberdeen, Scotland, UK
