Journal of Intelligent Information Systems, Volume 46, Issue 2, pp 285–312

Combining similarity and sentiment in opinion mining for product recommendation

  • Ruihai Dong
  • Michael P. O’Mahony
  • Markus Schaal
  • Kevin McCarthy
  • Barry Smyth
Article

Abstract

In the world of recommender systems, so-called content-based methods are an important approach that relies on the availability of detailed product or item descriptions to drive the recommendation process. For example, recommendations can be generated for a target user by selecting unseen products that are similar to the products that the target user has liked or purchased in the past. To do this, content-based methods must be able to compute the similarity between pairs of products (unseen products and liked products, for example), and typically this is achieved by comparing product features or other descriptive elements. The approach works well when product descriptions are readily available and when they are detailed enough to afford an effective similarity comparison. But this is not always the case. Detailed product descriptions may not be available since they can be expensive to create and maintain. In this article, we consider another source of product descriptions in the form of the user-generated reviews that frequently accompany products on the web. We ask whether it is possible to mine these reviews, unstructured and noisy as they are, to produce useful product descriptions that can be used in a recommendation system. In particular, we describe a novel approach to product recommendation that harnesses not only the features that can be mined from user-generated reviews but also the expressions of sentiment that are associated with these features. We present a recommendation ranking strategy that combines similarity and sentiment to suggest products that are similar but superior to a query product according to the opinion of reviewers, and we demonstrate the practical benefits of this approach across a variety of Amazon product domains.
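
To make the idea of ranking by similarity and sentiment together more concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the abstract does not give the scoring formula, so the cosine similarity, the mean per-feature sentiment difference, the linear weighting parameter `w`, and all product names and numbers below are assumptions chosen purely for illustration.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two sparse feature-weight dicts."""
    shared = set(a) & set(b)
    dot = sum(a[f] * b[f] for f in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def sentiment_advantage(query_sent, cand_sent):
    """Mean sentiment gain of the candidate over the query on shared features.

    Assumes per-feature sentiment lies in [-1, 1], so the result is in [-1, 1].
    """
    shared = set(query_sent) & set(cand_sent)
    if not shared:
        return 0.0
    return sum((cand_sent[f] - query_sent[f]) / 2 for f in shared) / len(shared)

def score(query, cand, w=0.5):
    """Hypothetical blend: (1 - w) * similarity + w * rescaled sentiment advantage."""
    sim = cosine_similarity(query["features"], cand["features"])
    adv = sentiment_advantage(query["sentiment"], cand["sentiment"])
    return (1 - w) * sim + w * (adv + 1) / 2

def recommend(query, candidates, w=0.5, k=3):
    """Return the names of the top-k candidates by blended score."""
    ranked = sorted(candidates, key=lambda c: score(query, c, w), reverse=True)
    return [c["name"] for c in ranked[:k]]

# Made-up products: feature weights and sentiment as if mined from reviews.
query = {"name": "camera_q",
         "features": {"battery": 0.6, "lens": 0.8},
         "sentiment": {"battery": -0.2, "lens": 0.4}}
camera_a = {"name": "camera_a",  # same features, better reviewed
            "features": {"battery": 0.6, "lens": 0.8},
            "sentiment": {"battery": 0.6, "lens": 0.8}}
camera_b = {"name": "camera_b",  # same features, worse reviewed
            "features": {"battery": 0.6, "lens": 0.8},
            "sentiment": {"battery": -0.6, "lens": 0.0}}
camera_c = {"name": "camera_c",  # no features in common with the query
            "features": {"zoom": 1.0},
            "sentiment": {"zoom": 0.9}}

top = recommend(query, [camera_c, camera_b, camera_a], w=0.5)
print(top)
```

In this sketch, `w = 0` would rank purely by similarity and `w = 1` purely by sentiment advantage; intermediate values trade one off against the other, which is one plausible reading of "similar but superior".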

Keywords

User-generated Reviews · Opinion Mining · Sentiment-based Product Recommendation

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Ruihai Dong (1), email author
  • Michael P. O’Mahony (1)
  • Markus Schaal (1)
  • Kevin McCarthy (2)
  • Barry Smyth (2)

  1. CLARITY: Centre for Sensor Web Technologies, School of Computer Science and Informatics, University College Dublin, Dublin, Ireland
  2. Insight Centre for Data Analytics, School of Computer Science and Informatics, University College Dublin, Dublin, Ireland