
The Impact of Profile Coherence on Recommendation Performance for Shared Accounts on Smart TVs

  • Tao Lian
  • Zhengxian Li
  • Zhumin Chen
  • Jun Ma
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10390)

Abstract

Most recommendation algorithms assume that an account represents a single user, and capture a user's interest by what he/she has preferred. However, in some applications, e.g., video recommendation on smart TVs, an account is often shared by multiple users who tend to have disparate interests. This poses great challenges for delivering personalized recommendations. In this paper, we propose the concept of profile coherence to measure the coherence of an account's interests; in our implementation, it is computed as the average similarity between items in the account profile. Furthermore, we evaluate the impact of profile coherence on the quality of the recommendation lists generated for coherent and incoherent accounts by different variants of item-based collaborative filtering. Experiments conducted on a large-scale watch log from smart TVs confirm that profile coherence indeed impacts the quality of recommendation lists in various aspects: accuracy, diversity and popularity.

Keywords

Profile coherence · Shared account · Recommendation performance · Collaborative filtering · Smart TV

1 Introduction

Recommender systems [15] have become an essential tool to help us overcome the information overload problem by automatically identifying items that suit our interests, such as products, videos, music and social media accounts. Most recommendation algorithms assume that an account represents a single user, and capture a user’s interest by the items that are previously preferred by him/her. For example, item-based collaborative filtering (CF) provides users with recommendations that are similar to what they have already preferred.

However, in some applications, an account is often shared by multiple users. For instance, different people in a household usually watch videos on the same smart TV. Thus the observations are the mixed behavior of multiple users. What is worse, their interests may be disparate or even conflict with each other. Take for example a household where three generations live together: the children love animations; the father likes sports while the mother likes variety shows; the grandparents prefer TV dramas. Therefore, it is challenging to provide personalized recommendations for shared accounts.

There are two major problems in the presence of shared accounts [20]. (i) The dominance problem arises when almost all recommendations are relevant to only some users in a shared account, while at least one user does not get any relevant recommendation. (ii) The generality problem arises when the recommendations are only slightly relevant to all users in a shared account but not appealing to any of them. When the diverse interests of multiple users are mixed together, the recommendations tend to consist of overly general or popular items that are merely tolerable for most people. Therefore, it seems reasonable to conjecture that the composition of the account profile should influence the recommendation performance in various aspects such as accuracy, diversity and popularity.

In this paper, we propose the concept of profile coherence to measure the coherence of an account's interests, which is computed as the average similarity between items in the account profile. We conjecture that the profile coherence should have an impact on the recommendation performance, especially in applications where an account is shared by multiple users. Let us make an analogy between recommender systems and search engines. The profile coherence of an account is like the clarity of a query [2]. The less clear the query, the worse the retrieval results. In a similar vein, the less coherent the account profile, the worse the recommendation results. Though it is possible that a user has a broad interest in traditional settings where an account represents a single user, the problem is more severe in the presence of shared accounts. Knowing when a recommender system performs worse can shed light on possible avenues to improve it. Therefore, we are interested in the question: how does profile coherence impact the quality of the recommendation lists generated for coherent and incoherent accounts by collaborative filtering algorithms, in various aspects such as accuracy, diversity and popularity?

The contributions of this paper are summarized as follows:
  • We notice an important peculiarity of video viewing behavior on smart TVs—an account is shared by multiple users in a household.

  • We propose the concept of profile coherence to measure the coherence of an account’s interests in order to discriminate between coherent and incoherent accounts.

  • We formulate four different variants of item-based CF that differ in the neighbor selection policy and the similarity aggregation function.

  • We evaluate the impact of profile coherence on the quality of recommendation lists for coherent and incoherent accounts generated by different variants of item-based CF in various aspects—accuracy, diversity and popularity.

2 Related Work

2.1 Collaborative Filtering

A well-known class of recommendation algorithms is collaborative filtering [4], which can be further classified into neighborhood-based methods (e.g., user-based CF [9] and item-based CF [3, 12, 14, 17]) and model-based methods (e.g., matrix factorization [10, 13]), among others. Some of them [9, 13, 17] perform rating prediction on explicit feedback datasets, while others [3, 10, 12, 14] perform top-N recommendation on implicit feedback datasets. They all capture a user's preferences based on the items already preferred by him/her.

2.2 Performance Variation

Some scholars attempt to explain this performance variation by the characteristics of user profiles, and further predict how well or poorly the recommender system would perform for a given user. For explicit feedback datasets, the following characteristics of a user's rating profile are found to influence the performance of collaborative filtering algorithms to different extents, depending on the dataset [8]: the number/popularity/quality of rated items, the standard deviation of provided ratings, the quality of the neighborhood, etc.

It is well acknowledged that collaborative filtering suffers from the cold-start problem: users with only a few consumed items cannot get accurate recommendations. But some users with quite a few consumed items still get inaccurate recommendations. They are referred to as gray sheep users [1, 18], whose preferences do not consistently agree or disagree with those of any group of users. Several methods have been developed to identify gray sheep users prior to the recommendation process [6, 7]; they are then handled by other techniques such as content-based methods [5].

Recently, Saia et al. [16] proposed a semantic approach to remove incoherent items from a user's profile in order to improve recommendation accuracy. This is reasonable in conventional settings where an account represents a single user, though some users may have broad interests. However, it is of little use in settings where an account is shared by multiple users. When the behavior of multiple users is mixed together, the observations are likely to be widely scattered in the item space. If we remove some items from the account's profile, some users of the shared account might not receive customized recommendations.

2.3 Shared Account Recommendation vs. Group Recommendation

Verstrepen and Goethals [20] were the first to tackle the challenge of recommendation for shared accounts in the absence of contextual information. Shared account recommendation is different from group recommendation. (i) The individual profiles of the users in the group are typically known in group recommendation, whereas they are unknown in shared account recommendation. In the case of recommending videos on smart TVs, we do not know how many people are living in a household, let alone their individual preferences. (ii) In group recommendation, the recommendations will be consumed by all users in the group. But in shared account recommendation, the recommendations are supposed to be consumed individually, and every user in the shared account should be able to identify the recommendations meant for him/her. For example, when recommending videos on smart TVs, a video is qualified if it matches the interest of any single user in the shared account rather than their common interests, but there should be at least one video recommendation for each of them.

3 Our Work

In this paper, we aim to investigate the impact of profile coherence on the recommendation performance of collaborative filtering algorithms. We choose item-based CF as the experimental algorithm, since it is currently employed in the Hisense Cloud Platform, from which our data comes. It generates recommendations for an account by finding other items similar to those previously consumed by the account. Thus, we conjecture that the profile coherence of an account will influence the quality of the recommendation list generated by item-based CF.1

Table 1 lists the notations frequently used in this paper. We consider the top-N recommendation task based on positive-only implicit feedback. Throughout the paper, an item corresponds to a video, and an account denotes a smart TV, which is shared by multiple users in a household. If a video has been played on a smart TV, we say that the account has consumed the corresponding item.
Table 1. Notations

  \(\mathcal {A}\): the set of accounts
  \(\mathcal {I}\): the set of items
  \(\varvec{P} \in \left\{ 0, 1\right\} ^{|\mathcal {A} |\times |\mathcal {I} |} \): the preference matrix
  a: an account
  i, j: two items
  \(p_{a, i}\): \(p_{a, i} = 1\) if the account a has consumed the item i, otherwise \(p_{a, i} = 0\)
  \(\mathcal {I}_a = \left\{ i \in \mathcal {I} \mid p_{a, i} = 1 \right\} \): the set of items consumed by the account a
  \(\mathcal {A}_i = \left\{ a \in \mathcal {A} \mid p_{a, i} = 1 \right\} \): the set of accounts that have consumed the item i
  \(sim \left( i, j \right) \): the similarity between the items i and j
  \(\mathrm {KNN} \left( i \right) \): the K most similar items of the item i
  \(r\left( a, i \right) \): the predicted ranking score of the item i for the account a
  \(\mathcal {L}_a\): the top-N recommendation list for the account a

3.1 Profile Coherence

If we treat an account in a recommender system as a query in a search engine, the more similar the items in its profile, the clearer the information need. We measure the profile coherence of an account by the average similarity of all pairs of items in its profile, defined as
$$\begin{aligned} coh\left( a \right) = \frac{\sum _{i \in \mathcal {I}_a} \sum _{j \in \mathcal {I}_a \setminus \left\{ i \right\} } sim \left( i, j \right) }{|\mathcal {I}_a |* \left( |\mathcal {I}_a |- 1 \right) } \,. \end{aligned}$$
(1)
In applications with only implicit feedback, the similarity between items i and j can be measured by the binary cosine similarity2, given by
$$\begin{aligned} sim \left( i, j \right) = \frac{|\mathcal {A}_i \cap \mathcal {A}_j |}{\sqrt{|\mathcal {A}_i |} \sqrt{|\mathcal {A}_j |}} \,, \end{aligned}$$
(2)
where \(\mathcal {A}_i\) denotes the subset of accounts who have consumed the item i.
There are other alternatives for measuring the profile coherence of an account. For example, we can compute the entropy of the distribution of consumed items among predefined categories/genres if the metadata of items is available. But we adopt the similarity-based measure computed only from consumption history for the following reasons: (i) it is domain independent, without the need to collect other information, and (ii) collaborative filtering also relies only on the ratings or consumption history [4, 12, 17].
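To make the computation concrete, here is a minimal Python sketch of Eqs. (1) and (2) of our own; the data layout (consumption histories stored as sets) and function names are hypothetical, not the authors' implementation:

```python
from itertools import combinations
from math import sqrt

def cosine_sim(consumers_i, consumers_j):
    """Binary cosine similarity between two items, Eq. (2).

    Each argument is the set A_i of accounts that consumed the item."""
    if not consumers_i or not consumers_j:
        return 0.0
    return len(consumers_i & consumers_j) / (
        sqrt(len(consumers_i)) * sqrt(len(consumers_j)))

def profile_coherence(profile, item_consumers):
    """coh(a): average pairwise similarity of the items in a profile, Eq. (1).

    profile: set of item ids consumed by the account.
    item_consumers: dict mapping item id -> set of account ids."""
    items = list(profile)
    if len(items) < 2:
        return float("nan")  # Eq. (1) is undefined for profiles with < 2 items
    total = sum(cosine_sim(item_consumers[i], item_consumers[j])
                for i, j in combinations(items, 2))
    # combinations() visits each unordered pair once, while Eq. (1) sums over
    # ordered pairs, so double the sum before normalizing.
    return 2 * total / (len(items) * (len(items) - 1))
```

For example, if item i was watched by accounts {0, 1} and item j only by account 0, then sim(i, j) = 1/√2 ≈ 0.707, which is also the coherence of a profile containing exactly these two items.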
Fig. 1.

Profile coherence vs. profile size

The scatter plot in Fig. 1 shows the relationship between the profile coherence (i.e., \(coh\left( a \right) \)) and the profile size (i.e., \(|\mathcal {I}_a |\)). We can make several observations. (i) For accounts with a large profile size, the coherence scores are very low. This means that accounts that have consumed a large number of items are generally incoherent, probably because the items in their profiles are consumed by different users of the shared account. (ii) For accounts with a relatively small profile size, however, the coherence scores vary a lot. That is to say, their profiles can be coherent, incoherent, or in between. (iii) Incoherent accounts exist everywhere along the spectrum of profile size, since most households consist of more than one person, and different people tend to have different preferences.

Then we obtain two subsets of accounts at the two ends of the spectrum of profile coherence, in a manner similar to a box plot3:

Coherent Accounts: the 25% of accounts with the highest coherence scores;

Incoherent Accounts: the 25% of accounts with the lowest coherence scores.

An alternative is a threshold-based method, but it is hard to determine appropriate thresholds. What matters is the relative magnitude of the coherence scores rather than their absolute values. In a similar vein, Gras et al. [7] considered the \(x\%\) of users with the highest abnormality scores as atypical users whose ratings tend to differ from the community.
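The quartile split described above can be sketched as follows (a hypothetical helper of our own, not from the paper):

```python
def split_by_coherence(coherence):
    """Split accounts into the 25% most and 25% least coherent.

    coherence: dict mapping account id -> coh(a).
    Returns (coherent_accounts, incoherent_accounts) as sets."""
    ranked = sorted(coherence, key=coherence.get, reverse=True)
    q = len(ranked) // 4  # size of each quartile
    return set(ranked[:q]), set(ranked[-q:])
```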

3.2 Variants of Item-Based CF

Item-based CF analyzes the user-item preference matrix to identify relations between items, and recommends items that are similar to what the account has already consumed. The general process of item-based top-N recommendation is listed in Algorithm 1.
It is of great importance how the ranking scores are predicted (line 6). Generally speaking, the ranking score of an item i for an account a is computed based on the similarities between the item i and items in the profile \(\mathcal {I}_a\):
$$\begin{aligned} s \left( a, i \right) = sim \left( \mathcal {I}_a, i \right) . \end{aligned}$$
(3)
However, there are many variants with regard to how to aggregate the similarities between the item i and items in \(\mathcal {I}_a\) to obtain the final ranking score \(s \left( a, i \right) \).
When the item-based CF algorithm was first proposed for predicting ratings on explicit feedback datasets [17], the predicted rating on the item i for the account a is given by
$$\begin{aligned} \hat{r}_{a, i} = \frac{\sum _{j \in \mathcal {I}_a} \mathbb {1}\left[ j \in \mathrm {KNN} \left( i \right) \right] sim \left( i, j \right) r_{a, j}}{\sum _{j \in \mathcal {I}_a} \mathbb {1}\left[ j \in \mathrm {KNN} \left( i \right) \right] sim \left( i, j \right) } \,, \end{aligned}$$
(4)
where \(\mathbb {1}\left[ \cdot \right] \) is the indicator function. The rating given by the account a on an item \(j \in \mathcal {I}_a\) contributes to the weighted average only if \(j \in \mathrm {KNN} \left( i \right) \). If there are no items in \(\mathcal {I}_a\) satisfying \(j \in \mathrm {KNN} \left( i \right) \), \(\hat{r}_{a,i}\) is predicted to be the global or account-specific mean. Later on, item-based CF was adapted to the top-N recommendation task on implicit feedback datasets [12]. The predicted ranking score of the item i for the account a is given by
$$\begin{aligned} s \left( a, i \right) = \sum _{j \in \mathcal {I}_a} \mathbb {1}\left[ i \in \mathrm {KNN} \left( j \right) \right] sim \left( j, i \right) \,, \end{aligned}$$
(5)
where the similarity between an item \(j \in \mathcal {I}_a\) and the item i contributes to the sum only if \(i \in \mathrm {KNN} \left( j \right) \). Thus, an item i can receive a non-zero score only if it is among the K most similar neighbors of at least one of the items in \(\mathcal {I}_a\); otherwise it is not a qualified candidate. This speeds up prediction by greatly reducing the set of candidate items, an approach also adopted by other works [3, 14, 20].
There are two points worthy of attention here: the neighbor selection policy and the similarity aggregation function. The neighbor selection policy determines which items in the profile contribute to the predicted ranking score of a candidate item, and also indirectly determines the subset of qualified candidate items. We refer to the policy \(i \in \mathrm {KNN} \left( j \right) \) as KNN, which is widely adopted in the top-N recommendation task on implicit feedback datasets, and the policy \(j \in \mathrm {KNN} \left( i \right) \) as iKNN, which denotes the inverted neighborhood [19] and is rarely adopted in the top-N recommendation task on implicit feedback datasets. The similarity aggregation function determines how to aggregate the similarities between a candidate item and the items in the profile selected by either policy. The two similarity aggregation functions are abbreviated to SUM and AVG. Thus, we can formulate four different variants of item-based CF for the top-N recommendation task on implicit feedback datasets.
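Under our reading of the text, the four variants (KNN/iKNN crossed with SUM/AVG) can be sketched as a single scoring function; `knn[i]` holds the K most similar items of i and `sim[(j, i)]` the pairwise similarity, and the names and data layout are hypothetical:

```python
def rank_score(i, profile, sim, knn, policy="KNN", agg="SUM"):
    """Predicted ranking score of candidate item i for an account.

    profile: items consumed by the account (I_a).
    sim: dict mapping (j, i) -> similarity sim(j, i).
    knn: dict mapping item -> set of its K most similar items.
    policy: "KNN" keeps j if i is in KNN(j); "iKNN" keeps j if j is in KNN(i).
    agg: "SUM" or "AVG" over the selected similarities."""
    if policy == "KNN":
        contrib = [sim[(j, i)] for j in profile if i in knn[j]]
    else:  # iKNN: inverted neighborhood
        contrib = [sim[(j, i)] for j in profile if j in knn[i]]
    if not contrib:
        return 0.0  # i is not a qualified candidate
    return sum(contrib) if agg == "SUM" else sum(contrib) / len(contrib)
```

A full recommender would score every candidate item this way and return the N items with the highest scores.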

4 Experiments

Dataset. We conduct our experiments on a large-scale watch log provided by a well-known smart TV manufacturer, Hisense. On Hisense smart TVs, users can stream a variety of videos via an app named JuHaoKan4. Each video is classified into one of the following categories: animation, movie, TV drama, sports, children's program, variety show, music, news, lifestyle, education, documentary, entertainment, autos, info, short film, and others. The watch log spans a four-month period from 2015-12-07 to 2016-04-24. In our experiments, we only include 10 000 relatively active accounts and the videos in the six categories (animation, movie, TV drama, sports, children's program, variety show) that receive 91.4% of total views. In addition, we remove videos that have been watched by fewer than 20 accounts. Note that different episodes of the same program are denoted by the same video ID. Table 2 shows the statistics of the final dataset.
Table 2. Dataset statistics

  Num of accounts: 10000
  Num of videos: 6747
  Max/avg/min num of videos per account: 847/178.7/21
  Max/avg/min num of accounts per video: 6535/264.9/20
  Data sparsity: 0.0265

Methodology. We perform 5-fold cross validation by randomly partitioning the observed entries in the preference matrix \(\varvec{P}\) into five folds. The top-N recommendation list (\(N = 10\) in our experiments) for each account is generated by the different variants of item-based CF based on the consumed items in four folds, and is evaluated against the consumed items in the remaining fold. We evaluate the quality of a recommendation list in three aspects: accuracy, diversity and popularity. The metrics are then averaged over the accounts. Finally, the results averaged over the five folds are reported.
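A minimal sketch of this protocol (our own illustration, not the paper's code) randomly partitions the observed (account, item) pairs into five folds and yields the five train/test splits:

```python
import random

def five_fold_splits(observed_pairs, seed=0):
    """Yield (train, test) splits of the observed (account, item) pairs."""
    pairs = list(observed_pairs)
    random.Random(seed).shuffle(pairs)  # fixed seed for reproducibility
    folds = [pairs[k::5] for k in range(5)]
    for held_out in range(5):
        train = [p for k, fold in enumerate(folds) if k != held_out for p in fold]
        yield train, folds[held_out]
```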

4.1 Experiment Results

Accuracy. We evaluate the accuracy of a top-N recommendation list \(\mathcal {L}_a\) by the fraction of relevant items, given by
$$\begin{aligned} precision \left( \mathcal {L}_a \right) = \frac{|\mathcal {L}_a \cap \mathcal {T}_a |}{|\mathcal {L}_a |} \,, \end{aligned}$$
(6)
where \(\mathcal {T}_a\) denotes the set of items consumed by the account a in the held-out test fold. Figure 2 shows the precision values for coherent and incoherent accounts achieved by different variants of item-based CF with various neighborhood sizes k. We can obtain a number of interesting insights.5 (i) Coherent accounts get more accurate recommendations than incoherent ones. (ii) Of the two neighbor selection policies, iKNN outperforms KNN in accuracy, except when \(k = 1, 5\) using SUM. (iii) Of the two similarity aggregation functions, SUM achieves more accurate recommendations than AVG. (iv) As the neighborhood size k increases, the four variants of item-based CF exhibit different variations. (v) They all level off after k exceeds 50. Thereafter, iKNN-SUM performs the best while KNN-AVG performs the worst.
Fig. 2.

Precision

Fig. 3.

Diversity

Diversity. We evaluate the diversity of a top-N recommendation list \(\mathcal {L}_a\) by the average dissimilarity of all pairs of recommended items [21], given by:
$$\begin{aligned} diversity \left( \mathcal {L}_a \right) = \frac{1}{|\mathcal {L}_a |* \left( |\mathcal {L}_a |- 1 \right) } {\sum _{i \in \mathcal {L}_a} \sum _{j \in \mathcal {L}_a \setminus \left\{ i \right\} } {1 - sim \left( i, j \right) }} . \end{aligned}$$
(7)
The diversity values of the recommendation lists for coherent and incoherent accounts generated by different variants of item-based CF are shown in Fig. 3. We can make the following observations. (i) If we adopt SUM as the similarity aggregation function (especially in combination with iKNN), incoherent accounts get more diverse recommendations than coherent ones; if we adopt AVG, incoherent accounts get slightly less diverse recommendations than coherent ones (except when \(k = 1\)). (ii) As k increases (except when \(k = 1\)), the diversity achieved by KNN-AVG increases whereas the diversity achieved by the other three variants decreases. (iii) Though KNN-AVG performs the worst in accuracy, KNN-AVG generates the most diverse recommendations.
Fig. 4.

Popularity

Popularity. We evaluate the popularity of a top-N recommendation list \(\mathcal {L}_a\) by the average popularity of recommended items, given by:
$$\begin{aligned} APop \left( \mathcal {L}_a \right) = \frac{1}{|\mathcal {L}_a |} {\sum _{i \in \mathcal {L}_a} \frac{|\mathcal {A}_i |}{|\mathcal {A} |}} , \end{aligned}$$
(8)
which is opposite to the novelty [11] of a recommendation list. Figure 4 illustrates how the average popularity of the recommendation list generated by different variants of item-based CF varies with respect to the neighborhood size. (i) Generally speaking, as k increases, the recommended items are more biased towards popular items, except in the case of KNN-AVG. (ii) On average, incoherent accounts get more popular recommendations than coherent ones, except in the case of iKNN-SUM.
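For reference, the three list-quality measures can be sketched as follows (our own illustration; `sim` is assumed to be a pairwise similarity function and `item_consumers[i]` the set \(\mathcal {A}_i\)):

```python
def precision(rec_list, relevant):
    """Fraction of recommended items that are relevant (consumed in the test fold)."""
    return sum(1 for i in rec_list if i in relevant) / len(rec_list)

def diversity(rec_list, sim):
    """Average pairwise dissimilarity of the recommended items, Eq. (7)."""
    n = len(rec_list)
    total = sum(1 - sim(i, j) for i in rec_list for j in rec_list if i != j)
    return total / (n * (n - 1))

def avg_popularity(rec_list, item_consumers, num_accounts):
    """Average fraction of accounts that consumed each recommended item, Eq. (8)."""
    return sum(len(item_consumers[i]) / num_accounts for i in rec_list) / len(rec_list)
```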

5 Conclusion and Future Work

In this paper, we identify a novel profile characteristic, profile coherence, that impacts the quality of recommendations on smart TVs, where an account is shared by multiple users. Experiments conducted on a large-scale watch log on smart TVs confirm that incoherent accounts indeed get less accurate recommendations. Moreover, the recommendation lists generated for coherent and incoherent accounts by different variants of item-based CF exhibit different characteristics in diversity and popularity. We believe our findings are especially valuable for practical applications, since many commercial recommender systems are item-based. The findings may provide guidance for tweaking recommender systems according to business goals.

In the future, we plan to conduct more extensive experiments to evaluate the impact of profile coherence on more advanced collaborative filtering algorithms. In addition, we want to compare the impact of profile coherence in applications where an account is typically shared by multiple users with that in applications where an account represents a single user. Most importantly, there is a pressing need to develop recommendation algorithms for shared accounts that can adaptively handle profile incoherence.

Footnotes

  1. The profile coherence is expected to influence the recommendation performance of other collaborative filtering algorithms too, and even of content-based filtering methods.

  2. We have also experimented with the Jaccard similarity, and the qualitative conclusions are the same.

  3.
  4.
  5. The qualitative conclusions are the same when we evaluate the accuracy of a recommendation list by other measures such as recall, MRR and MAP.

Notes

Acknowledgements

This work is supported by the Natural Science Foundation of China (61672322, 61672324), the Natural Science Foundation of Shandong Province (2016ZRE27468) and the Fundamental Research Funds of Shandong University. We also thank Hisense for providing us with a large-scale watch log on smart TVs.

References

  1. Claypool, M., Gokhale, A., Miranda, T., Murnikov, P., Netes, D., Sartin, M.: Combining content-based and collaborative filters in an online newspaper. In: Proceedings of ACM SIGIR Workshop on Recommender Systems (1999)
  2. Cronen-Townsend, S., Zhou, Y., Croft, W.B.: Predicting query performance. In: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 299–306 (2002)
  3. Deshpande, M., Karypis, G.: Item-based top-N recommendation algorithms. ACM Trans. Inform. Syst. 22(1), 143–177 (2004)
  4. Ekstrand, M.D., Riedl, J.T., Konstan, J.A.: Collaborative filtering recommender systems. Found. Trends Hum. Comput. Interact. 4(2), 81–173 (2011)
  5. Ghazanfar, M.A., Prügel-Bennett, A.: Leveraging clustering approaches to solve the gray-sheep users problem in recommender systems. Expert Syst. Appl. 41(7), 3261–3275 (2014)
  6. Gras, B., Brun, A., Boyer, A.: Identifying grey sheep users in collaborative filtering: a distribution-based technique. In: Proceedings of the 2016 Conference on User Modeling Adaptation and Personalization, pp. 17–26 (2016)
  7. Gras, B., Brun, A., Boyer, A.: When users with preferences different from others get inaccurate recommendations. In: Monfort, V., Krempels, K.-H., Majchrzak, T.A., Turk, Ž. (eds.) WEBIST 2015. LNBIP, vol. 246, pp. 191–210. Springer, Cham (2016). doi: 10.1007/978-3-319-30996-5_10
  8. Griffith, J., O'Riordan, C., Sorensen, H.: Investigations into user rating information and predictive accuracy in a collaborative filtering domain. In: Proceedings of the 27th Annual ACM Symposium on Applied Computing, pp. 937–942 (2012)
  9. Herlocker, J.L., Konstan, J.A., Borchers, A., Riedl, J.: An algorithmic framework for performing collaborative filtering. In: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 230–237 (1999)
  10. Hu, Y., Koren, Y., Volinsky, C.: Collaborative filtering for implicit feedback datasets. In: Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, pp. 263–272 (2008)
  11. Kaminskas, M., Bridge, D.: Diversity, serendipity, novelty, and coverage: a survey and empirical analysis of beyond-accuracy objectives in recommender systems. ACM Trans. Interact. Intell. Syst. 7(1), 2:1–2:42 (2016)
  12. Karypis, G.: Evaluation of item-based top-N recommendation algorithms. In: Proceedings of the Tenth International Conference on Information and Knowledge Management, pp. 247–254 (2001)
  13. Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender systems. Computer 42(8), 30–37 (2009)
  14. Linden, G., Smith, B., York, J.: Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Comput. 7(1), 76–80 (2003)
  15. Ricci, F., Rokach, L., Shapira, B.: Recommender Systems Handbook, 2nd edn. Springer, US (2015)
  16. Saia, R., Boratto, L., Carta, S.: A semantic approach to remove incoherent items from a user profile and improve the accuracy of a recommender system. J. Intell. Inform. Syst. 47(1), 111–134 (2016)
  17. Sarwar, B., Karypis, G., Konstan, J., Riedl, J.: Item-based collaborative filtering recommendation algorithms. In: Proceedings of the 10th International Conference on World Wide Web, pp. 285–295 (2001)
  18. Su, X., Khoshgoftaar, T.M.: A survey of collaborative filtering techniques. Adv. Artif. Intell. 2009 (2009)
  19. Vargas, S., Castells, P.: Improving sales diversity by recommending users to items. In: Proceedings of the 8th ACM Conference on Recommender Systems, pp. 145–152 (2014)
  20. Verstrepen, K., Goethals, B.: Top-N recommendation for shared accounts. In: Proceedings of the 9th ACM Conference on Recommender Systems, pp. 59–66 (2015)
  21. Zhang, M., Hurley, N.: Avoiding monotony: improving the diversity of recommendation lists. In: Proceedings of the 2008 ACM Conference on Recommender Systems, pp. 123–130 (2008)

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. School of Computer Science and Technology, Shandong University, Jinan, China
