
1 Introduction

With the advent and development of web applications, social networks have become popular, especially for information sharing. These services record a wide range of user behaviors, such as following, posting and retweeting. In particular, retweeting is the most important mechanism by which a diffusion network forms, spreading information across the social network in a viral manner. Hence, understanding the mechanism of information diffusion is a critical problem for many social applications, including user interest modeling [16, 25], identification of influential spreaders [2, 28] and social recommendation [20, 23]. Recently, various studies on information diffusion have been proposed, including investigations of influence factors [1, 15] and prediction of users’ spreading behaviors [29]. However, these studies assume that information diffusion through retweeting builds on the underlying following network among users, so only connected users can receive a message. Thus, the scale of information diffusion is limited by the effect of structural trapping.

Fig. 1. The observation and analysis of mentions on a social network.

To break through structural trapping, social networks offer a mention function that widens the visibility of a message. Figure 1 illustrates mentions on a social network. Fig. 1(a) shows that a mention allows a user to introduce other users in a message in the form @username. Fig. 1(b) illustrates the diffusion network of a message containing mentions, which forms a large-scale cascade. Meanwhile, Fig. 1(c) shows that in most cases only one or two users are mentioned in a message. Hence, mentions play important roles in both expanding information diffusion and improving social relationships. The whom-to-mention problem has attracted increasing attention in recent years, with approaches including ranking-based recommendation [10, 21, 22, 27, 30], link prediction [8] and unbalanced assignment [4]. These methods consider different factors in the mention task, covering content [5, 10, 21], social influence [10, 17], spatiotemporal information [4, 21], and users’ interests [6, 17]. Despite the large and growing body of work on mention behaviors, no existing model jointly considers topic relevance, interaction histories and homophily influence.

In this paper, we propose a novel Context-aware Mention recommendation model based on Probabilistic Matrix Factorization (CMPMF) to recommend the right users. To provide more accurate mention recommendations, we first quantify topic relevance, based on the interest match between messages and users, and mention affinity, based on interaction histories, and then impose both on the product of the user and message latent feature matrices. Meanwhile, we introduce two similarity regularization terms, one over users and one over content, based on the homophily assumption. Finally, we collaboratively factorize these social contextual factors under probabilistic matrix factorization. We evaluate the model on a real-world mention dataset from Weibo. The experimental results show that our proposed model outperforms the baseline models. Furthermore, the mention contextual factors boost recommendation performance, which demonstrates their effectiveness.

This work makes the following three contributions:

  • We propose a novel context-aware mention recommendation model based on probabilistic matrix factorization. The model considers topic relevance, mention affinity, user profile similarity and message semantic similarity as important contextual factors for recommending mention candidates.

  • We observe that topic relevance and mention affinity play a more significant role in the task, and that incorporating user profile and message semantic similarities further improves the performance of mention recommendation.

  • Comprehensive experiments on a real-world dataset show that our proposed model outperforms state-of-the-art comparison methods, which supports the effectiveness of the discovered mention contextual factors.

The remainder of this paper is organized as follows: Sect. 2 reviews related work. Section 3 describes the proposed model in detail. We empirically evaluate our proposed method on a real-world dataset in Sect. 4, including a comparison to baseline methods. We conclude the paper in Sect. 5.

2 Related Work

2.1 Social Recommendation Methods

A great deal of work on social recommendation incorporates contextual information. For example, SoRec [11] uses network structure information and rating records to address data sparsity and poor prediction accuracy, based on probabilistic matrix factorization. Similarly, Context MF [9] considers user preference and social influence to improve the accuracy of social recommendation. STE [12] applies social trust restrictions by fusing users’ tastes with their trusted friends’ preferences in recommender systems. mTrust [19] studies multi-faceted trust relationships between users for rating prediction. SocialMF [7] incorporates the mechanism of trust propagation into the matrix factorization approach for recommendation in social networks. SocialReg [13] imposes social regularization terms that constrain the objective function based on users’ social friend information. TBPR [24] studies the effects of distinguishing strong and weak ties by using neighbourhood overlap to approximate tie strength in social recommendation. In short, incorporating social contextual information can improve recommendation performance.

2.2 Mention Behavior Modeling

Whom-to-mention can be viewed as a recommendation task. For instance, Wang et al. [22] use ranking support vector regression to recommend candidates with interest match, user relationship and social influence features. Tang et al. [21] employ a ranking support vector machine that uses content, social, location and time features. To solve the mention overload problem, Zhou et al. [30] propose a personalized ranking model that considers multi-dimensional relations among users and tweets to generate a personalized mention list. Li et al. [10] use a probabilistic factor graph model, with mention relationships as edges and candidates as nodes, to deal with information overload. Gong et al. [5] propose a topical translation-based method to predict the mentioned users by considering both the content of the microblog and the histories of candidate users. Huang et al. [6] design an end-to-end memory network that incorporates users’ interests with external memory. Ma et al. [14] propose a cross-attention memory network that uses users’ interests with external memory and a cross-attention mechanism to extract both textual and visual information.

Other works tackle the problem as a classification task. For example, Jiang et al. [8] use link prediction to predict mention behaviors with user, textual, social tie and temporal features. Similarly, Bao et al. [3] propose response prediction and formulate it as a binary classification task using three factors: structure, influence, and content. In addition, Ding et al. [4] model the problem as an unbalanced assignment problem and use the Hungarian method to find the optimal users at the appropriate time.

Different from the above studies, this work proposes a novel context-aware mention recommendation model based on probabilistic matrix factorization, which incorporates topic relevance, mention affinity and homophily influence to mention the appropriate users.

3 Mention Recommendation Model

In this section, we first formulate the problem of mention recommendation based on Probabilistic Matrix Factorization. Then, we describe how to incorporate mention contextual factors, and discuss how we learn the hidden variables.

3.1 Mention Formulation by Probabilistic Matrix Factorization

Suppose that we have M users and N messages. Let \( R \in {\mathbb {R}}^{M \times N} \) be the mentioning matrix, where observed mentions are indicated by 1 and missing entries are assumed to be 0. Let \(U \in {\mathbb {R}}^{K \times M}\) and \(V \in {\mathbb {R}}^{K \times N}\) be the user and message latent feature matrices, respectively, where K is the dimension of the latent factors. The preference of user \(u_i\) is represented by the vector \(U_i \in {\mathbb {R}}^{K \times 1}\) and the characteristic of message \(m_j\) is represented by the vector \(V_j \in {\mathbb {R}}^{K \times 1}\). The dot product of \(U_i\) and \(V_j\) approximates the mention behavior between \(u_i\) and \(m_j\): \({\hat{R}}_{ij} \approx U_i^T V_j\). Mention recommendation based on Probabilistic Matrix Factorization (PMF) [18] solves the following problem:

$$\begin{aligned} \begin{aligned} {\mathcal {L}} = \min _{U,V} \sum _{i=1}^{M}\sum _{j=1}^{N}I_{ij}(R_{ij}-g(U_{i}^TV_{j}))^2 + \lambda (\Arrowvert U \Arrowvert _F^2 + \Arrowvert V \Arrowvert _F^2) \end{aligned} \end{aligned}$$
(1)

where \(I_{ij}\) is an indicator that equals 1 if \(u_i\) is mentioned in \(m_j\) and 0 otherwise, \(g(x)=1/(1+\exp (-x))\) is the logistic function that maps \(U_{i}^TV_{j}\) to (0,1), the regularization term \(\lambda (\Arrowvert U \Arrowvert _F^2 + \Arrowvert V \Arrowvert _F^2)\) helps avoid overfitting, and \(||\cdot ||_F\) denotes the Frobenius norm.
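As a minimal sketch of how Eq. (1) can be evaluated, the following pure-Python snippet computes the masked squared error plus the Frobenius-norm penalty. The function names, the row-wise storage of the latent vectors and the toy data are assumptions made for illustration; they are not taken from the paper.

```python
import math

def sigmoid(x):
    """Logistic function g(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def pmf_loss(R, I, U, V, lam):
    """Eq. (1): squared error over observed entries plus regularization.

    R and I are M x N nested lists. U holds M user vectors and V holds
    N message vectors, each of length K (stored row-wise for convenience).
    """
    err = 0.0
    for i, u in enumerate(U):
        for j, v in enumerate(V):
            if I[i][j]:
                err += (R[i][j] - sigmoid(dot(u, v))) ** 2
    reg = sum(x * x for u in U for x in u) + sum(x * x for v in V for x in v)
    return err + lam * reg

# Toy data (illustrative only): 2 users, 2 messages, K = 2.
R = [[1, 0], [0, 1]]
I = [[1, 0], [0, 1]]
U = [[0.0, 0.0], [0.0, 0.0]]
V = [[0.0, 0.0], [0.0, 0.0]]
loss = pmf_loss(R, I, U, V, lam=0.01)
```

With all latent vectors at zero, every prediction is g(0) = 0.5, so each of the two observed entries contributes 0.25 to the loss and the regularization term vanishes.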

Since R is highly sparse, it is difficult to recommend the right users accurately by relying only on the observed mention behaviors. However, we argue that incorporating mention contextual factors (e.g., topic relevance, mention affinity, user profile similarity and message semantic similarity) can alleviate data sparsity and improve the performance of mention recommendation. Based on these ideas, we propose a novel mention recommendation model.

3.2 Mention Contextual Factors

In this section, we introduce the social contextual factors that influence mention behaviors. Here, we use the term “mentioner” to refer to the publisher of a message, and “mentionee” to refer to a user who may be mentioned in a message.

Modeling Message-Mentionee Topic Relevance Feature. From the perspective of the target user, a user is likely to accept and retweet the notified message if he/she is interested in its content; otherwise the message will be viewed as spam and ignored. Hence, we argue that the topic relevance between message and user is an important factor in the mention decision-making process. Here, we denote the topic distribution of message \(m_j\) as

$$\begin{aligned} T_{m_j}=(p(z_1|m_j),p(z_2|m_j),\cdots ,p(z_k|m_j)) \end{aligned}$$
(2)

where \(p(z_i|m_j)\) is learned by training a biterm topic model (BTM) [26], which addresses the sparsity of word co-occurrence patterns at the document level.

A user’s topic interests are reflected in user-generated content. Due to data sparseness, we aggregate all short texts from the same user into one long pseudo-document before applying BTM. The topic interests of user \(u_i\) are defined as

$$\begin{aligned} T_{u_i}=(p(z_1|u_i),p(z_2|u_i),\cdots ,p(z_k|u_i)) \end{aligned}$$
(3)

where \(p(z|u_i)\) is the probability of topic z in the interests of user \(u_i\). Then, we use the Jensen-Shannon divergence to measure the difference between the topic distributions of \(u_i\) and \(m_j\) as

$$\begin{aligned} JSD(T_{u_i} || T_{m_j})=\frac{1}{2}D(T_{u_i}||{\bar{T}}) + \frac{1}{2}D(T_{m_j}||{\bar{T}}) \end{aligned}$$
(4)

where \(T_{u_i}\) and \(T_{m_j}\) are the topic distributions of user \(u_i\) and message \(m_j\) respectively, and \({\bar{T}}\) is the average of \(T_{u_i}\) and \(T_{m_j}\). \(D(\cdot ||\cdot )\) is the KL divergence, calculated as \(D(T_{u_i}||{\bar{T}})=\sum _{k=1}^K T_{u_i}(k) \log \frac{T_{u_i}(k)}{{\bar{T}}(k)}\). A smaller JSD value means greater topic relevance between user and message, indicating that the user is more likely to be interested in and retweet the message.
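The divergence in Eq. (4) can be sketched in a few lines of Python. The topic vectors below are invented for illustration, and the natural logarithm is assumed, so the divergence is bounded by ln 2. How a JSD value is turned into a relevance weight is not specified in the text and is therefore left out of the sketch.

```python
import math

def kl_divergence(p, q):
    """D(p || q) = sum_k p(k) * log(p(k) / q(k)); terms with p(k) = 0 add 0."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)

def jsd(p, q):
    """Jensen-Shannon divergence of Eq. (4), using the average distribution."""
    mean = [(pk + qk) / 2 for pk, qk in zip(p, q)]
    return 0.5 * kl_divergence(p, mean) + 0.5 * kl_divergence(q, mean)

# Hypothetical topic distributions over three topics.
user_topics = [0.7, 0.2, 0.1]
close_message = [0.6, 0.3, 0.1]
far_message = [0.1, 0.1, 0.8]

near = jsd(user_topics, close_message)
far = jsd(user_topics, far_message)
```

Here `near` is smaller than `far`, which matches the reading that a smaller JSD value signals a better topical match between user and message.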

We introduce a user-message topic relevance matrix \(W \in {\mathbb {R}}^{M \times N}\), which consists of \( JSD(T_{u_i} || T_{m_j}) \), and then impose the matrix on the approximated predictions of the observed entries, controlled by their association strengths as

$$\begin{aligned} \begin{aligned} {\mathcal {L}}_1 = \min \quad \sum _{i=1}^{M}\sum _{j=1}^{N} W_{ij} U_{i}^T V_{j} \end{aligned} \end{aligned}$$
(5)

In Eq. (5), a large value of \(W_{ij}\) indicates that message \(m_j\) strongly matches the topic interests of user \(u_i\), thus \(u_i\) is more likely to be mentioned in \(m_j\).

Modeling Mentioner-Mentionee Mention Affinity Feature. Mentions can form strong affinity relationships among users. For example, a user who has accepted mention notifications from a given user is more likely to be mentioned by that user again in the future. In this paper, we define the mention affinity from u to v as

$$\begin{aligned} \begin{aligned} A_{u \rightarrow v} = \frac{|{\mathcal {N}}(u \rightarrow v)|}{|{\mathcal {N}}(u)|} \end{aligned} \end{aligned}$$
(6)

where \({\mathcal {N}}(u)\) is the set of messages in which user u mentions other users, \({\mathcal {N}}(u \rightarrow v)\) is the set of messages in which user u mentions user v, and \(|\cdot |\) denotes the number of messages in a set.
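To make Eq. (6) concrete, the sketch below counts mention messages from a small, made-up posting history. The input format (a list of author and mentioned-user pairs) and the user names are assumptions for illustration only.

```python
from collections import Counter, defaultdict

def mention_affinity(messages):
    """Return {(u, v): A_{u -> v}} following Eq. (6).

    messages is an iterable of (author, mentioned_users) pairs. Messages
    without any mention are skipped, since N(u) only holds mention messages.
    """
    posted = Counter()            # |N(u)|
    pair = defaultdict(Counter)   # |N(u -> v)|
    for author, mentioned in messages:
        if not mentioned:
            continue
        posted[author] += 1
        for v in set(mentioned):  # count each mentionee once per message
            pair[author][v] += 1
    return {
        (u, v): count / posted[u]
        for u, targets in pair.items()
        for v, count in targets.items()
    }

# Hypothetical posting history.
history = [
    ("alice", ["bob"]),
    ("alice", ["bob", "carol"]),
    ("alice", []),
    ("alice", ["carol"]),
    ("dave", ["alice"]),
]
affinity = mention_affinity(history)
```

In this toy history alice posts three mention messages, two of which mention bob, so the affinity from alice to bob is 2/3.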

Similarly, we construct a user-user mention affinity matrix \( S \in {\mathbb {R}}^{M \times N}\), which consists of the mention affinity scores of user-user pairs. To model the strength of mention preference, we impose the mention affinity matrix on the approximated predictions of the observed entries, controlled by their mention strengths as

$$\begin{aligned} \begin{aligned} {\mathcal {L}}_2 = \min \quad \sum _{i=1}^{M}\sum _{j=1}^{N} S_{ij} U_{i}^TV_{j} \end{aligned} \end{aligned}$$
(7)

Similarly, in Eq. (7), a large value of \( S_{ij} \) indicates that user \(u_i\) has interacted strongly with user \(u_j\), thus \(u_i\) is more likely to mention \(u_j\) in the near future.

Modeling Mentionee-Mentionee Profile Similarity Feature. We observe that users with similar social status and similar topic interests are likely to be mentioned in the same message. We also assume that users who are similar in the hidden user space have similar preferences. A user’s profile information consists of topic interests and social status. The topic interests can be profiled from the set of messages the user has posted. The social status can be described from two aspects: one is the social features, including the number of messages, friends, followers and mutual fans, and the other is the behavior features, such as the average number of retweets and comments per message.

We first vectorize users with the same strategy and topic model as described above. Next, we employ the cosine similarity to calculate the profile similarity score between the publishing user \( u_i \) and the mentioned user \( u_j \) as

$$\begin{aligned} S_{topic}(i,j) = \frac{{\mathcal {U}}(i) \cdot {\mathcal {U}}(j)}{\left\| {\mathcal {U}}(i) \right\| \left\| {\mathcal {U}}(j) \right\| } \end{aligned}$$
(8)

where \( {\mathcal {U}}(i)={<}{\mathcal {U}}_i^{topic}, {\mathcal {U}}_i^{social}{>} \) is the combination vector of topic interests and social status, with \( {\mathcal {U}}_i^{topic} \) the learned topic vector representation and \( {\mathcal {U}}_i^{social} \) the learned social status representation of user \( u_i \). Additionally, we argue that clustering users by their profile information can reduce noisy data and improve the performance of mention recommendation. To cluster users, we use the K-means clustering algorithm and obtain the user cluster set H(u) in the observed space.
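The following sketch illustrates the two steps just described: the cosine similarity of Eq. (8) and a plain K-means pass that yields the cluster set H(u). The profile vectors are hypothetical, the social features are assumed to be already scaled to a range comparable with the topic probabilities, and seeding the centroids with the first k points and using Euclidean distance are simplifying assumptions, since the text does not state these details.

```python
import math

def cosine(a, b):
    """Cosine similarity of Eq. (8); returns 0.0 if either vector is all zeros."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_sets(labels):
    """H(u): for each user index, the other users in the same cluster."""
    return {
        i: [f for f, lab in enumerate(labels) if lab == labels[i] and f != i]
        for i in range(len(labels))
    }

# Hypothetical profiles: topic part followed by a scaled social-status part.
profiles = [[0.9, 0.1, 0.2], [0.1, 0.9, 0.8], [0.8, 0.2, 0.1], [0.2, 0.8, 0.9]]
labels = kmeans(profiles, k=2)
H = cluster_sets(labels)
same_cluster_sim = cosine(profiles[0], profiles[2])
cross_cluster_sim = cosine(profiles[0], profiles[1])
```

Restricting the similarity terms to members of the same cluster keeps the number of user pairs small, which is one plausible reading of why clustering reduces noise.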

Fig. 2. Graphical model representations of probabilistic matrix factorization (PMF) and context-aware mention probabilistic matrix factorization (CMPMF).

We believe that two users with similar profiles are likely to be mentioned in the same message. To model the profile similarity between \(u_i\) and \(u_f\), we constrain the personal preferences \(U_i\) and \(U_f\) to be close to each other as

$$\begin{aligned} \begin{aligned} {\mathcal {L}}_3 = \min \quad \sum _{i=1}^{M} \sum _{f \in H(u)} sim(i,f) \Vert U_{i} - U_f \Vert _F^2 \end{aligned} \end{aligned}$$
(9)

where sim(i,f) is the profile similarity between \(u_i\) and \(u_f\), computed as in Eq. (8). A large value of sim(i,f) indicates that the personal preferences \(U_i\) and \(U_f\) should be very close, while a small value indicates that the distance between \(U_i\) and \(U_f\) could be large.

Modeling Message-Message Semantic Similarity Feature. Similarly, messages with similar topics are likely to mention the same users, and message similarities in the observed space are consistent with those in the latent space. We vectorize messages with the same method, and then use message clustering to reduce noisy data and improve the performance of mention recommendation. We again use the K-means clustering algorithm to cluster messages and generate the message cluster set G(v) in the observed space.

Two messages with similar topic distributions are likely to mention similar users. To model the similarity of topic distributions between \(m_j\) and \(m_l\), we constrain the message latent vectors \(V_j\) and \(V_l\) to be close to each other as

$$\begin{aligned} \begin{aligned} {\mathcal {L}}_4 = \min \quad \sum _{j=1}^{N} \sum _{l \in G(v)} sim(j,l) \Vert V_{j} - V_l \Vert _F^2 \end{aligned} \end{aligned}$$
(10)

where a large value of sim(j,l) indicates that the message latent vectors \(V_j\) and \(V_l\) should be very close, while a small value of sim(j,l) indicates that the distance between \(V_j\) and \(V_l\) could be large.

Context-Aware Mention Recommendation Ensemble. Now, we design an integrated probabilistic matrix factorization model to find whom-to-mention by simultaneously considering topic relevance, mention affinity, user profile similarity and message semantic similarity, and then solve the optimization. The graphical representation of our proposed model is given in Fig. 2.

We model the conditional distribution of U and V over users and messages incorporating \( {\mathcal {L}}_1, {\mathcal {L}}_2, {\mathcal {L}}_3, {\mathcal {L}}_4 \) based on Bayesian inference as

$$\begin{aligned} \begin{aligned}&P(U,V|R,W,S,\sigma _R^2,\sigma _U^2,\sigma _V^2) \propto P(R|W,S,U,V,\sigma _R^2) P(U|\sigma _U^2) P(V|\sigma _V^2)\\&\quad = \prod _{i=1}^{M}\prod _{j=1}^{N}[{\mathcal {N}}(R_{ij}|g(W_{ij} U_i^TV_j + S_{ij} U_i^TV_j), \sigma _R^2)]^{I_{ij}}\\&\qquad \times \prod _{i=1}^{M}\prod _{f=1}^{H(u)}[{\mathcal {N}}(U_{i}|U_f, \sigma _U^2)] \times \prod _{j=1}^{N}\prod _{l=1}^{G(v)}[{\mathcal {N}}(V_{j}|V_l, \sigma _V^2)]\\&\qquad \times \prod _{i=1}^{M}{\mathcal {N}}(U_{i}|0,\sigma _U^2\mathbf{I }) \times \prod _{j=1}^{N}{\mathcal {N}}(V_{j}|0, \sigma _V^2\mathbf{I }) \end{aligned} \end{aligned}$$
(11)

Maximizing the log-posterior distribution with respect to U and V is equivalent to minimizing the sum-of-squared-errors function with quadratic regularization terms:

$$\begin{aligned} \begin{aligned} {\mathcal {L}} =&\sum _{i=1}^{M}\sum _{j=1}^{N} I_{ij} \left\| R_{ij} - g(\alpha W_{ij} U_i^TV_j +(1-\alpha )S_{ij} U_i^TV_j) \right\| _F^2\\&+ \beta \sum _{i=1}^{M} \sum _{f \in H(u)} sim(i,f) \Vert U_{i} - U_f \Vert _F^2\\&+ \gamma \sum _{j=1}^{N} \sum _{l \in G(v)} sim(j,l) \Vert V_{j} - V_l \Vert _F^2&\\&+ \lambda (\Vert U \Vert _F^2 + \Vert V \Vert _F^2) \end{aligned} \end{aligned}$$
(12)

where \(\beta =\frac{\sigma _R^2}{\sigma _U^2}\), \(\gamma =\frac{\sigma _R^2}{\sigma _V^2}\). To simplify the model, we set \(\beta =\gamma \) in the experiments.

We use stochastic gradient descent to find a local minimum of Eq. (12) with respect to the feature vectors \(U_i\) and \(V_j\), using the gradients

$$\begin{aligned} \frac{\partial {\mathcal {L}}}{\partial U_i}= & {} \sum _{j=1}^{N}I_{ij}g'(\alpha W_{ij} U_i^TV_j +(1-\alpha )S_{ij} U_i^TV_j)\nonumber \\&\times (g((\alpha W_{ij} +(1-\alpha )S_{ij}) U_i^TV_j)-R_{ij}) \times (\alpha W_{ij} V_j +(1-\alpha )S_{ij} V_j)\nonumber \\&+ \beta \sum _{f \in H(u)} sim(i,f) \times (U_i-U_f) + \lambda U_i \end{aligned}$$
(13)
$$\begin{aligned} \frac{\partial {\mathcal {L}}}{\partial V_j}= & {} \sum _{i=1}^{M}I_{ij}g'(\alpha W_{ij} U_i^TV_j +(1-\alpha )S_{ij} U_i^TV_j) \nonumber \\&\times (g((\alpha W_{ij} +(1-\alpha )S_{ij}) U_i^TV_j)-R_{ij}) \times (\alpha W_{ij} U_i +(1-\alpha )S_{ij} U_i) \nonumber \\&+ \gamma \sum _{l \in G(v)} sim(j,l) \times (V_j-V_l) + \lambda V_j \end{aligned}$$
(14)

where \(g'(x)=\exp (x)/(1+\exp (x))^2\) is the derivative of the logistic function.
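The update rule can be sketched as follows. The code implements the objective of Eq. (12) and the gradients exactly as written in Eqs. (13) and (14). For brevity it applies full-batch gradient steps rather than stochastic ones, which is a simplification. The dictionaries `sim_u` and `sim_v`, the learning rate and all toy values are assumptions introduced for illustration.

```python
import math

def g(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def g_prime(x):
    """Derivative of the logistic function, exp(x) / (1 + exp(x))^2."""
    s = g(x)
    return s * (1.0 - s)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weight(W, S, alpha, i, j):
    """Combined context weight alpha * W_ij + (1 - alpha) * S_ij."""
    return alpha * W[i][j] + (1.0 - alpha) * S[i][j]

def objective(R, I, W, S, U, V, sim_u, sim_v, alpha, beta, gamma, lam):
    """Eq. (12). sim_u[i] maps each f in H(u_i) to sim(i, f); sim_v likewise."""
    total = 0.0
    for i, u in enumerate(U):
        for j, v in enumerate(V):
            if I[i][j]:
                total += (R[i][j] - g(weight(W, S, alpha, i, j) * dot(u, v))) ** 2
    for i, nbrs in sim_u.items():
        for f, s in nbrs.items():
            total += beta * s * sum((a - b) ** 2 for a, b in zip(U[i], U[f]))
    for j, nbrs in sim_v.items():
        for l, s in nbrs.items():
            total += gamma * s * sum((a - b) ** 2 for a, b in zip(V[j], V[l]))
    total += lam * (sum(x * x for u in U for x in u) + sum(x * x for v in V for x in v))
    return total

def gradients(R, I, W, S, U, V, sim_u, sim_v, alpha, beta, gamma, lam):
    """Gradients with respect to every U_i and V_j, following Eqs. (13)-(14)."""
    gU = [[lam * x for x in u] for u in U]
    gV = [[lam * x for x in v] for v in V]
    for i, u in enumerate(U):
        for j, v in enumerate(V):
            if not I[i][j]:
                continue
            c = weight(W, S, alpha, i, j)
            x = c * dot(u, v)
            coeff = g_prime(x) * (g(x) - R[i][j]) * c
            for k in range(len(u)):
                gU[i][k] += coeff * v[k]
                gV[j][k] += coeff * u[k]
    for i, nbrs in sim_u.items():
        for f, s in nbrs.items():
            for k in range(len(U[i])):
                gU[i][k] += beta * s * (U[i][k] - U[f][k])
    for j, nbrs in sim_v.items():
        for l, s in nbrs.items():
            for k in range(len(V[j])):
                gV[j][k] += gamma * s * (V[j][k] - V[l][k])
    return gU, gV

def descend(U, V, gU, gV, lr):
    """One gradient step; returns new matrices and leaves the inputs untouched."""
    new_U = [[x - lr * d for x, d in zip(u, gu)] for u, gu in zip(U, gU)]
    new_V = [[x - lr * d for x, d in zip(v, gv)] for v, gv in zip(V, gV)]
    return new_U, new_V

# Toy problem (illustrative values only): 2 users, 2 messages, K = 2.
R = [[1, 0], [0, 1]]
I = [[1, 0], [0, 1]]
W = [[0.9, 0.2], [0.1, 0.8]]
S = [[0.5, 0.0], [0.0, 0.5]]
U = [[0.1, 0.2], [0.3, 0.1]]
V = [[0.2, 0.1], [0.1, 0.3]]
params = dict(sim_u={0: {1: 0.5}, 1: {0: 0.5}}, sim_v={},
              alpha=0.6, beta=0.01, gamma=0.01, lam=0.01)
for _ in range(100):
    gU, gV = gradients(R, I, W, S, U, V, **params)
    U, V = descend(U, V, gU, gV, lr=0.1)
```

Note that Eqs. (13) and (14) omit the constant factor that arises from differentiating the squared terms, so the learning rate in this sketch absorbs that constant.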

Table 1. Statistics of the dataset.

4 Experiments and Analysis

4.1 Dataset and Setup

Weibo is one of the most popular social network platforms in China, and allows a user to mention other users in a message. In this paper, we use a publicly available Weibo dataset [29], which consists of user profiles, messages and a snapshot of the network structure, etc. The user profiles contain the number of friends, followers and messages, etc. The messages contain the message content, the number of retweets and comments, etc. The network records the following relationships among users. Table 1 summarizes the detailed information of the dataset.

We use Precision, Recall, and F-Score to evaluate the experimental results, and employ Hits@3 and Hits@5 to represent the percentage of correct results among the top 3 and top 5 recommendations. Moreover, we use the Mean Reciprocal Rank (MRR) metric to evaluate the rank of the recommended results.
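The metrics above can be sketched as follows. The text does not give formal definitions, so the snippet uses common per-message definitions (an assumption), which are then averaged over all test messages. The user identifiers are hypothetical.

```python
def precision_recall_f(ranked, relevant, k):
    """Precision, Recall and F-Score of the top-k recommended users."""
    top = ranked[:k]
    hits = sum(1 for u in top if u in relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score

def hits_at_k(ranked, relevant, k):
    """1.0 if any of the top-k recommended users was actually mentioned."""
    return 1.0 if any(u in relevant for u in ranked[:k]) else 0.0

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first correct user, or 0.0 if none is recommended."""
    for position, u in enumerate(ranked, start=1):
        if u in relevant:
            return 1.0 / position
    return 0.0

def mean_over_messages(metric, cases, **kwargs):
    """Average a per-message metric over (ranked, relevant) pairs."""
    return sum(metric(r, rel, **kwargs) for r, rel in cases) / len(cases)

# Hypothetical recommendations for two test messages.
cases = [
    (["a", "b", "c", "d"], {"c"}),
    (["e", "f", "g", "h"], {"e", "h"}),
]
mrr = mean_over_messages(reciprocal_rank, cases)
hits3 = mean_over_messages(hits_at_k, cases, k=3)
```

In the first case the only correct user sits at rank 3, giving a reciprocal rank of 1/3. In the second the first recommendation is correct, giving 1, so the mean reciprocal rank over both messages is 2/3.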

4.2 Baseline Methods

We compare the proposed model with the following state-of-the-art baseline methods on the dataset.

  • Random Guess (RG): RG randomly recommends candidate users for each message. These candidates are chosen from the friends, ordered by the number of followers.

  • Majority Guess (MG): This simple but powerful strategy assigns a higher recommendation probability to candidate users whom the author has mentioned more frequently.

  • PMF: This method only uses user-message mention matrix for the mention recommendations based on the observed entries [18].

  • PMPR: This model treats mention recommendation as a probabilistic ranking problem and finds the most probable candidates by using a probabilistic factor graph model in the heterogeneous social network [10].

  • CAR: CAR uses a ranking support vector machine model by considering content, social, location and time based features to recommend the mentioned target users [21].

  • AU-HMNN: The model introduces an end-to-end memory network architecture that incorporates the textual information of query tweets and the history interests of the author and candidate users [6].

For our model, we empirically set \(\alpha =0.6\) and \(\beta = \gamma = \lambda = 0.01\). The number of latent factors is 100 and the number of training iterations is 100. The numbers of clusters on users and messages are set to 40 and 80, respectively. For the topic model, the number of topics is set to 100 and the number of iterations is set to 1000.

Table 2. Comparison of mention results with Precision, Recall, F-Score, MRR, Hits@3 and Hits@5 metrics on the dataset.

4.3 Performance and Analysis

The baselines category of Table 2 compares the proposed method with the state-of-the-art methods on the dataset. From the results, we draw the following observations: (1) Our proposed model consistently achieves better performance than the baseline methods on the dataset, which indicates that the discovered mention contextual factors are effective for mention recommendation on social networks; (2) MG yields better results than RG, indicating that a candidate user whom the publisher has mentioned frequently in the past has a good relationship with the publisher and a higher mention probability in the future; (3) PMF performs better than both RG and MG, which suggests that the framework of probabilistic matrix factorization is better suited to the mention task; (4) PMPR and CAR perform significantly better than PMF, which shows that incorporating mention contextual factors into the learning algorithm indeed improves the performance of mention recommendation; (5) AU-HMNN, which is based on deep neural networks, outperforms most of the baseline methods, which illustrates that neural network-based models offer further benefit for mention recommendation. In particular, our proposed model achieves relatively large gains on MRR and Hits@5. Taken together, these results suggest that our proposed model achieves the best performance by incorporating topic relevance, mention affinity, user profile similarity and message semantic similarity under probabilistic matrix factorization.

The variants category of Table 2 reports several variants of our proposed model, which we implement to examine the contribution of each factor: (1) TR-CMPMF only considers the topic relevance feature between messages and the candidate users; (2) UA-CMPMF only incorporates the mention affinity feature between publishers and the candidate users; (3) PS-CMPMF only introduces the profile similarity feature among the candidate users; (4) SS-CMPMF only models the semantic similarity feature among messages. From the comparison, we conclude that UA-CMPMF achieves relatively better results than the other three variants. The comparison of TR-CMPMF and UA-CMPMF shows that the mention affinity between two users is a more important factor than topic relevance when deciding whom to mention. TR-CMPMF shows a larger boost than PS-CMPMF and SS-CMPMF in terms of F-Score, indicating that topic relevance is still an important factor for mention recommendation. PS-CMPMF generates better results than SS-CMPMF, indicating that user profile information is an important consideration. A further explanation is that the main purpose of a mention is to expand the scale of information diffusion, so user profile information is taken into account more than semantic information. In summary, the results of TR-CMPMF, UA-CMPMF, PS-CMPMF and SS-CMPMF show that treating topic relevance and mention affinity as the primary factors, and profile similarity and semantic similarity as the secondary factors, improves performance.

Fig. 3. Precision, Recall and F-score with different numbers of recommended users.

4.4 Effect of Mention Count

Figure 3 compares our proposed model and the baseline models for different numbers of recommended users, varying from 1 to 5. From the figure, we can see that (1) our proposed model achieves consistently better performance than the other methods across different numbers of recommendations; (2) as the number of recommended users increases, Precision decreases and Recall increases gradually, indicating that the recommendation performs best when the number of mentions in a message is neither too large nor too small. A reasonable explanation is that a message mentioning a small number of users (e.g., one to three) is viewed as an intimate chat with close friends, whereas mentioning many users may lead to mention overload and cause the message to be considered spam. Moreover, we observe that the best F-Score is obtained when recommending the top one user. It is also noticeable that our proposed method is significantly better than the state-of-the-art methods.

Fig. 4. Performance of our model under different values of \(\alpha \).

Fig. 5. Performance of our model under different numbers of user and message clusters.

4.5 Effect of Parameters

In this paper, the proposed model contains several critical parameters. Here, we analyze the effect of these parameters from the following aspects.

Impact of \(\alpha \). The parameter \(\alpha \) is a tunable weight that balances the strength of topic relevance and mention affinity. Figure 4 plots the performance of our proposed model for various values of \(\alpha \). From the figure, we observe that the performance first rises gradually and then drops gradually as \(\alpha \) increases. In particular, \(\alpha =0\) means that our model only incorporates the mention affinity feature, and \(\alpha =1\) means that it only considers topic relevance. The best performance is achieved when \(\alpha \) is 0.6.

Table 3. Performance of our proposed model with different numbers of topics.

Number of Topics. Table 3 shows how the number of topic latent features affects the mention performance. From the results, we can see that our proposed model achieves better performance as the number of topics increases. The best performance is observed at 100 dimensions.

Number of Clusters. We also investigate how the numbers of clusters on users and messages affect the performance of mention recommendation. The results for different settings are shown in Fig. 5. From the results, we can see that (1) as the number of clusters grows, the dataset has an optimal number of clusters for both users and messages; (2) the best numbers of user and message clusters are around 40 and 80, respectively. This is reasonable in practice: users with similar preferences and daily habits are likely to be friends and form a community, and messages with similar semantics share the same distribution.

5 Conclusion

In this paper, we propose a novel mention recommendation model that incorporates topic relevance, mention affinity, user profile similarity and message semantic similarity. We use topic relevance to learn how interested the candidate users are in a message, measure the strength of mention affinity based on mention histories, and quantify user profile and message semantic similarities based on homophily influence. We use these mention contextual factors to constrain the objective function under the framework of probabilistic matrix factorization for the mention recommendation task. To demonstrate the effectiveness of our model, we conduct extensive experiments. The experimental results show that our proposed method outperforms the state-of-the-art baseline methods.