
On information propagation in mobile call networks


We consider the dynamics of rapid propagation of information (RPI) in mobile phone networks. We propose a heuristic method for identifying sequences of calls that supposedly propagate the same information and apply it to large-scale real-world data. We show that some of the information propagation events identified by the proposed method can explain the physical co-location of subscribers. We further show that features of subscribers' behavior in these events can be used for efficient churn prediction. To the best of our knowledge, our method for churn prediction is the first that relies on dynamic, rather than static, social behavior. Finally, we introduce two generative models that address different aspects of RPI. One model describes the emergence of sequences of calls that lead to RPI. The other model describes the emergence of different topologies of the paths along which information propagates from one subscriber to another. We report high correspondence between certain features observed in the data and these models.




  1. As a reference point for this threshold, consider the statistics reported by NielsenWire (2008): the average number of monthly calls made by a subscriber in the USA is only 204.

  2. In RPIs that contain fewer than five subscribers, we require the dissemination leader to propagate information either to all or to all but one of the users in the RPI.

  3. This specific selection of T is justified in Sect. 7.3.

  4. The precise lift values of churn predictors are usually considered proprietary and are not made public. However, based on our work with a large number of telecom companies around the world, our understanding is that this lift value matches the state of the art for such models.

  5. The maximal possible value of the error is 0.5. An error value smaller than 0.1 is considered to reflect a distinctive feature.


  1. Baeza-Yates R, Ribeiro-Neto B (1999) Modern information retrieval, vol 463. Addison-Wesley, New York

  2. Bi Z, Faloutsos C, Korn F (2001) The “DGX” distribution for mining massive, skewed data. In: KDD’01

  3. Bin L, Peiji S, Juan L (2007) Customer churn prediction based on the decision tree in personal handyphone system service. In: ICSSSM’07

  4. Burez J, Vanden Poel D (2009) Handling class imbalance in customer churn prediction. Expert Syst Appl 36(3):4626–4636


  5. Catanese S, Ferrara E, Fiumara G (2012) Forensic analysis of phone call networks. Social Netw Anal Min 1–19. ISSN:1869-5450. doi:10.1007/s13278-012-0060-1

  6. Cormen T, Leiserson C, Rivest R, Stein C (2001) Introduction to algorithms. MIT Press, Cambridge

  7. Coussement K, Vanden Poel D (2008) Churn prediction in subscription services: an application of support vector machines while comparing two parameter-selection techniques. Expert Syst Appl 34(1):313–327

  8. Dasgupta K, Singh R, Viswanathan B, Chakraborty D, Mukherjea S, Nanavati A, Joshi A (2008) Social ties and their relevance to churn in mobile telecom networks. In: EDBT’08

  9. Datta P, Masand B, Mani D, Li B (2000) Automated cellular modeling and prediction on a large scale. Artif Intell Rev 14(6):485–502


  10. de Melo P, Akoglu L, Faloutsos C, Loureiro A (2010) Surprising patterns for the call duration distribution of mobile phone users. In: PKDD’10

  11. de Oliveira Lima E (2009) Domain knowledge integration in data mining for churn and customer lifetime value modelling: new approaches and applications. PhD thesis, University of Southampton

  12. Domingos P (2005) Mining social networks for viral marketing. IEEE Intell Syst 20(1):80–82


  13. Domingos P, Richardson M (2001) Mining the network value of customers. In: KDD’01

  14. Doyle S (2007) The role of social networks in marketing. J Database Mark Cust Strateg Manag 15(1):60–64


  15. Duda R, Hart P, Stork D (2001) Pattern classification. Wiley, New York

  16. Dyagilev K, Mannor S, Yom-Tov E (2010) Generative models for rapid information propagation. In: Proceedings of the First Workshop on Social Media Analytics, ACM, pp 35–43

  17. Eagle N, Pentland A, Lazer D (2009) Inferring social network structure using mobile phone data. Proc Nat Acad Sci 106(36):15274–15278


  18. Fildes R, Kumar V (2002) Telecommunications demand forecasting—a review. Int J Forecast 18(4):489–522


  19. Gill K (2008) How can we measure the influence of the blogosphere. In: WWW’08

  20. Goldenberg J, Libai B, Moldovan S, Muller E (2007) The NPV of bad news. Int J Res Mark 24(3):186–200


  21. Goldenberg J, Han S, Lehmann D, Hong J (2009) The role of hubs in the adoption process. J Mark 73(2):1–13


  22. Gomez Rodriguez M, Leskovec J, Krause A (2010) Inferring networks of diffusion and influence. In: KDD’10

  23. Gopal R, Meher S (2008) Customer churn time prediction in mobile telecommunication industry using ordinal regression. In: Advances in Knowledge Discovery and Data Mining, pp 884–889

  24. Harris T (2002) The theory of branching processes. Dover Publications, New York

  25. Hill S, Provost F, Volinsky C (2006) Network-based marketing: identifying likely adopters via consumer networks. Stat Sci 21(2):256–276


  26. Jackson M (2008) Social and economic networks. Princeton University Press, Princeton

  27. Kempe D, Kleinberg J, Tardos É (2003) Maximizing the spread of influence through a social network. In: KDD’03

  28. Kourtellis N, Alahakoon T, Simha R, Iamnitchi A, Tripathi R (2012) Identifying high betweenness centrality nodes in large social networks. Social Netw Anal Min 1–16. ISSN:1869-5450. doi:10.1007/s13278-012-0076-6

  29. Leskovec J, Adamic L, Huberman B (2007) The dynamics of viral marketing. ACM Trans Web 1(1):5

  30. Nanavati A, Gurumurthy S, Das G, Chakraborty D, Dasgupta K, Mukherjea S, Joshi A (2006) On the structural properties of massive telecom call graphs: findings and implications. In: ICIKM’06

  31. NielsenWire (2008) In U.S., SMS text messaging tops mobile phone calling.

  32. Nitzan I, Libai B (2010) Social effects on customer retention. Marketing Science Institute, Working Paper 10-107

  33. Pan W, Aharony N, Pentland A (2011) Composite social network for predicting mobile apps installation. arXiv preprint arXiv:1106.0359

  34. Pendharkar P (2009) Genetic algorithm based neural network approaches for predicting churn in cellular wireless network services. Expert Syst Appl 36(3):6714–6720


  35. Radosavljevik D, van der Putten P, Kyllesbech Larsen K (2010) The impact of experimental setup in prepaid churn prediction for mobile telecommunications: what to predict, for whom and does the customer experience matter? Trans Mach Learn Data Min 3(2):80–99


  36. Richter Y, Yom-Tov E, Slonim N (2010) Predicting customer churn in mobile networks through analysis of social groups. In: ICDM’10

  37. Sadikov E, Medina M, Leskovec J, Garcia-Molina H (2011) Correcting for missing data in information cascades. In: WSDM’11

  38. Saravanan M, Prasad G, Karishma S, Suganthi D (2011) Analyzing and labeling telecom communities using structural properties. Social Netw Anal Min 1(4):271–286. ISSN:1869-5450. doi:10.1007/s13278-011-0020-1


  39. Shao J (2003) Mathematical statistics, 2nd edn. Springer, New York

  40. Song G, Yang D, Wu L, Wang T, Tang S (2006) A mixed process neural network and its application to churn prediction in mobile communications. In: Data Mining Workshop, ICDM’06

  41. Vega-Redondo F (2007) Complex social networks. Cambridge University Press, Cambridge

  42. Wu F, Huberman B (2007) Novelty and collective attention. Proc Nat Acad Sci 104(45):17599

  43. Yang J, He X, Lee H (2007) Social reference group influence on mobile phone purchasing behaviour: a cross-nation comparative study. Int J Mobile Commun 5(3):319–338




A preliminary version of this research was published in Dyagilev et al. (2010).

Author information



Corresponding author

Correspondence to Kirill Dyagilev.

Additional information

The author was also with IBM Haifa Research Lab, Haifa, Israel during a part of this research.


Appendix 1: Analysis of information flow tree model

We proceed to investigate the properties of some of these topologies. Assume that M ≥ 5 and consider a tree generated by the information flow tree model with parameters as above. Let N denote the total number of nodes in the tree and let \(E_i\), for i = 1,…,4, denote the event that the generated tree is of Topology i. The following proposition gives one of the basic properties of these topologies, namely the distribution \({\mathbb{P}\{N=\cdot|{E_i}\}}\) of the sizes of RPIs in each topology.

Proposition 1

The size distribution of RPIs of Topologies 1–4 is given by the following expressions.

  (1) For Topology 1:

    $$ {\mathbb{P}}\{N=n|{E_1}\}= p_0(n-1)\left(p_{1,4}(0)\right)^{n-1}/{\mathbb{P}}\{E_{1}\} {\mathbb{I}}_{\{n\geq M\}}, $$

    where

    $$ {\mathbb{P}}\{E_{1}\} = \sum_{r=M-1}^{\infty} p_0(r)\left(p_{1,4}(0)\right)^r. $$

  (2) For Topology 2:

    $$ {\mathbb{P}}\{N=n| E_{2}\} = p_0(1)p_{1,1}(n-2)\left(p_{2,1}(0)\right)^{n-2}/{\mathbb{P}}\{E_{2}\}, $$

    where

    $$ {\mathbb{P}}\{E_{2}\} = \sum_{d=M-2}^{\infty} p_0(1)p_{1,1}(d)\left(p_{2,1}(0)\right)^d. $$

  (3) For Topology 3:

    $$ \begin{aligned} {\mathbb{P}}\{N=n|E_3\} &= p_0(n-2)\left[(n-2) p_{1,4}(1)(p_{1,4}(0))^{n-3}\right]\\ &\quad\times p_{2,4}(0)/{\mathbb{P}}\{E_3\}, \end{aligned} $$

    where

    $$ {\mathbb{P}}\{E_3\} = \sum_{d=M-2}^{\infty} p_0(d)\left[d p_{1,4}(1)(p_{1,4}(0))^{d-1}\right]p_{2,4}(0). $$

  (4) For Topology 4:

    $$ {\mathbb{P}}\{N=n|E_4\} = (p_{2,1}(1))^{n-M}(1-p_{2,1}(1)). $$


These results follow from straightforward calculations.\(\hfill{\square}\)
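As a numerical illustration of the Topology 1 expression, the following sketch evaluates the size distribution for a hypothetical choice of parameters. The Poisson form of p_0, its rate 3.0, the value 0.6 for p_{1,4}(0), and the truncation point `r_max` of the infinite sum are all assumptions made for this example; none of them is taken from the paper.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability mass, computed in log space to avoid overflow."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def topology1_size_distribution(p0, q, M, r_max=200):
    """Return {n: P{N = n | E_1}} for n = M, ..., r_max + 1.

    p0    -- callable giving p_0(r) (its form here is an assumption)
    q     -- the value of p_{1,4}(0)
    M     -- minimal RPI size
    r_max -- truncation point of the infinite sum (an assumption)
    """
    # P{E_1} = sum_{r = M-1}^{infinity} p_0(r) * q^r, truncated at r_max.
    prob_e1 = sum(p0(r) * q ** r for r in range(M - 1, r_max + 1))
    # P{N = n | E_1} = p_0(n-1) * q^(n-1) / P{E_1} for n >= M.
    return {n: p0(n - 1) * q ** (n - 1) / prob_e1
            for n in range(M, r_max + 2)}


if __name__ == "__main__":
    dist = topology1_size_distribution(
        lambda r: poisson_pmf(r, 3.0), q=0.6, M=5)
    for n in range(5, 10):
        print(n, round(dist[n], 4))
```

Because the same truncated sum is used for the normalizing constant and for the support, the returned probabilities sum to one up to floating-point error.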

Appendix 2: Stability of dissemination leaders

In this section, we investigate the stability of dissemination leaders over different days. We consider a training period of three consecutive days followed by a testing period of the next 14 days. Each pair therefore spans 17 consecutive days, so the 35 days of DS-1 yield 19 pairs of training and testing periods and the 24 days of DS-2 yield eight pairs.
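The pair counts can be checked with a short sketch that enumerates the windows. It assumes that consecutive pairs are shifted by one day, which is consistent with the reported counts but is not stated explicitly; the function name and the 1-based day numbering are illustrative.

```python
def train_test_windows(num_days, train_len=3, test_len=14):
    """Enumerate (training days, testing days) pairs as 1-based day ranges.

    Assumes windows slide forward one day at a time.
    """
    span = train_len + test_len
    pairs = []
    for start in range(1, num_days - span + 2):
        train = range(start, start + train_len)
        test = range(start + train_len, start + span)
        pairs.append((train, test))
    return pairs


if __name__ == "__main__":
    print(len(train_test_windows(35)))  # number of pairs for a 35-day data set
    print(len(train_test_windows(24)))  # number of pairs for a 24-day data set
```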

We compare the set of dissemination leaders found in the training period to the set of dissemination leaders in the testing period. In particular, we gather the following statistics: (S1) the fraction of dissemination leaders in the testing period that were also dissemination leaders in the training period; (S2) the fraction of dissemination leaders in the training period that were also dissemination leaders in the testing period; (S3) the fraction of RPIs in the testing period in which the dominant node was a dissemination leader in the training period. Statistics S1 and S3 differ because a subscriber can be the dominant node in more than one RPI during the testing period.
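A minimal sketch of how S1–S3 might be computed is given below. It assumes that each period is represented by a list with one dominant-node identifier per RPI, and that the dissemination leaders of a period are exactly the subscribers appearing in that list; both the representation and the function name are hypothetical.

```python
def stability_statistics(train_dominant, test_dominant):
    """Compute (S1, S2, S3) from per-RPI dominant-node lists.

    train_dominant, test_dominant -- lists with one subscriber id per RPI
    (assumed representation; repeated ids mean several RPIs per subscriber).
    """
    train_leaders = set(train_dominant)
    test_leaders = set(test_dominant)
    common = train_leaders & test_leaders
    # S1: share of testing-period leaders already seen in training.
    s1 = len(common) / len(test_leaders)
    # S2: share of training-period leaders that reappear in testing.
    s2 = len(common) / len(train_leaders)
    # S3: share of testing-period RPIs led by a training-period leader.
    s3 = sum(1 for u in test_dominant if u in train_leaders) / len(test_dominant)
    return s1, s2, s3


if __name__ == "__main__":
    print(stability_statistics(["a", "b", "a"], ["a", "a", "c", "d"]))
```

In the toy input, subscriber "a" leads two testing-period RPIs, which is why S3 (counted per RPI) differs from S1 (counted per subscriber).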

As a baseline for these statistics, we use synthetic data in which the probability of a user becoming a dominant node in some RPI on the current day does not depend on whether that user was a dissemination leader on the previous day. Such synthetic data can be generated by the following baseline model. We assume that there exists a general pool of L dissemination leaders and that, for i = 1,…,D, there are \(n_i\) RPIs on the ith day that have a dominant node. The identities of the dominant nodes in the \(n_i\) RPIs of day i are selected uniformly at random, with repetitions, from the pool of size L.

To make a fair comparison of the statistics in the real and synthetic data, we estimate the parameters of the synthetic data model in the following straightforward fashion. We let D be the number of days in the corresponding real-world data set; hence D = 35 for DS-1 and D = 24 for DS-2. We estimate L as the total number of subscribers that were dissemination leaders on any of the D days in the data set. Finally, we let the numbers \(n_i\) be the actual numbers of RPIs with a dissemination leader on each of the D days. We denote the synthetic data that correspond to data sets DS-1 and DS-2 by Syn-1 and Syn-2, respectively. The values of statistics S1–S3 are presented in Table 5. Each entry is the mean value over all possible selections of training and testing periods, and the reported confidence intervals are one standard deviation.
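The baseline model can be sketched in a few lines. The pool size and the per-day RPI counts used in the demonstration are made-up values, not the estimates obtained from DS-1 or DS-2, and the function name and seed parameter are illustrative.

```python
import random


def generate_baseline(pool_size, rpis_per_day, seed=0):
    """Draw dominant nodes for each day uniformly, with repetitions.

    pool_size    -- L, the size of the pool of dissemination leaders
    rpis_per_day -- [n_1, ..., n_D], RPIs with a dominant node on each day
    Returns one list of subscriber indices (0 .. L-1) per day.
    """
    rng = random.Random(seed)
    return [rng.choices(range(pool_size), k=n_i) for n_i in rpis_per_day]


if __name__ == "__main__":
    # Hypothetical parameters: L = 50 leaders, D = 3 days.
    for day, nodes in enumerate(generate_baseline(50, [4, 0, 7]), start=1):
        print(day, nodes)
```

Since each day is drawn independently of the others, being a dominant node on one day carries no information about later days, which is the property the baseline is meant to have.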

Table 5 Stability of dissemination leader

Using the Wilcoxon signed-rank test [see, e.g., Shao (2003)], we determine that the statistics for the real data and the corresponding synthetic data differ in a statistically significant way (with a p value of 0).
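For readers unfamiliar with the test, the sketch below computes the signed-rank sums for paired samples together with a two-sided p value from the normal approximation. How the real and synthetic values were paired is not specified here, so the pairing, like the omission of tie and continuity corrections, is an assumption of this example; the approximation is only reasonable for moderately large samples.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus, approximate two-sided p value)."""
    # Zero differences are discarded, as in the standard test.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    # Normal approximation under the null hypothesis (no tie correction).
    mean = n * (n + 1) / 4
    std = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / std
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, p_value


if __name__ == "__main__":
    print(wilcoxon_signed_rank([5, 7, 9, 12], [4, 5, 6, 8]))
```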

We next focus on the stability of dissemination leaders that appeared in more than one RPI in the training set. Specifically, we consider a method that predicts that every user that was a dominant node in at least K RPIs in the training set, for K = 1, 2,…, will also be a dominant node in at least one RPI in the testing set. The precision–recall curve of this method [see, e.g., Baeza-Yates and Ribeiro-Neto (1999) for a definition] is presented in Fig. 17. In the real-world data, a user that was a dominant node in a large number of RPIs during the training period has a higher chance of being a dissemination leader in the testing period. In contrast, in the synthetic data, the number of RPIs in which a user was a dominant node during the training period has no effect on that user's chances of being a dissemination leader in the testing period.
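The threshold predictor and its precision–recall points can be sketched as follows. The input representation (a dictionary of per-user RPI counts for the training period and a set of testing-period leaders) and the function name are assumptions made for illustration.

```python
def precision_recall_by_threshold(train_counts, test_leaders, max_k=None):
    """Return [(K, precision, recall), ...] for K = 1, 2, ...

    train_counts -- {user: number of training RPIs led by that user}
    test_leaders -- set of users leading at least one testing RPI
    """
    if max_k is None:
        max_k = max(train_counts.values())
    curve = []
    for k in range(1, max_k + 1):
        # Predict every user with at least k training RPIs as a leader.
        predicted = {u for u, c in train_counts.items() if c >= k}
        if not predicted:
            break
        hits = len(predicted & test_leaders)
        curve.append((k, hits / len(predicted), hits / len(test_leaders)))
    return curve


if __name__ == "__main__":
    counts = {"a": 3, "b": 1, "c": 2, "d": 1}
    for point in precision_recall_by_threshold(counts, {"a", "c", "e"}):
        print(point)
```

Raising K shrinks the predicted set, so recall can only fall, while precision rises whenever repeat leaders are in fact more stable.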

Fig. 17 Stability of dissemination leaders

Appendix 3: Statistical analysis of features used by the RPI-CP algorithm

In this section, we present a statistical analysis of the features used by the RPI-CP algorithm. We begin by presenting the complete list of features in Table 6. In Table 7, we list three basic statistics for each feature and each data set. First, we present the average value and the standard deviation of the feature over the set of churners, i.e., subscribers that churned during the testing period. Second, we present the same values for non-churners, i.e., subscribers that did not churn during the testing period. For most features, the intervals of one standard deviation around the mean overlap significantly for churners and non-churners. Hence, it is hard to distinguish between a churner and a non-churner using a single feature. Third, to make this observation precise, for each feature we calculate the Neyman–Pearson error of a churner/non-churner classifier that relies on this single feature; that is, we calculate the minimal classification error one can obtain using this single feature. As expected, the Neyman–Pearson error is high (see footnote 5) for all of the features in both data sets. Thus, multiple features are required to classify churners, each contributing a relatively small amount of information.
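The exact procedure used to compute the single-feature error is not given in this section. As a rough stand-in, the sketch below searches over single-threshold rules and reports the smallest balanced error, i.e., the average of the miss rate on churners and the false-alarm rate on non-churners, so that an uninformative feature scores 0.5. Both the restriction to threshold rules and the use of balanced error are assumptions of this example.

```python
def best_threshold_balanced_error(churners, non_churners):
    """Smallest balanced error of any single-threshold rule on one feature.

    churners, non_churners -- lists of feature values for the two groups
    """
    best = 0.5  # error of a rule that ignores the feature
    for t in sorted(set(churners) | set(non_churners)):
        # Rule: predict "churner" when the feature value is >= t.
        miss = sum(1 for v in churners if v < t) / len(churners)
        false_alarm = sum(1 for v in non_churners if v >= t) / len(non_churners)
        err = 0.5 * (miss + false_alarm)
        # 1 - err is the error of the flipped rule (predict churner when < t).
        best = min(best, err, 1.0 - err)
    return best


if __name__ == "__main__":
    print(best_threshold_balanced_error([5, 6, 7], [1, 2, 3]))
    print(best_threshold_balanced_error([1, 2, 3], [1, 2, 3]))
```

Under this stand-in, a feature whose values separate the two groups completely scores 0, and a feature with identical values in both groups scores 0.5.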

Table 6 Description of features used for churn prediction in RPI-CP
Table 7 Basic statistical analysis of features used by the RPI-CP algorithm


About this article

Cite this article

Dyagilev, K., Mannor, S. & Yom-Tov, E. On information propagation in mobile call networks. Soc. Netw. Anal. Min. 3, 521–541 (2013).



  • Mobile call network
  • Dynamic behavior of networks
  • Churn prediction