The dynamic stochastic topic block model for dynamic networks with textual edges

  • Marco Corneli
  • Charles Bouveyron
  • Pierre Latouche
  • Fabrice Rossi
Abstract

This paper develops a probabilistic model to cluster the nodes of a dynamic graph, accounting for both the content of textual edges and their frequency. Vertices are clustered into groups that are homogeneous in terms of both interaction frequency and discussed topics. The dynamic graph is considered stationary on a latent time interval if the proportions of topics discussed between each pair of node groups do not change during that interval. A classification variational expectation–maximization algorithm is adopted to perform inference. A model selection criterion is also derived to select the number of node groups, time clusters and topics. Experiments on simulated data are carried out to assess the proposed methodology. Finally, we illustrate an application to the Enron dataset.
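To make the setting concrete, the following is a minimal toy simulation of a dynamic network with textual edges. It is an illustrative sketch only, not the paper's model specification or inference algorithm. The Poisson interaction counts, the Dirichlet-distributed topic proportions, the toy vocabulary, and all names (`simulate`, `sample_poisson`, `rate`, `theta`) are assumptions introduced here for illustration.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson count with Knuth's method (adequate for small rates)."""
    threshold, count, product = math.exp(-rate), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate(n_nodes=6, n_steps=4, n_words=5, seed=0):
    """Simulate (time, sender, receiver, words) edges from a toy block structure."""
    rng = random.Random(seed)
    Q, D, K = 2, 2, 2  # node groups, time clusters, topics (hypothetical toy sizes)
    # Hypothetical vocabulary: one short word list per topic.
    vocab = [["market", "price", "trade"], ["meeting", "lunch", "friday"]]

    node_group = [rng.randrange(Q) for _ in range(n_nodes)]
    time_cluster = [rng.randrange(D) for _ in range(n_steps)]

    # Assumed parameters: one interaction rate and one vector of topic
    # proportions per (sender group, receiver group, time cluster).
    keys = [(q, r, d) for q in range(Q) for r in range(Q) for d in range(D)]
    rate = {key: rng.uniform(0.2, 2.0) for key in keys}
    theta = {}
    for key in keys:
        gammas = [rng.gammavariate(1.0, 1.0) for _ in range(K)]
        total = sum(gammas)
        theta[key] = [g / total for g in gammas]  # a Dirichlet(1, ..., 1) draw

    edges = []
    for t in range(n_steps):
        for i in range(n_nodes):
            for j in range(n_nodes):
                if i == j:
                    continue  # no self-interactions in this sketch
                key = (node_group[i], node_group[j], time_cluster[t])
                for _ in range(sample_poisson(rate[key], rng)):
                    words = []
                    for _ in range(n_words):
                        # Pick a topic from the block's proportions, then a word.
                        k = rng.choices(range(K), weights=theta[key])[0]
                        words.append(rng.choice(vocab[k]))
                    edges.append((t, i, j, words))
    return node_group, time_cluster, edges


if __name__ == "__main__":
    groups, clusters, edges = simulate()
    print("node groups:", groups)
    print("time clusters:", clusters)
    print("number of textual edges:", len(edges))
```

In this sketch, all time steps assigned to the same time cluster share the same topic proportions for a given pair of node groups, which mirrors the stationarity notion described in the abstract.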

Keywords

  • Dynamic random graph
  • Model based clustering
  • Stochastic block model
  • Topic modeling
  • Latent Dirichlet allocation

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Université Côte d’Azur, Nice, France
  2. Université Paris 1, Paris, France