Science China Mathematics, Volume 61, Issue 4, pp 627–640

Network-based naive Bayes model for social network

  • Danyang Huang
  • Guoyu Guan
  • Jing Zhou
  • Hansheng Wang


Naive Bayes (NB) is one of the most popular classification methods. It is particularly useful when the dimension of the predictor is high and the data are generated independently. Meanwhile, social network data are becoming increasingly accessible owing to the rapid development of social network services and websites. In contrast to the independence assumed by NB, data generated by a social network are most likely to be dependent, and the dependency is mainly determined by the social network relationships. How to extend the classical NB method to social network data therefore becomes a problem of great interest. To this end, we propose a network-based naive Bayes (NNB) method, which generalizes the classical NB model to social network data. The key advantage of the NNB method is that it takes the network relationships into consideration. Its computational efficiency makes the NNB method feasible even for large-scale social networks. The statistical properties of the NNB model are investigated theoretically. Simulation studies are conducted to demonstrate its finite-sample performance. A real data example is also analyzed for illustration.


Keywords: classification, naive Bayes, Sina Weibo, social network data


MSC: 62H30, 91D30





This work was supported by National Natural Science Foundation of China (Grant Nos. 11701560, 11501093, 11631003, 11690012, 71532001, 11525101), the Fundamental Research Funds for the Central Universities and the Research Funds of Renmin University of China (Grant No. 16XNLF01), the Beijing Municipal Social Science Foundation (Grant No. 17GLC051), Fund for Building World-Class Universities (Disciplines) of Renmin University of China, the Fundamental Research Funds for the Central Universities (Grant Nos. 130028613, 130028729 and 2412017FZ030), China’s National Key Research Special Program (Grant No. 2016YFC0207700) and Center for Statistical Science at Peking University.



Copyright information

© Science China Press and Springer-Verlag GmbH Germany, part of Springer Nature 2017

Authors and Affiliations

  • Danyang Huang (1)
  • Guoyu Guan (2)
  • Jing Zhou (1)
  • Hansheng Wang (3)

  1. School of Statistics, Renmin University of China, Beijing, China
  2. KLAS of MOE, and School of Economics, Northeast Normal University, Changchun, China
  3. Guanghua School of Management, Peking University, Beijing, China
