Using Web Mining and Social Network Analysis to Study The Emergence of Cyber Communities In Blogs

  • Michael Chau
  • Jennifer Xu
Part of the Integrated Series In Information Systems book series (ISIS, volume 18)

Blogs have become increasingly popular in recent years. Bloggers can express their opinions and emotions more freely and easily than before. Many communities have emerged in the blogosphere, including racist and hate groups that are trying to share their ideology, express their views, or recruit new group members. It is imperative to analyze these cyber communities in order to monitor for activities that are potentially harmful to society. Web mining and social network analysis techniques, which have been widely used to analyze the content and structure of Web sites of hate groups on the Internet, have not been applied to the study of hate groups in blogs. In this research, we present a framework consisting of four components (a blog spider, information extraction, network analysis, and visualization) to address this problem (Chau & Xu, 2007). We applied this framework to identify and analyze a selected set of 28 anti-Black hate groups on Xanga, one of the most popular blog hosting sites. Our analysis revealed some interesting demographic and topological characteristics of these groups, and identified at least two large communities on top of the smaller ones. We suggest that our framework can be generalized and applied to blog analysis in other domains.
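
As a rough illustration of the network analysis component, the following minimal Python sketch computes two of the topological measures named in the keywords, the giant component and the average shortest path length. It is not the authors' implementation: the edge list is hypothetical toy data standing in for links between bloggers, the links are assumed to be undirected, and all function names are invented for this example.

```python
from collections import deque

# Hypothetical toy data: undirected links between bloggers (not from the study).
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("e", "f")]


def build_graph(edges):
    """Build an undirected adjacency map from an edge list."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def bfs_distances(graph, source):
    """Return hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def connected_components(graph):
    """Return the connected components of the graph as a list of node sets."""
    seen, components = set(), []
    for node in graph:
        if node not in seen:
            component = set(bfs_distances(graph, node))
            seen |= component
            components.append(component)
    return components


def average_shortest_path_length(graph, nodes):
    """Average hop distance over all ordered pairs within one component."""
    total, pairs = 0, 0
    for source in nodes:
        dist = bfs_distances(graph, source)
        for target in nodes:
            if target != source:
                total += dist[target]
                pairs += 1
    return total / pairs if pairs else 0.0


graph = build_graph(edges)
# The giant component is taken here to be the largest connected component.
giant = max(connected_components(graph), key=len)
avg_len = average_shortest_path_length(graph, giant)
print(sorted(giant), round(avg_len, 3))
```

In this toy graph the largest component contains four of the six bloggers. On real crawled data the same measures would be computed on the network produced by the spider and extraction steps.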


Social Network Analysis; Hate Crime; Giant Component; Average Short Path Length; White Supremacist
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Albert, R. & Barabási, A.-L. (2002). Statistical Mechanics of Complex Networks. Reviews of Modern Physics 74(1), 47-97.
  2. Albert, R., Jeong, H., & Barabási, A.-L. (2000). Error and Attack Tolerance of Complex Networks. Nature 406, 378-382.
  3. Alexa (2005). Top English Language Sites. [Online] Retrieved from on October 7, 2005.
  4. Anti-Defamation League (2001). Poisoning the Web: Hatred Online. [Online] Retrieved from on October 7, 2005.
  5. Barabási, A.-L., Albert, R., & Jeong, H. (1999). Mean-Field Theory for Scale-Free Random Networks. Physica A 272, 173-187.
  6. Blazak, R. (2001). White Boys to Terrorist Men: Target Recruitment of Nazi Skinheads. American Behavioral Scientist 44(6), 982-1000.
  7. Blood, R. (2004). How Blogging Software Reshapes the Online Community. Communications of the ACM 47(12), 53-55.
  8. Bollobás, B. (1985). Random Graphs. London, Academic.
  9. Brin, S. & Page, L. (1998). The Anatomy of a Large-Scale Hypertextual Web Search Engine. Proceedings of the 7th WWW Conference, Brisbane, Australia, April 1998.
  10. Burris, V., Smith, E., & Strahm, A. (2000). White Supremacist Networks on the Internet. Sociological Focus 33(2), 215-235.
  11. Chau, M. & Chen, H. (2003). Personalized and Focused Web Spiders, in N. Zhong, J. Liu, & Y. Yao (Eds), Web Intelligence, Springer-Verlag, 197-217.
  12. Chau, M., Shiu, B., Chan, I., & Chen, H. (2005). Automated Identification of Web Communities for Business Intelligence Analysis, in Proceedings of the Fourth Workshop on E-Business (WEB 2005), Las Vegas, USA, December, 2005.
  13. Chau, M., Shiu, B., Chan, I., & Chen, H. (2007). Redips: Backlink Search and Analysis on the Web for Business Intelligence, Journal of the American Society for Information Science and Technology 58(3), 351-365.
  14. Chau, M. & Xu, J. (2007). Mining Communities and Their Relationships in Blogs: A Study of Online Hate Groups, International Journal of Human-Computer Studies 65(1), 57-70.
  15. Chen, H. & Chau, M. (2004). Web Mining: Machine Learning for Web Applications, Annual Review of Information Science and Technology 38, 289-329.
  16. Chen, H., Chung, W., Xu, J., Wang, G., Qin, Y., & Chau, M. (2004). Crime Data Mining: A General Framework and Some Examples. IEEE Computer 37(4), 50-56.
  17. Cheong, F. C. (1996). Internet Agents: Spiders, Wanderers, Brokers, and Bots. Indianapolis, Indiana, USA: New Riders Publishing.
  18. CNN (1999). Hate Group Web Sites on the Rise, CNN News [Online] Retrieved from on October 7, 2005.
  19. Crucitti, P., Latora, V., Marchiori, M., & Rapisarda, A. (2003). Efficiency of Scale-Free Networks: Error and Attack Tolerance. Physica A 320, 622-642.
  20. Flake, G. W., Lawrence, S., & Giles, C. L. (2000). Efficient Identification of Web Communities. In Proceedings of the 6th International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD 2000), Boston, MA.
  21. Flake, G. W., Lawrence, S., Giles, C. L., & Coetzee, F. M. (2002). Self-Organization and Identification of Web Communities. IEEE Computer 35(3), 66-71.
  22. Franklin, R. A. (2005). The Hate Directory. [Online] Retrieved from on October 7, 2005.
  23. Freeman, L. C. (1979). Centrality in Social Networks: Conceptual Clarification. Social Networks 1, 215-240.
  24. Freeman, L. C. (2000). Visualizing Social Networks. Journal of Social Structure 1(1).
  25. Fruchterman, T. M. J. & Reingold, E. M. (1991). Graph Drawing by Force-Directed Placement. Software-Practice & Experience 21(11), 1129-1164.
  26. Gerstenfeld, P. B., Grant, D. R., & Chiang, C. P. (2003). Hate Online: A Content Analysis of Extremist Internet Sites. Analyses of Social Issues and Public Policy 3, 29-44.
  27. Gibson, D., Kleinberg, J., & Raghavan, P. (1998). Inferring Web Communities from Link Topology. In Proceedings of the 9th ACM Conference on Hypertext and Hypermedia, Pittsburgh, PA.
  28. Girvan, M. & Newman, M. E. J. (2002). Community Structure in Social and Biological Networks. Proceedings of the National Academy of Sciences of the United States of America 99, 7821-7826.
  29. Glaser, J., Dixit, J., & Green, D. P. (2002). Studying Hate Crime with the Internet: What Makes Racists Advocate Racial Violence? Journal of Social Issues 58(1), 177-193.
  30. Hof, R. (2005). Blogs on Ice: Signs of a Business Model? Business Week Online - The Tech Beat, June 2, 2005. [Online] Retrieved from the_thread/techbeat/archives/2005/06/ blogs_on_ice_si.html on October 7, 2005.
  31. Kleinberg, J. (1998). Authoritative Sources in a Hyperlinked Environment, in Proceedings of the 9th ACM-SIAM Symposium on Discrete Algorithms, San Francisco, California, USA, Jan 1998, pp. 668-677.
  32. Kosala, R. & Blockeel, H. (2000). Web Mining Research: A Survey. ACM SIGKDD Explorations 2(1), 1-15.
  33. Krebs, V. E. (2001). Mapping Networks of Terrorist Cells. Connections 24(3), 43-52.
  34. Krupka, G. R. & Hausman, K. (1998). IsoQuest Inc.: Description of the NetOwl™ extractor system as used for MUC-7, in Proceedings of the Seventh Message Understanding Conference, April 1998.
  35. Kruskal, J. B. & Wish, M. (1978). Multidimensional Scaling. Beverly Hills, CA, Sage Publications.
  36. Kumar, R., Raghavan, P., Rajagopalan, S., & Tomkins, A. (1999). Trawling the Web for Emerging Cyber-Communities. Computer Networks 31(11-16), 1481-1493.
  37. Kumar, R., Raghavan, P., Rajagopalan, S., & Tomkins, A. (2002). The Web and Social Networks. IEEE Computer 35(11), 32-36.
  38. Lee, E., & Leets, L. (2002). Persuasive Storytelling by Hate Groups Online: Examining Its Effects on Adolescents. American Behavioral Scientist 45, 927-957.
  39. Levin, J., & McDevitt, J. (1993). Hate Crimes: The Rising Tide of Bigotry and Bloodshed. New York: Plenum.
  40. Nardi, B. A., Schiano, D. J., Gumbrecht, M., & Swartz, L. (2004). Why We Blog. Communications of the ACM 47(12), 41-46.
  41. Radicchi, F., Castellano, C., Cecconi, F., Loreto, V., & Parisi, D. (2004). Defining and Identifying Communities in Networks. Proceedings of the National Academy of Sciences of the United States of America 101, 2658-2663.
  42. Sparrow, M. K. (1991). The Application of Network Analysis to Criminal Intelligence: An Assessment of the Prospects. Social Networks 13, 251-274.
  43. Wasserman, S. & Faust, K. (1994). Social Network Analysis: Methods and Applications. Cambridge, Cambridge University Press.
  44. Watts, D. J. & Strogatz, S. H. (1998). Collective Dynamics of ‘Small-World’ Networks. Nature 393, 440-442.
  45. White, H. C., Boorman, S. A., & Breiger, R. L. (1976). Social Structure from Multiple Networks: I. Blockmodels of Roles and Positions. American Journal of Sociology 81, 730-780.
  46. Xu, J. J. & Chen, H. (2004). Fighting Organized Crime: Using Shortest-Path Algorithms to Identify Associations in Criminal Networks. Decision Support Systems 38(3), 473-487.
  47. Xu, J. J. & Chen, H. (2005). CrimeNet Explorer: A Framework for Criminal Network Knowledge Discovery. ACM Transactions on Information Systems 23(2), 201-226.
  48. Zhou, Y., Reid, E., Qin, J., Chen, H., & Lai, G. (2005). US Domestic Extremist Groups on the Web: Link and Content Analysis. IEEE Intelligent Systems 20(5), 44-51.

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Michael Chau (1)
  • Jennifer Xu (2)
  1. School of Business, The University of Hong Kong, China
  2. Department of Computer Information Systems, Bentley College, Waltham, USA
