Using Web Mining and Social Network Analysis to Study the Emergence of Cyber Communities in Blogs
Blogs have become increasingly popular in recent years, and bloggers can express their opinions and emotions more freely and easily than before. Many communities have emerged in the blogosphere, including racist and hate groups that try to share their ideology, express their views, or recruit new members. It is imperative to analyze these cyber communities in order to monitor for activities that are potentially harmful to society. Web mining and social network analysis techniques have been widely used to analyze the content and structure of hate group Web sites on the Internet, but they have not been applied to the study of hate groups in blogs. In this research, we present a framework consisting of four components (a blog spider, information extraction, network analysis, and visualization) to address this problem (Chau & Xu, 2007). We applied this framework to identify and analyze a selected set of 28 anti-Black hate groups on Xanga, one of the most popular blog hosting sites. Our analysis revealed some interesting demographic and topological characteristics of these groups, and identified at least two large communities in addition to the smaller ones. We suggest that our framework can be generalized and applied to blog analysis in other domains.
Keywords: Social Network Analysis; Hate Crime; Giant Component; Average Short Path Length; White Supremacist
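Two of the topological measures named in the keywords, the giant component and the average shortest path length, can be illustrated with a minimal sketch. It is not the implementation used in this research: the edge list is a hypothetical set of links between bloggers, and all function names are assumptions introduced for illustration only.

```python
from collections import deque

def build_undirected(edges):
    """Build an adjacency map from (blogger, blogger) link pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def bfs_distances(adj, source):
    """Return hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def giant_component(adj):
    """Return the node set of the largest connected component."""
    seen, best = set(), set()
    for node in adj:
        if node in seen:
            continue
        component = set(bfs_distances(adj, node))
        seen |= component
        if len(component) > len(best):
            best = component
    return best

def average_shortest_path_length(adj, nodes):
    """Mean hop distance over all ordered pairs inside one component."""
    total, pairs = 0, 0
    for source in nodes:
        dist = bfs_distances(adj, source)
        for target in nodes:
            if target != source:
                total += dist[target]
                pairs += 1
    return total / pairs if pairs else 0.0

# Hypothetical links between bloggers (not data from the study).
EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")]
ADJ = build_undirected(EDGES)
GIANT = giant_component(ADJ)
AVG = average_shortest_path_length(ADJ, GIANT)
print(sorted(GIANT), round(AVG, 3))
```

In this toy network the chain a-b-c-d forms the giant component, while x-y remains a separate, smaller group. The path length is computed only within the giant component because distances between disconnected nodes are undefined.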