Dark Web Forum Portal
In recent years, there have been numerous studies from a variety of perspectives analyzing the Internet presence of hate and extremist groups. Yet the web sites and forums of extremist and terrorist groups have long remained an underutilized resource for terrorism researchers due to their ephemeral nature and persistent access and analysis problems. The purpose of the Dark Web archive, therefore, is to provide a research infrastructure for use by social scientists, computer and information scientists, policy and security analysts, and others studying a wide range of social and organizational phenomena and computational problems. The Dark Web Forum Portal provides web-enabled access to critical international jihadist web forums. This chapter focuses on significant extensions to previous work, including: increasing the scope of our data collection; adding an incremental spidering component for regular data updates; enhancing the searching and browsing functions; enhancing multilingual machine translation for Arabic, French, German, and Russian; and adding advanced social network analysis. The chapter closes with a case study on identifying active participants.
Keywords: Social Network Analysis; Machine Translation; Security Analyst; List Page; Forum Data
This work is supported by the NSF Computer and Network Systems (CNS) Program (CNS-0709338), September 2007–August 2010, and HDTRA1-09-1-0058, July 2009–July 2012. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or DOD.