Abstract
A key counterterrorism problem is how to identify people who should be added to a watchlist even though they have no direct communication with its members. One of the main ways a watchlist is expanded is by monitoring the emergence of new persons who establish contact with those on the list. Unfortunately, this severely limits the time horizon for managing the risks of dark network behaviors, because by then the individuals are already actively involved with one another and are more likely to be discussing and planning terrorist actions. In contrast, a wider time horizon results from identifying individuals who do not yet communicate with watchlist members but whose semantic networks are highly similar to theirs. Discussion forums are considered a primary source of intelligence about plans for dark behaviors. The research reported here develops a method for locating individuals in discussion forums whose semantic networks are highly similar to a reference network, which is based either on watchlist members’ observed message content or on other standards, such as radical jihadists’ semantic networks extracted from messages they disseminate on the internet. This research demonstrates the method using a Pakistani discussion forum with diverse content. Of the pairs of individuals with highly similar semantic networks, 61% have no direct contact in the forum. Adding individuals who closely match a reference semantic network to watchlists is therefore likely to lengthen the time horizon for identifying high-risk dark behaviors.
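The matching idea in the abstract can be illustrated with a minimal sketch. It is not the chapter's WORDij implementation: the window size of 3, the use of cosine similarity as the match score, the 0.5 threshold, and all function and parameter names are assumptions chosen for illustration. Each author's semantic network is represented as counts of word pairs that co-occur within a short window, and each author is scored against a reference network; the output also flags whether the author already has direct contact with a watchlist member.

```python
from collections import Counter
from math import sqrt


def word_pair_network(text, window=3):
    """Count directed word pairs that co-occur within `window` positions."""
    words = text.lower().split()
    pairs = Counter()
    for i, word in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                pairs[(word, other)] += 1
    return pairs


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-pair count vectors."""
    dot = sum(count * b[pair] for pair, count in a.items() if pair in b)
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(reference_text, messages_by_author, contacts, threshold=0.5):
    """Return (author, similarity, has_contact) for authors at or above threshold.

    `contacts` is a hypothetical set of authors already in direct contact
    with watchlist members; it is only used to label the results.
    """
    reference = word_pair_network(reference_text)
    results = []
    for author, messages in messages_by_author.items():
        network = Counter()
        for message in messages:
            # Build pairs per message so windows never span two messages.
            network.update(word_pair_network(message))
        score = cosine_similarity(reference, network)
        if score >= threshold:
            results.append((author, score, author in contacts))
    return sorted(results, key=lambda row: row[1], reverse=True)
```

In this sketch, the authors of interest are those returned with `has_contact` equal to `False`: they match the reference network closely but have not yet communicated with anyone on the list.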
References
Batagelj, V., Mrvar, A.: Pajek program for analysis and visualization of large networks. Version 2.0 Reference Manual. Ljubljana, Slovenia, University of Ljubljana (2010)
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
Borgatti, S.P., Everett, M.G., Freeman, L.C.: UCINet for Windows: Software for Social Network Analysis [computer program]. Analytic Technologies, Harvard, MA (2002)
Borgatti, S.P.: NetDraw: Graph visualization software [computer program]. Analytic Technologies, Harvard, MA (2002)
Budanitsky, A., Hirst, G.: Evaluating WordNet-based measures of lexical semantic relatedness. Comput. Linguist. 32(1), 13–53 (2006)
Bunt, G.R.: iMuslims: Rewiring the House of Islam. University of North Carolina Press, Chapel Hill, NC (2009)
Burgoon, M., Miller, M.D., Cohen, M., Montgomery, C.L.: An empirical test of a model of resistance to persuasion. Human Commun. Res. 5(1), 27–39 (1978)
Burt, R.S.: Cohesion versus structural equivalence as a basis for network subgroups. Sociol. Methods Res. 7, 189–211 (1978)
Chang, J.: Package ‘lda’: Collapsed Gibbs sampling methods for topic models [computer program]. Princeton University, Princeton, NJ (2010). Retrieved August 30, 2010 from http://cran.r-project.org/web/packages/lda/
Chaski, C.E.: Empirical evaluations of language-based author identification techniques. Forensic Linguistics 8(1), 1350–1771 (2001)
Chen, H., Yang, C. (eds.): Terrorism informatics: Knowledge management and data mining for homeland security. Springer, New York, NY (2008)
Danowski, J.A., Martin, T.H.: Evaluating the health of information science: Research community and user contexts. Final report to the Division of Information Science of the National Science Foundation, no. IST78-21130 (1979)
Danowski, J.A.: A network-based content analysis methodology for computer-mediated communication: An illustration with a computer bulletin board. In: Burgoon, M. (ed.) Communication Yearbook 5, pp. 904–925. Transaction Books, New Brunswick, NJ (1982)
Danowski, J.A.: Interpersonal network radiality and mass and non-mass media behaviors. In: Gumpert, G., Cathcart, R. (eds.) Inter-Media, 3rd edn., pp. 168–175. Oxford University Press, New York (1986)
Danowski, J.A.: Organizational infographics and automated auditing: Using computers to unobtrusively gather and analyze communication. In: Goldhaber, G., Barnett, G. (eds.) Handbook of organizational communication, pp. 335–384. Ablex, Norwood, NJ (1988)
Danowski, J.A.: A mathematical model for selection based on individuals’ semantic fit with the organization’s aggregate semantic network in high performance units. Presented to the Speech Communication Association, Chicago (1990)
Danowski, J.A.: WORDij: A word-pair approach to information retrieval. Proceedings of the DARPA/NIST TREC Conference, pp. 131–136. National Institute of Standards and Technology, Washington, DC (1993a)
Danowski, J.A.: Network analysis of message content. In: Barnett, G., Richards, W. (eds.) Progress in communication sciences XII, pp. 197–222. Ablex, Norwood, NJ (1993b)
Danowski, J.A.: Evaluative word locations in semantic networks from news stories about Al Qaeda and implications for optimal communication messages in anti-terrorism campaigns. Paper presented to the conference: EuroISI2008: European Conference on Intelligence and Security Informatics, Esbjerg, Denmark (2008)
Danowski, J.A.: WORDij 3.0 [computer program]. University of Illinois at Chicago, Chicago (2010). Available at http://wordij.net
Danowski, J.A., Ruchinskas, J.E.: Period, cohort, and age effects: A study of television exposure in presidential election campaigns, 1952–1980. Communic. Res. 10(1), 77–96 (1983)
Deerwester, S.C., Dumais, S.T., Landauer, T.K., Furnas, G.W., Harshman, R.A.: Indexing by latent semantic analysis. J. Am. Soc. Inform. Sci. 41(6), 391–407 (1990)
Ess, C., and the AoIR ethics working committee: Ethical decision-making and Internet research: Recommendations from the aoir ethics working committee (2002). Retrieved on December 1, 2010 from http://www.aoir.org/reports/ethics.pdf
Freeman, L.C.: Centrality in social networks: Conceptual clarification. Soc. Netw. 1, 215–239 (1979)
Gehler, P.V.: pLSA: Probabilistic Latent Semantic Analysis [computer program]. Max Planck Institute for Biological Cybernetics, Tübingen, Germany (n.d.). Retrieved August 31, 2010 from http://www.kyb.mpg.de/bs/people/pgehler/code/index.html
Girvan, M., Newman, M.E.J.: Community structure in social and biological networks. PNAS 99(12), 7821–7826 (2002)
Harish, P., Vineet, V., Narayanan, P.J.: Large graph algorithms for massively multithreaded architectures. Report No: IIIT/TR/2009/74. Hyderabad, India: Center for Visual Information Technology, International Institute of Information Technology (2009). Retrieved September 10, 2010 from http://iiit.ac.in
Hocking, J.E., Margreiter, D.G., Hylton, C.: Intra-audience effects: A field test. Hum. Commun. Res. 3(3), 243–249 (1977)
Hofmann, T.: Probabilistic latent semantic indexing. In: Proceedings of the Twenty-Second Annual International SIGIR Conference (1999)
Horton, D., Wohl, R.: Mass communication and parasocial interaction: Observations on intimacy at a distance. Psychiatry 19, 215–229 (1956)
Howard, P.N.: Network ethnography and the hypermedia organization: New media, new organizations, new methods. New Media Soc. 4(4), 550–574 (2002)
Hubert, L., Schultz, J.: Quadratic Assignment as a general data analysis strategy. Br. J. Math. Stat. Psychol. 29, 190–241 (1976)
Jiang, J.J., Conrath, D.W.: Semantic similarity based on corpus statistics and lexical taxonomy. Proceedings of International Conference on Research in Computational Linguistics, Taiwan (1997)
Kay, P., Kempton, W.: What is the Sapir-Whorf Hypothesis? Am. Anthropol. 86(1), 65–79 (1984)
Kelman, H.C.: Compliance, identification, and internalization: Three processes of attitude change. J. Conflict. Resolut. 2(1), 51–60 (1958)
Kincaid, D.L.: The convergence theory of communication, self-organization and cultural evolution. In: Kincaid, D.L. (ed.) Communication theory: Eastern and Western perspectives, pp. 209–221. Academic, New York (1987)
Krackhardt, D.: Predicting with social networks: Nonparametric multiple regression analysis of dyadic data. Soc. Netw. 10, 359–382 (1988)
Li, J., Zheng, R., Chen, H.: From fingerprint to writeprint. Commun. ACM 49(4), 76–82 (2006)
Marques, J.M., Abrams, D., Paez, D., Hogg, M.A.: Social categorization, social identification, and rejection of deviant group members. In: Blackwell Handbook of Social Psychology: Group Processes, vol. 3, pp. 400–420 (2001)
McCallum, A.K.: MALLET: A machine learning for language toolkit (2002). Retrieved August 29, 2010 from http://mallet.cs.umass.edu
Mirzal, A., Furukawa, M.: Node-context network clustering using parafac tensor decomposition. Proceedings of the 4th International Conference Information & Communication Technology and Systems, pp. 283–388 (2010)
Monge, P.R., Eisenberg, E.M.: Emergent communication networks. In: Jablin, F., Putnam, L.L., Roberts, K., Porter, L. (eds.) Handbook of organizational communication, pp. 304–342. Sage, Newbury Park, CA (1987)
Pennebaker, J.W., Booth, R.J., Francis, M.E.: Linguistic inquiry and word count (LIWC). [computer program]. Liwc, Austin, TX (2007)
Raskin, R., Shaw, R.: Narcissism and the use of personal pronouns. J Pers. 56(2), 393–404 (1988)
Resnik, P.: Using information content to evaluate semantic similarity in a taxonomy. In: Proceedings of the 14th International Joint Conference on Artificial Intelligence, pp. 448–453, Montreal (1995)
Richards, W.D., Jr.: A manual for network analysis: Using the NEGOPY network analysis program. ERIC, ED #114110 (1975)
Riopelle, K., Danowski, J.A., Bishop, A.: Expression of sentiment by different node positions in email networks. Paper presented to the annual meetings of the International Network for Social Network Analysis, Riva del Garda, Italy, 29 June–4 July (2010)
Rogers, E.M., Bhowmik, D.K.: Homophily-heterophily: Relational concepts for communication research. The Public Opinion Quarterly 34(4), 523–538 (1970)
Rogers, T.B., Kuiper, N.A., Kirker, W.S.: Self-reference and the encoding of personal information. J. Pers. Soc. Psychol. 35(9), 677–688 (1977)
Stone, P.J., Bales, R.F., Namenwirth, Z., Ogilvie, D.M.: The General Inquirer: A computer system for content analysis and retrieval based on the sentence as a unit of information. Behav. Sci. 7(4), 484–498 (1962)
Thelwall, M.: Extracting macroscopic information from Web links. J. Am. Soc. Inf. Sci. Technol. 52(13), 1157–1168 (2001)
Office of the Inspector General: The Federal Bureau of Investigation’s Terrorist Watchlist nomination practices. U.S. Department of Justice, Audit Division, Audit Report 09-25, May (2009)
Wallach, H.M.: Topic modeling: Beyond bag-of-words. In: Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA (2006)
Acknowledgements
I am grateful to Rafal Radulski, Mike Hutcheson, and Brittany Johnson at the University of Illinois at Chicago for programming support; to Jonathan Chang, computer scientist at Facebook and Princeton, for help in executing his LDA code in R; to Michael W. Berry for resolving some computational problems with a Matlab pLSI script; and to Gabriel Weimann, University of Haifa, for providing translated radical jihadist texts in response to Obama’s June 4, 2009 Cairo speech. Version 3.0 of the WORDij software used in this research was supported in part by the National Science Foundation’s Human and Social Dynamics (HSD) Award #SES-527487.
Copyright information
© 2011 Springer-Verlag/Wien
About this chapter
Cite this chapter
Danowski, J.A. (2011). Counterterrorism Mining for Individuals Semantically-Similar to Watchlist Members. In: Wiil, U.K. (ed.) Counterterrorism and Open Source Intelligence. Lecture Notes in Social Networks. Springer, Vienna. https://doi.org/10.1007/978-3-7091-0388-3_12
DOI: https://doi.org/10.1007/978-3-7091-0388-3_12
Publisher Name: Springer, Vienna
Print ISBN: 978-3-7091-0387-6
Online ISBN: 978-3-7091-0388-3
eBook Packages: Computer Science, Computer Science (R0)