Counterterrorism Mining for Individuals Semantically-Similar to Watchlist Members

Danowski, James A.

doi:10.1007/978-3-7091-0388-3_12

Part of the book series: Lecture Notes in Social Networks (LNSN)

Abstract

A key counterterrorism problem is how to identify people who should be added to a watchlist even though they have no direct communication with its members. One of the main ways a watchlist is expanded is by monitoring the emergence of new persons who establish contact with those on the list. Unfortunately, this approach severely limits the time horizon for managing the risks of dark network behaviors: by the time contact is observed, individuals are already actively involved with one another and are more likely to be discussing and planning terrorist actions. In contrast, a wider time horizon results from identifying individuals who do not yet communicate with watchlist members but whose semantic networks are highly similar to theirs. Discussion forums are considered a primary source of intelligence about plans for dark behaviors. The research reported here develops a method for locating individuals in discussion forums whose semantic networks are highly similar to a reference network, based either on watchlist members’ observed message content or on other standards, such as radical jihadists’ semantic networks extracted from messages they disseminate on the internet. The method is demonstrated using a Pakistani discussion forum with diverse content. Of the pairs of individuals with highly similar semantic networks, 61% have no direct contact in the forum. Adding individuals who closely match a reference semantic network to watchlists is therefore likely to lengthen the time horizon for identifying high-risk dark behaviors.
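
The matching idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how semantic networks are built or compared, so everything here is an assumption made for illustration: word pairs are counted within a short sliding window, networks are compared by cosine similarity, and the function names, the `window` and `threshold` parameters, and the toy posts are all hypothetical. It should not be read as the chapter's actual procedure.

```python
from collections import Counter
from math import sqrt


def semantic_network(text, window=3):
    """Count unordered word pairs that co-occur within a span of `window` consecutive words.

    Assumption: whitespace tokenization and a fixed window; a real study would
    likely add stop-word removal, stemming, and frequency cutoffs.
    """
    words = text.lower().split()
    pairs = Counter()
    for i, w in enumerate(words):
        for v in words[i + 1 : i + window]:
            if v != w:
                pairs[tuple(sorted((w, v)))] += 1
    return pairs


def cosine(a, b):
    """Cosine similarity between two weighted word-pair networks (an assumed measure)."""
    dot = sum(weight * b[pair] for pair, weight in a.items() if pair in b)
    norm_a = sqrt(sum(x * x for x in a.values()))
    norm_b = sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def candidates(posts, reference_text, contacts, watchlist, threshold=0.5):
    """Return (author, score) for authors who match the reference network
    but have no direct contact with any watchlist member.

    posts:     dict mapping author -> concatenated message text
    contacts:  set of frozenset({author_1, author_2}) for pairs in direct contact
    threshold: hypothetical similarity cutoff, not taken from the chapter
    """
    ref = semantic_network(reference_text)
    found = []
    for author, text in posts.items():
        if author in watchlist:
            continue
        score = cosine(semantic_network(text), ref)
        in_contact = any(frozenset((author, m)) in contacts for m in watchlist)
        if score >= threshold and not in_contact:
            found.append((author, round(score, 3)))
    return sorted(found, key=lambda item: -item[1])


if __name__ == "__main__":
    # Toy, invented forum posts; "a" stands in for a watchlist member.
    posts = {
        "a": "the march begins at dawn near the river",
        "b": "the march begins at dawn near the bridge",
        "c": "recipes for mango chutney and spiced rice",
    }
    print(candidates(posts, posts["a"], contacts=set(), watchlist={"a"}))
```

Filtering on the contact set is what separates this from contact-based expansion: an author is flagged because of what they write, not because of whom they have already replied to.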

Acknowledgements

I am grateful to Rafal Radulski, Mike Hutcheson, and Brittany Johnson at the University of Illinois at Chicago for programming support; to Jonathan Chang, computer scientist at Facebook and Princeton, for help in executing his LDA code in R; to Michael W. Berry for resolving some computational problems with a Matlab pLSI script; and to Gabriel Weimann, University of Haifa, for providing translated radical jihadist texts in response to Obama’s June 4, 2009 Cairo speech. The version 3.0 WORDij software used in this research was supported in part by the National Science Foundation’s Human and Social Dynamics (HSD) Award #SES-527487.

Author information

Authors and Affiliations

Department of Communication, University of Illinois at Chicago, Chicago, IL, USA
James A. Danowski

Corresponding author

Correspondence to James A. Danowski.

Editor information

Editors and Affiliations

The Maersk McKinney Moller Institute, University of Southern Denmark, Campusvej 55, 5230, Odense, Denmark
Uffe Kock Wiil

About this chapter

Cite this chapter

Danowski, J.A. (2011). Counterterrorism Mining for Individuals Semantically-Similar to Watchlist Members. In: Wiil, U.K. (ed.) Counterterrorism and Open Source Intelligence. Lecture Notes in Social Networks. Springer, Vienna. https://doi.org/10.1007/978-3-7091-0388-3_12

DOI: https://doi.org/10.1007/978-3-7091-0388-3_12
Published: 26 May 2011
Publisher Name: Springer, Vienna
Print ISBN: 978-3-7091-0387-6
Online ISBN: 978-3-7091-0388-3
eBook Packages: Computer Science, Computer Science (R0)
