Counterterrorism Mining for Individuals Semantically-Similar to Watchlist Members

Counterterrorism and Open Source Intelligence

Part of the book series: Lecture Notes in Social Networks (LNSN)

Abstract

A key counterterrorism problem is how to identify people who should be added to a watchlist even though they have no direct communication with its members. One of the main ways a watchlist is expanded is by monitoring the emergence of new persons who establish contact with those on the list. Unfortunately, this severely limits the time horizon for managing risks of dark network behaviors, because by then the individuals are already actively involved with one another and are more likely to be discussing and planning terrorist actions. In contrast, a wider time horizon results from identifying individuals who do not yet communicate with watchlist members but who have highly similar semantic networks. Discussion forums are considered a primary source of intelligence about plans for dark behaviors. The research reported here develops a method for locating individuals in discussion forums whose semantic networks are highly similar to some reference network, based either on watchlist members’ observed message content or on other standards, such as radical jihadists’ semantic networks extracted from messages they disseminate on the internet. This research demonstrates the method using a Pakistani discussion forum with diverse content. Of those pairs of individuals with highly similar semantic networks, 61% have no direct contact in the forum. Adding individuals who closely match a reference semantic network to watchlists is therefore likely to lengthen the time horizon for identifying high-risk dark behaviors.
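
The matching idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how semantic networks are built or compared, so everything below is an assumption made for illustration: each author's network is approximated by counts of ordered word pairs that co-occur within a short window, and networks are compared with cosine similarity. The function names `word_pair_network`, `cosine_similarity`, and `rank_by_similarity` and the `window` parameter are hypothetical and are not taken from the chapter or from any named software.

```python
from collections import Counter
from math import sqrt


def word_pair_network(text, window=3):
    """Count ordered word pairs that co-occur within `window` words.

    Assumption: whitespace tokenization and a fixed window stand in for
    whatever preprocessing the chapter actually uses.
    """
    words = text.lower().split()
    pairs = Counter()
    for i, first in enumerate(words):
        for second in words[i + 1 : i + 1 + window]:
            pairs[(first, second)] += 1
    return pairs


def cosine_similarity(a, b):
    """Cosine similarity between two sparse pair-count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty network matches nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(reference_text, messages_by_author, window=3):
    """Rank authors by similarity of their network to a reference network."""
    reference = word_pair_network(reference_text, window)
    scores = {}
    for author, messages in messages_by_author.items():
        network = Counter()
        for message in messages:
            # Build pairs per message so pairs never span two messages.
            network.update(word_pair_network(message, window))
        scores[author] = cosine_similarity(reference, network)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Under these assumptions, authors near the top of the ranking who have no reply or contact tie to reference authors would correspond to the "no direct contact" pairs the abstract reports. Any similarity threshold for "highly similar" would need to come from the chapter itself.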

Acknowledgements

I am grateful to Rafal Radulski, Mike Hutcheson, and Brittany Johnson at the University of Illinois at Chicago for programming support; to Jonathan Chang, computer scientist at Facebook and Princeton, for help in executing his LDA code in R; to Michael W. Berry for resolving some computational problems with a Matlab pLSI script; and to Gabriel Weimann, University of Haifa, for providing translated radical jihadist texts written in response to Obama’s June 4, 2009 Cairo speech. The version 3.0 WORDij software used in this research was supported in part by the National Science Foundation’s Human and Social Dynamics (HSD) Award #SES-527487.

Author information

Correspondence to James A. Danowski.

Copyright information

© 2011 Springer-Verlag/Wien

About this chapter

Cite this chapter

Danowski, J.A. (2011). Counterterrorism Mining for Individuals Semantically-Similar to Watchlist Members. In: Wiil, U.K. (eds) Counterterrorism and Open Source Intelligence. Lecture Notes in Social Networks. Springer, Vienna. https://doi.org/10.1007/978-3-7091-0388-3_12

  • DOI: https://doi.org/10.1007/978-3-7091-0388-3_12

  • Publisher Name: Springer, Vienna

  • Print ISBN: 978-3-7091-0387-6

  • Online ISBN: 978-3-7091-0388-3

  • eBook Packages: Computer Science, Computer Science (R0)
