Abstract
Recent years have seen an explosion in the number and scale of digital communities (e.g., peer-to-peer file sharing systems, chat applications and social networking sites). Unfortunately, digital communities are host to significant criminal activity, including copyright infringement, identity theft and child sexual abuse. Combating this growing level of crime is difficult because of the ever-increasing scale of today’s digital communities. This paper presents an approach to providing automated support for the detection of child sexual abuse related activities in digital communities. Specifically, we analyze the characteristics of child sexual abuse media distribution in P2P file sharing networks and carry out an exploratory study showing that corpus-based natural language analysis may be used to automate the detection of this activity. We then give an overview of how this approach can be extended to the policing of chat and social networking communities.
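As an illustration of what corpus-based analysis can look like in practice, the following is a minimal sketch of keyword extraction by corpus comparison: each word's frequency in a study corpus is compared against a reference corpus using a log-likelihood statistic. The abstract does not specify which statistic or tooling the study uses, so the choice of log-likelihood, the function names `log_likelihood` and `keywords`, and the token-list input format are all assumptions made for this example.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score for a single word (illustrative helper).

    a, b: frequency of the word in the study and reference corpora.
    c, d: total token counts of the study and reference corpora.
    """
    # Expected frequencies if the word were equally likely in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2.0 * ll


def keywords(study_tokens, reference_tokens, top_n=5):
    """Rank words that are unusually frequent in the study corpus."""
    study = Counter(study_tokens)
    ref = Counter(reference_tokens)
    c, d = sum(study.values()), sum(ref.values())
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        # Keep only words that are relatively overused in the study corpus.
        if a / c > b / d:
            scored.append((word, log_likelihood(a, b, c, d)))
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scored, key=lambda item: (-item[1], item[0]))[:top_n]
```

Under these assumptions, a call such as `keywords(study_tokens, reference_tokens)` returns the words most characteristic of the study corpus, which an analyst could then review manually. A real system would also need tokenization, normalization and a much larger reference corpus.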
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hughes, D. et al. (2008). Supporting Law Enforcement in Digital Communities through Natural Language Analysis. In: Srihari, S.N., Franke, K. (eds) Computational Forensics. IWCF 2008. Lecture Notes in Computer Science, vol 5158. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85303-9_12
DOI: https://doi.org/10.1007/978-3-540-85303-9_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85302-2
Online ISBN: 978-3-540-85303-9
eBook Packages: Computer Science (R0)