Abstract
Inferring the characteristics of offenders from their criminal behaviour (‘offender profiling’) has been only partially successful, since it has relied on subjective judgments based on limited data. The words and structured data used in crime descriptions recorded by the police relate to behavioural features. Language modelling was therefore applied to an existing police archive to link behavioural features with significant characteristics of offenders. Both multinomial and multiple Bernoulli models were used. Although the categories selected here are gender and age group, in principle the approach can be applied to any recorded characteristic. Results indicate that, in certain types of crime, statistically significant relationships exist between the style of offending and both the age and the sex of the offender. Both types of language model perform with similar effectiveness. It is also possible to identify automatically specific terms which, taken together, give insight into the style of offending associated with a particular group.
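To make the distinction between the two event models concrete, the following is a minimal sketch and not the authors' implementation. It assumes crime descriptions are already tokenised, uses additive (Laplace) smoothing as a stand-in for whatever smoothing the paper applies, assumes a uniform prior over groups, and runs on invented toy data. All function names, group labels and the `alpha` parameter are hypothetical.

```python
import math
from collections import Counter


def train(docs_by_group):
    """Collect per-group statistics from {group: [token list, ...]}."""
    vocab = {t for docs in docs_by_group.values() for d in docs for t in d}
    models = {}
    for group, docs in docs_by_group.items():
        # Multinomial view: how often each term occurs across all tokens.
        term_counts = Counter(t for d in docs for t in d)
        # Multiple Bernoulli view: in how many descriptions each term appears.
        doc_counts = Counter(t for d in docs for t in set(d))
        models[group] = (term_counts, sum(term_counts.values()), doc_counts, len(docs))
    return vocab, models


def multinomial_log_likelihood(tokens, model, vocab_size, alpha=1.0):
    """Score only the terms that occur, once per occurrence."""
    term_counts, total, _, _ = model
    return sum(
        math.log((term_counts[t] + alpha) / (total + alpha * vocab_size))
        for t in tokens
    )


def bernoulli_log_likelihood(tokens, model, vocab, alpha=1.0):
    """Score every vocabulary term as present or absent."""
    _, _, doc_counts, n_docs = model
    present = set(tokens)
    score = 0.0
    for t in vocab:
        p = (doc_counts[t] + alpha) / (n_docs + 2 * alpha)
        score += math.log(p if t in present else 1.0 - p)
    return score


def classify(tokens, vocab, models, event_model="multinomial"):
    """Return the group with the highest log-likelihood (uniform prior assumed)."""
    def score(model):
        if event_model == "multinomial":
            return multinomial_log_likelihood(tokens, model, len(vocab))
        return bernoulli_log_likelihood(tokens, model, vocab)
    return max(models, key=lambda g: score(models[g]))


# Invented toy data; real input would be police crime descriptions.
toy = {
    "group_a": [["forced", "window", "cash"], ["forced", "door", "cash"]],
    "group_b": [["unlocked", "door", "jewellery"], ["unlocked", "window", "jewellery"]],
}
vocab, models = train(toy)
query = ["forced", "cash"]
print(classify(query, vocab, models, "multinomial"))
print(classify(query, vocab, models, "bernoulli"))
```

The multinomial scorer uses only the terms that appear in a description, weighted by frequency, whereas the multiple Bernoulli scorer also treats the absence of a vocabulary term as evidence.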
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bache, R., Crestani, F., Canter, D., Youngs, D. (2008). A Language Modelling Approach to Linking Criminal Styles with Offender Characteristics. In: Kapetanios, E., Sugumaran, V., Spiliopoulou, M. (eds) Natural Language and Information Systems. NLDB 2008. Lecture Notes in Computer Science, vol 5039. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69858-6_26
DOI: https://doi.org/10.1007/978-3-540-69858-6_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69857-9
Online ISBN: 978-3-540-69858-6
eBook Packages: Computer Science, Computer Science (R0)