Abstract
Most crimes committed today are reported on the Internet through news articles, blogs, and social networking sites. As the volume of crime information on the web grows, methods are needed to retrieve and exploit it and to provide insight into criminal behavior and networks, so that crime can be fought more efficiently and effectively. We believe that an automated system should be designed for crime named entity recognition from newspaper articles. This study therefore designs and develops a crime named entity recognition system, based on machine learning approaches, that extracts nationalities, weapons, and crime locations from online crime documents. We also collected a new corpus of crime documents and labeled it manually. The proposed classification framework uses Naïve Bayes and SVM models to extract these three entity types. The model was evaluated experimentally on the manually annotated data set. The experimental results show that the developed techniques are promising.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Shabat, H., Omar, N., Rahem, K. (2014). Named Entity Recognition in Crime Using Machine Learning Approach. In: Jaafar, A., et al. Information Retrieval Technology. AIRS 2014. Lecture Notes in Computer Science, vol 8870. Springer, Cham. https://doi.org/10.1007/978-3-319-12844-3_24
DOI: https://doi.org/10.1007/978-3-319-12844-3_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12843-6
Online ISBN: 978-3-319-12844-3
eBook Packages: Computer Science, Computer Science (R0)