Abstract
Authorship attribution is a challenging and promising research area in digital forensics. It determines the most plausible author of a text message by examining other documents written by that author. Analyzing the text content of online messages helps draw conclusions about their authorship. Forensic analysis of online messages involves examining long texts such as fraudulent documents, terrorists' secret communications, suicide letters, threatening mails, emails, and blog posts, as well as short texts such as SMS text messages, Twitter streams, and Facebook status updates, in order to check authenticity and identify fraud. This paper evaluates the performance of various classifiers for authorship attribution of online messages using the proposed wordprint approach. The data mining classification techniques selected for the task are SVM, K-NN, and naïve Bayes. The performance of frequent words as features was also evaluated using the same experimental setup.
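As an illustration only, the following minimal sketch shows how frequent-word features might be fed to two of the classifier families named above, K-NN and naïve Bayes. It does not reproduce the wordprint approach or the experimental setup of the paper, and SVM is omitted to keep the example dependency-free. The toy messages, the author labels "A" and "B", the vocabulary size, and all function names are assumptions made for the example.

```python
import math
from collections import Counter

PUNCT = ".,!?;:\"'()"

def tokenize(text):
    # Lowercase, split on whitespace, and strip surrounding punctuation.
    words = (w.strip(PUNCT) for w in text.lower().split())
    return [w for w in words if w]

def top_words(texts, n):
    # The n most frequent words in the training texts form the feature set.
    counts = Counter(w for t in texts for w in tokenize(t))
    return [w for w, _ in counts.most_common(n)]

def vectorize(text, vocab):
    # Relative frequency of each vocabulary word in one message.
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [counts[w] / total for w in vocab]

def knn_predict(train_vecs, train_labels, vec, k=1):
    # Majority vote among the k nearest training vectors (Euclidean distance).
    order = sorted(range(len(train_vecs)),
                   key=lambda i: math.dist(train_vecs[i], vec))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def nb_train(texts, labels, vocab):
    # Multinomial naive Bayes with add-one smoothing over the vocabulary.
    vocab_set = set(vocab)
    doc_counts = Counter(labels)
    word_counts = {}
    for text, label in zip(texts, labels):
        counter = word_counts.setdefault(label, Counter())
        counter.update(w for w in tokenize(text) if w in vocab_set)
    model = {}
    for label, counter in word_counts.items():
        total = sum(counter.values()) + len(vocab)
        log_prior = math.log(doc_counts[label] / len(labels))
        log_lik = {w: math.log((counter[w] + 1) / total) for w in vocab}
        model[label] = (log_prior, log_lik)
    return model

def nb_predict(model, text):
    # Pick the author with the highest log posterior; ignore unknown words.
    def score(label):
        log_prior, log_lik = model[label]
        return log_prior + sum(log_lik[w] for w in tokenize(text) if w in log_lik)
    return max(model, key=score)

# Toy training data (invented for this sketch): two authors, two messages each.
train_texts = [
    "well i think that we should meet and i think it is fine",
    "i think you know that i really do not mind at all",
    "the report shall be sent upon receipt of the payment",
    "upon review the committee shall approve the request",
]
train_labels = ["A", "A", "B", "B"]
test_texts = [
    "i think we should talk and i do not mind",
    "the invoice shall be paid upon delivery",
]

vocab = top_words(train_texts, 10)
train_vecs = [vectorize(t, vocab) for t in train_texts]
model = nb_train(train_texts, train_labels, vocab)

knn_results = [knn_predict(train_vecs, train_labels, vectorize(t, vocab))
               for t in test_texts]
nb_results = [nb_predict(model, t) for t in test_texts]

for text, knn_label, nb_label in zip(test_texts, knn_results, nb_results):
    print(f"{text!r}: K-NN -> {knn_label}, naive Bayes -> {nb_label}")
```

In a realistic evaluation, the same feature vectors would be passed to each classifier under cross-validation so that accuracies are comparable; the toy data here is far too small to support any conclusion about relative performance.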
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Nirkhi, S. (2019). Evaluation of Classifiers for Detection of Authorship Attribution. In: Verma, N., Ghosh, A. (eds) Computational Intelligence: Theories, Applications and Future Directions - Volume I. Advances in Intelligent Systems and Computing, vol 798. Springer, Singapore. https://doi.org/10.1007/978-981-13-1132-1_18
DOI: https://doi.org/10.1007/978-981-13-1132-1_18
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1131-4
Online ISBN: 978-981-13-1132-1
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)