Abstract
Authorship attribution, the science of inferring characteristics of an author from the characteristics of documents written by that author, is a problem with a long history and a wide range of applications. This paper surveys the history and present state of the discipline, which is essentially a collection of ad hoc methods with little formal data available to select among them. It also makes some predictions about the needs of the discipline and discusses how these needs might be met.
Copyright information
© 2007 International Federation for Information Processing
About this paper
Cite this paper
Juola, P. (2007). Future Trends in Authorship Attribution. In: Craiger, P., Shenoi, S. (eds) Advances in Digital Forensics III. DigitalForensics 2007. IFIP — The International Federation for Information Processing, vol 242. Springer, New York, NY. https://doi.org/10.1007/978-0-387-73742-3_8
DOI: https://doi.org/10.1007/978-0-387-73742-3_8
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-73741-6
Online ISBN: 978-0-387-73742-3
eBook Packages: Computer Science, Computer Science (R0)