Explanation in Computational Stylometry
Computational stylometry, as in authorship attribution or profiling, has great potential for applications in diverse areas: literary science, forensics, language psychology, sociolinguistics, even medical diagnosis. Yet many of the basic research questions of this field are not studied systematically, or even at all. In this paper we examine these problems and suggest that it would be helpful to reinterpret current and historical methods within the framework and methodology of machine learning for natural language processing. We also argue that research should pay more attention to explanation in computational stylometry, as opposed to purely quantitative evaluation measures, and we propose a strategy for data collection and analysis to achieve progress in the field. Finally, we introduce a fairly new application of computational stylometry in internet security.
Keywords: Social Network Site, Machine Learning Method, Short Text, Supervised Machine Learning, Knowledge Extraction