Dynamic Ensemble Selection for Author Verification

  • Nektaria Potha
  • Efstathios Stamatatos
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11437)

Abstract

Author verification is a fundamental task in authorship analysis, with significant applications in the humanities, cyber-security, and social media analytics. Some relevant studies provide evidence that heterogeneous ensembles can yield very reliable solutions, better than any individual verification model. However, no systematic study has examined the application of ensemble methods to this task. In this paper, we start from a large set of base verification models covering the main paradigms in this area and study how they can be combined to build an accurate ensemble. We propose a simple stacking ensemble as well as a dynamic ensemble selection approach that uses the most reliable base models for each verification case separately. Experimental results on ten benchmark corpora covering multiple languages and genres confirm the suitability of ensembles for this task and demonstrate the effectiveness of our method, in some cases improving the best reported results by more than 10%.
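
To make the idea of dynamic ensemble selection concrete, the following is a minimal sketch of one generic, nearest-neighbour flavour of the technique. It is an illustration only and not necessarily the procedure used in the paper: the function name `select_and_combine`, the parameters `k` and `n_best`, the 0.5 decision threshold, and the use of base-model output vectors to measure similarity between cases are all assumptions made for this example.

```python
def select_and_combine(val_scores, val_labels, case_scores, k=3, n_best=2):
    """Hypothetical dynamic ensemble selection for one verification case.

    val_scores:  per-case lists of base-model scores in [0, 1] (validation set)
    val_labels:  1 if the validation case is same-author, else 0
    case_scores: base-model scores for the new verification case
    Returns (combined score, indices of the selected base models).
    """
    # 1. Find the k validation cases whose base-model outputs are closest
    #    (squared Euclidean distance) to those of the new case.
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    order = sorted(range(len(val_scores)),
                   key=lambda i: sq_dist(val_scores[i], case_scores))
    neighbours = order[:k]

    # 2. Estimate each base model's local accuracy on those neighbours,
    #    using an assumed decision threshold of 0.5.
    n_models = len(case_scores)
    local_acc = []
    for m in range(n_models):
        hits = sum((val_scores[i][m] > 0.5) == bool(val_labels[i])
                   for i in neighbours)
        local_acc.append(hits / len(neighbours))

    # 3. Keep the n_best locally most reliable models and average their scores.
    chosen = sorted(range(n_models), key=lambda m: local_acc[m],
                    reverse=True)[:n_best]
    combined = sum(case_scores[m] for m in chosen) / len(chosen)
    return combined, sorted(chosen)


# Toy data: three base models, four validation cases.
val_scores = [[0.9, 0.2, 0.8], [0.8, 0.3, 0.7], [0.1, 0.9, 0.2], [0.2, 0.8, 0.4]]
val_labels = [1, 1, 0, 0]
score, models = select_and_combine(val_scores, val_labels, [0.85, 0.25, 0.6], k=2)
```

In this toy setup the second base model is wrong on every validation case, so it is left out of the combination for the new case. A stacking ensemble would instead train a single meta-classifier on the base-model scores and apply it uniformly to all cases.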

Keywords

Author verification, Authorship analysis, Ensemble learning, Dynamic ensemble selection

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of the Aegean, Karlovassi, Greece