A New Approach for Authorship Attribution

  • P. Buddha Reddy
  • T. Raghunadha Reddy
  • M. Gopi Chand
  • A. Venkannababu
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 701)

Abstract

Authorship attribution is a text classification technique used to identify the author of an unknown document by analyzing documents written by multiple candidate authors. The accuracy of author identification depends mainly on the writing styles of the authors, so selecting features that differentiate those styles is one of the most important steps in authorship attribution. Researchers have proposed character, word, syntactic, semantic, structural, and readability features to predict the author of an unknown document. Few researchers have used term weight measures in authorship attribution, although such measures have proven effective at improving the accuracy of text classification. Existing approaches to authorship attribution use the bag-of-words (BOW) approach to represent document vectors. In this work, a new approach is proposed in which the document weight is used to represent the document vector instead of the features or terms in the document. The experiments are carried out on a reviews corpus with various classifiers, and the results achieved for author attribution are better than those of most existing approaches.
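
The abstract does not give the formula for the document weight, so the following is only a minimal sketch of the general idea under stated assumptions: each document is represented by one weight per candidate author rather than by one dimension per vocabulary term. The term weight used here (a term's share of all tokens written by an author), the function names, and the toy review texts are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would do more preprocessing.
    return text.lower().split()


def author_term_weights(docs_by_author):
    # ASSUMPTION: the weight of a term for an author is the term's share of
    # all tokens in that author's training documents. The paper's actual
    # term weight measure is not stated in the abstract.
    weights = {}
    for author, docs in docs_by_author.items():
        counts = Counter(tok for doc in docs for tok in tokenize(doc))
        total = sum(counts.values())
        weights[author] = {term: n / total for term, n in counts.items()}
    return weights


def document_weight_vector(text, weights, authors):
    # One dimension per author: sum over the document's terms of
    # (term frequency in the document) * (author-specific term weight).
    tf = Counter(tokenize(text))
    return [
        sum(n * weights[author].get(term, 0.0) for term, n in tf.items())
        for author in authors
    ]


# Toy training data (hypothetical).
train = {
    "author_a": [
        "the room was clean and the staff was friendly",
        "clean room friendly staff",
    ],
    "author_b": [
        "terrible food and slow service",
        "slow service cold food",
    ],
}

authors = sorted(train)
weights = author_term_weights(train)
vec = document_weight_vector("friendly staff and clean room", weights, authors)

# For illustration only: pick the author with the largest document weight.
# The paper instead feeds its document vectors to various classifiers.
predicted = authors[vec.index(max(vec))]
print(authors, vec, predicted)
```

In this sketch the vector length equals the number of candidate authors, whereas a bag-of-words vector would have one entry per vocabulary term.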

Keywords

Authorship attribution, Author prediction, Term weight measure, BOW approach

Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • P. Buddha Reddy (1)
  • T. Raghunadha Reddy (1)
  • M. Gopi Chand (1)
  • A. Venkannababu (2)
  1. Department of IT, Vardhaman College of Engineering, Hyderabad, India
  2. Department of CSE, Sri Vasavi Engineering College, Tadepalligudem, India
