When was Macbeth Written? Mapping Book to Time

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9041)

Abstract

We address the question of predicting when a book was written, using the Google Books Ngram corpus. Such a prediction could be useful for authorship and plagiarism detection, identification of literary movements, and forensic document examination. We propose an unsupervised approach and compare it with four baseline measures on a dataset of 36 books written between 1551 and 1969. The proposed approach could be applied to other languages, provided that corpora comparable to the Google Books Ngram corpus are available for those languages.

References

  1. Akiva, N.: Authorship and plagiarism detection using binary bow features. In: Forner, P., Karlgren, J., Womser-Hacker, C. (eds.) CLEF (Online Working Notes/Labs/Workshop) (2012)

  2. Amancio, D.R., Oliveira, O.N., da Fontoura Costa, L.: Identification of literary movements using complex networks to represent texts. New Journal of Physics 14, 043029 (2012)

  3. A simplified guide to forensic document examination (2013), http://www.crime-scene-investigator.net/SimplifiedGuideQuestionedDocuments.pdf (accessed: February 7, 2015)

  4. Michel, J.B., Shen, Y.K., Aiden, A.P., Veres, A., Gray, M.K., Team, T.G.B., Pickett, J.P., Hoiberg, D., Clancy, D., Norvig, P., Orwant, J., Pinker, S., Nowak, M.A., Aiden, E.L.: Quantitative analysis of culture using millions of digitized books. Science 331, 176–182 (2011)

  5. Lin, Y., Michel, J.B., Aiden, E.L., Orwant, J., Brockman, W., Petrov, S.: Syntactic annotations for the Google books ngram corpus. In: Proceedings of the ACL 2012 System Demonstrations, ACL 2012, pp. 169–174. Association for Computational Linguistics, Stroudsburg (2012)

  6. Barufaldi, B., Santana, E., Filho, J., van der Poel, J., Marques, M., Batista, L.: Text classification by literary period using ppm-c data compression. In: 2009 Seventh Brazilian Symposium in Information and Human Language Technology (STIL), pp. 125–133 (2009)

  7. Kim, S., Kim, H., Weninger, T., Han, J.: Authorship classification: A syntactic tree mining approach. In: Proceedings of the ACM SIGKDD Workshop on Useful Patterns, UP 2010, pp. 65–73. ACM, New York (2010)

  8. Kessler, B., Nunberg, G., Schütze, H.: Automatic detection of text genre. In: Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics, ACL 1998, pp. 32–38. Association for Computational Linguistics, Stroudsburg (1997)

  9. Thisted, R., Efron, B.: Did Shakespeare write a newly-discovered poem? Biometrika 74, 445–455 (1987)

  10. Thompson, J.R., Rasp, J.: Did C. S. Lewis write The Dark Tower?: An examination of the small-sample properties of the Thisted-Efron tests of authorship. Austrian Journal of Statistics 38, 71–82 (2009)

  11. Brants, T., Franz, A.: Web 1T 5-gram corpus version 1.1. Technical report, Google Research (2006)

  12. http://www.goodreads.com/ (accessed: January 15, 2015)

  13. https://www.gutenberg.org/ (accessed: January 15, 2015)

Author information

Corresponding author

Correspondence to Aminul Islam.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Islam, A., Mei, J., Milios, E.E., Kešelj, V. (2015). When was Macbeth Written? Mapping Book to Time. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2015. Lecture Notes in Computer Science, vol 9041. Springer, Cham. https://doi.org/10.1007/978-3-319-18111-0_6

  • DOI: https://doi.org/10.1007/978-3-319-18111-0_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18110-3

  • Online ISBN: 978-3-319-18111-0

  • eBook Packages: Computer Science, Computer Science (R0)
