Extraction of Character Profiles from the Gutenberg Archive

  • Conference paper
  • Metadata and Semantic Research (MTSR 2019)

Abstract

Online text repositories such as Gutenberg.org have been increasing in number, size and adoption. This growing availability prompts new investigations into the knowledge that emerges from their content, e.g. literature and drama. However, such investigations depend on the repositories' ability to fulfill the FAIR principles (findability, accessibility, interoperability, reusability). We present preparatory work on the semantic analysis of drama literature in Gutenberg, aimed at extracting and profiling fictional characters and their narrative roles. Our preliminary analysis matches such characters to their corresponding profiles in knowledge bases such as DBpedia and Wikidata.

Notes

  1. VIAF linking properties: dbo:viafId (DBpedia) and wpprop:P214 (Wikidata).

  2. LoC linking properties: dbo:lccnId (DBpedia) and wpprop:P244 (Wikidata).

  3. https://www.gutenberg.org/wiki/Cataloging_Guidelines#Check_the_author.28s.29, http://id.loc.gov/authorities/names/n79022935.html.

  4. See e.g. VIAPy - https://pypi.org/project/viapy/.

  5. Gutenberg linkset in Turtle format, https://dhtk.unil.ch/static/sameas.ttl.
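
The linking properties listed in notes 1 and 2 suggest how an authority identifier could be used to look up the matching entity in each knowledge base. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name build_lookup_query and the LINKING_PROPERTIES table are hypothetical, the wpprop: prefix is assumed to expand to Wikidata's direct-property namespace, and the code only builds a SPARQL query string without contacting any endpoint.

```python
# Sketch only: builds SPARQL query strings; it never contacts an endpoint.
DBO = "http://dbpedia.org/ontology/"
# Assumption: the notes' wpprop: prefix denotes Wikidata's direct-property namespace.
WDT = "http://www.wikidata.org/prop/direct/"

# Linking properties named in notes 1 and 2, keyed by (authority, knowledge base).
LINKING_PROPERTIES = {
    ("viaf", "dbpedia"): DBO + "viafId",
    ("viaf", "wikidata"): WDT + "P214",
    ("loc", "dbpedia"): DBO + "lccnId",
    ("loc", "wikidata"): WDT + "P244",
}


def build_lookup_query(authority: str, knowledge_base: str, identifier: str) -> str:
    """Return a SPARQL SELECT that finds entities carrying the given authority id."""
    prop = LINKING_PROPERTIES[(authority.lower(), knowledge_base.lower())]
    # Escape backslashes and double quotes so the id is a valid SPARQL string literal.
    literal = identifier.replace("\\", "\\\\").replace('"', '\\"')
    # Comparing STR(?id) sidesteps datatype differences between the knowledge bases.
    return (
        f"SELECT DISTINCT ?entity WHERE {{ ?entity <{prop}> ?id . "
        f'FILTER(STR(?id) = "{literal}") }}'
    )


if __name__ == "__main__":
    # The LoC identifier below is the one that appears in note 3.
    print(build_lookup_query("loc", "wikidata", "n79022935"))
```

The resulting string could be sent to the corresponding SPARQL endpoint with any client library; entities returned from both knowledge bases for the same identifier would be candidates for a sameAs link such as those in the linkset of note 5.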

Corresponding author

Correspondence to Alessandro Adamou.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Egloff, M., Picca, D., Adamou, A. (2019). Extraction of Character Profiles from the Gutenberg Archive. In: Garoufallou, E., Fallucchi, F., William De Luca, E. (eds) Metadata and Semantic Research. MTSR 2019. Communications in Computer and Information Science, vol 1057. Springer, Cham. https://doi.org/10.1007/978-3-030-36599-8_32

  • DOI: https://doi.org/10.1007/978-3-030-36599-8_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-36598-1

  • Online ISBN: 978-3-030-36599-8

  • eBook Packages: Computer Science, Computer Science (R0)
