Using Biographical Texts as Linked Data for Prosopographical Research and Applications

  • Conference paper
Digital Heritage. Progress in Cultural Heritage: Documentation, Preservation, and Protection (EuroMed 2018)

Abstract

This paper argues that representing texts as semantic Linked Data provides a useful basis for analyzing their contents in Digital Humanities research and for Cultural Heritage application development. The idea is to transform Cultural Heritage texts into a knowledge graph and a Linked Data service that can be used flexibly in different applications via a SPARQL endpoint. The argument is discussed and evaluated in the context of biographical and prosopographical research and a case study in which over 13 000 life stories from the biographical collections of the Biographical Centre of the Finnish Literature Society were transformed into RDF, enriched by data linking, and published via a SPARQL endpoint. Tools for biography and prosopography, data clustering, network analysis, and linguistic analysis were created, with promising first results.

Notes

  1. https://www.w3.org/standards/semanticweb/ accessed: 13 August 2018.

  2. In some use cases, e.g., in person registries [12], the whole registry entry may be written using this kind of semi-formal language.

  3. http://cidoc-crm.org accessed: 13 August 2018.

  4. http://persistence.uni-leipzig.org/nlp2rdf/specification/core.html accessed: 13 August 2018.

  5. http://dublincore.org/documents/dcmi-terms/ accessed: 13 August 2018.

  6. Denoted with prefix nbf in the RDF examples.

  7. http://turkunlp.github.io/Finnish-dep-parser/ accessed: 13 August 2018.

  8. http://universaldependencies.org/format.html accessed: 13 August 2018.

  9. For example, the SeCo LAS [19] is a combination of several Finnish NLP tools.

  10. The running time complexity is O(mn), where n is the number of files and m is their size in bytes.

  11. https://kansallisbiografia.fi/ accessed: 13 August 2018.

  12. http://www.ldf.fi accessed: 13 August 2018.

  13. http://seco.cs.aalto.fi/projects/dcert/ accessed: 13 August 2018.

  14. http://developers.google.com/maps/ accessed: 13 August 2018.

  15. http://www.tfidf.com/ accessed: 13 August 2018.

  16. https://radimrehurek.com/gensim/ accessed: 13 August 2018.

  17. https://gephi.org/ accessed: 13 August 2018.

  18. http://www.oxforddnb.com/ accessed: 13 August 2018.

  19. http://www.anb.org/ accessed: 13 August 2018.

  20. http://www.ndb.badw-muenchen.de accessed: 13 August 2018.

  21. https://sok.riksarkivet.se/Sbl/Start.aspx accessed: 13 August 2018.

  22. http://www.biografischportaal.nl/en accessed: 13 August 2018.

  23. http://www.biographynet.nl/ accessed: 13 August 2018.

References

  1. Chiarcos, C., Fäth, C.: CoNLL-RDF: linked corpora done in an NLP-friendly way. In: Gracia, J., Bond, F., McCrae, J.P., Buitelaar, P., Chiarcos, C., Hellmann, S. (eds.) LDK 2017. LNCS (LNAI), vol. 10318, pp. 74–88. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59888-8_6

  2. Fokkens, A., et al.: Biographynet: extracting relations between people and events. In: Europa baut auf Biographien, pp. 193–224. New Academic Press, Wien (2017)

  3. Gardiner, E., Musto, R.G.: The Digital Humanities: A Primer for Students and Scholars. Cambridge University Press, Cambridge (2015)

  4. Hakosalo, H., Jalagin, S., Junila, M., Kurvinen, H.: Historiallinen elämä - Biografia ja historiantutkimus. Suomalaisen Kirjallisuuden Seura (SKS) (2014)

  5. Haverinen, K., et al.: Building the essential resources for Finnish: the Turku Dependency Treebank. Lang. Resour. Eval. 48, 493–531 (2014). https://doi.org/10.1007/s10579-013-9244-1

  6. Heath, T., Bizer, C.: Linked data: evolving the web into a global data space. Synthesis Lectures on the Semantic Web: Theory and Technology, 1 edn. Morgan & Claypool, Palo Alto (2011). http://linkeddatabook.com/editions/1.0/. Accessed 13 Aug 2018

  7. Hellmann, S., Lehmann, J., Auer, S., Brümmer, M.: Integrating NLP using linked data. In: Alani, H. (ed.) ISWC 2013. LNCS, vol. 8219, pp. 98–113. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-41338-4_7

  8. Hitzler, P., Krötzsch, M., Rudolph, S.: Foundations of Semantic Web Technologies. Springer, Heidelberg (2010)

  9. Hyvönen, E.: Publishing and Using Cultural Heritage Linked Data on the Semantic Web. Synthesis Lectures on the Semantic Web: Theory and Technology. Morgan & Claypool, Palo Alto (2012)

  10. Hyvönen, E., Alonen, M., Ikkala, E., Mäkelä, E.: Life stories as event-based linked data: case semantic national biography. In: Proceedings of ISWC 2014 Posters & Demonstrations Track. CEUR Workshop Proceedings, October 2014. http://ceur-ws.org/Vol-1272/. Accessed 13 Aug 2018

  11. Hyvönen, E., Ikkala, E., Tuominen, J.: Linked data brokering service for historical places and maps. In: Proceedings of the 1st Workshop on Humanities in the Semantic Web (WHiSe), vol. 1608, pp. 39–52. CEUR Workshop Proceedings (2016). http://ceur-ws.org/Vol-1608/#paper-06. Accessed 13 Aug 2018

  12. Hyvönen, E., Leskinen, P., Heino, E., Tuominen, J., Sirola, L.: Reassembling and enriching the life stories in printed biographical registers: norssi high school alumni on the semantic web. In: Gracia, J., Bond, F., McCrae, J.P., Buitelaar, P., Chiarcos, C., Hellmann, S. (eds.) LDK 2017. LNCS (LNAI), vol. 10318, pp. 113–119. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59888-8_9

  13. Hyvönen, E., Leskinen, P., Tamper, M., Tuominen, J., Keravuori, K.: Semantic national biography of Finland. In: Proceedings of the Digital Humanities in the Nordic Countries 3rd Conference (DHN 2018), vol. 2084, pp. 372–385. CEUR Workshop Proceedings, March 2018. http://www.ceur-ws.org/Vol-2084/short12.pdf. Accessed 13 Aug 2018

  14. Hyvönen, E., Tuominen, J., Alonen, M., Mäkelä, E.: Linked data Finland: a 7-star model and platform for publishing and re-using linked datasets. In: Presutti, V., Blomqvist, E., Troncy, R., Sack, H., Papadakis, I., Tordai, A. (eds.) ESWC 2014. LNCS, vol. 8798, pp. 226–230. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11955-7_24

  15. Ikkala, E., Tuominen, J., Hyvönen, E.: Contextualizing historical places in a gazetteer by using historical maps and linked data. In: Proceedings of Digital Humanities 2016, Short Papers, pp. 573–577 (2016)

  16. Jannach, D., Zanker, M., Felfernig, A., Friedrich, G.: Recommender Systems: An Introduction. Cambridge University Press, Cambridge (2011)

  17. Leskinen, P., Hyvönen, E., Tuominen, J.: Analyzing and visualizing prosopographical linked data based on short biographies. In: Biographical Data in a Digital World 2017 (BD 2017), Linz, Austria, November 2017. http://ceur-ws.org/Vol-2119/paper7.pdf. Accessed 13 Aug 2018

  18. McSweeney, P.J.: Gephi network statistics. Google Summer Code, pp. 1–8 (2009)

  19. Mäkelä, E.: LAS: an integrated language analysis tool for multiple languages. J. Open Source Softw. 1(6) (2016). https://doi.org/10.21105/joss.00035. Accessed 13 Aug 2018

  20. Otte, E., Rousseau, R.: Social network analysis: a powerful strategy, also for the information sciences. J. Inf. Sci. 28(6), 441–453 (2002)

  21. Presutti, V., Draicchio, F., Gangemi, A.: Knowledge extraction based on discourse representation theory and linguistic frames. In: ten Teije, A. (ed.) EKAW 2012. LNCS (LNAI), vol. 7603, pp. 114–129. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33876-2_12

  22. Pyysalo, S., Ginter, F.: Collaborative development of annotation guidelines with application to universal dependencies. In: The Fifth Swedish Language Technology Conference (2014)

  23. Roberts, B.: Biographical Research. Understanding Social Research. Open University Press (2002)

  24. Rospocher, M., et al.: Building event-centric knowledge graphs from news. Web Semant. Sci., Serv. Agents World Wide Web 37, 132–151 (2016)

  25. Shultz, K.: What is distant reading? New York Times, 24 June 2011. https://www.nytimes.com/2011/06/26/books/review/the-mechanic-muse-what-is-distant-reading.html. Accessed 13 Aug 2018

  26. Tuominen, J., Hyvönen, E., Leskinen, P.: Bio CRM: a data model for representing biographical data for prosopographical research. In: Proceedings of the Biographical Data in a Digital World 2017 (BD2017). CEUR Workshop Proceedings (2018). http://ceur-ws.org/Vol-2119/paper10.pdf. Accessed 13 Aug 2018

  27. Verboven, K., Carlier, M., Dumolyn, J.: A short manual to the art of prosopography. In: Prosopography Approaches and Applications. A Handbook, pp. 35–70. Unit for Prosopographical Research (Linacre College) (2007)

  28. Wu, Y., Sun, H., Yan, C.: An event timeline extraction method based on news corpus. In: 2017 IEEE 2nd International Conference on Big Data Analysis, pp. 697–702. IEEE (2017)

Acknowledgements

Our research is part of the Severi project (http://seco.cs.aalto.fi/projects/severi accessed: 13 August 2018), funded mainly by Business Finland.

Author information

Corresponding author

Correspondence to Minna Tamper.

Copyright information

© 2018 Springer Nature Switzerland AG

Cite this paper

Tamper, M., Leskinen, P., Apajalahti, K., Hyvönen, E. (2018). Using Biographical Texts as Linked Data for Prosopographical Research and Applications. In: Ioannides, M., et al. Digital Heritage. Progress in Cultural Heritage: Documentation, Preservation, and Protection. EuroMed 2018. Lecture Notes in Computer Science, vol 11196. Springer, Cham. https://doi.org/10.1007/978-3-030-01762-0_11

  • DOI: https://doi.org/10.1007/978-3-030-01762-0_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01761-3

  • Online ISBN: 978-3-030-01762-0

  • eBook Packages: Computer Science, Computer Science (R0)
