Using Biographical Texts as Linked Data for Prosopographical Research and Applications

  • Minna Tamper
  • Petri Leskinen
  • Kasper Apajalahti
  • Eero Hyvönen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11196)


This paper argues that representing texts as semantic Linked Data provides a useful basis for analyzing their contents in Digital Humanities research and for Cultural Heritage application development. The idea is to transform Cultural Heritage texts into a knowledge graph and a Linked Data service that can be used flexibly in different applications via a SPARQL endpoint. The argument is discussed and evaluated in the context of biographical and prosopographical research and a case study in which over 13 000 life stories from the biographical collections of the Biographical Centre of the Finnish Literature Society were transformed into RDF, enriched by data linking, and published via a SPARQL endpoint. Tools for biography and prosopography, data clustering, network analysis, and linguistic analysis were created on top of this service, with promising first results.
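To illustrate what using the data "via a SPARQL endpoint" can look like for an application, the following is a minimal sketch that builds, but does not send, a SPARQL 1.1 Protocol GET request. The endpoint URL, the `ex:` vocabulary, and the `build_request` helper are hypothetical placeholders chosen for illustration; they are not the service address or the data model used in the case study.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and vocabulary (assumptions, not taken from the paper).
ENDPOINT = "http://example.org/sparql"

QUERY_TEMPLATE = """
PREFIX ex: <http://example.org/schema/>
SELECT ?person ?name WHERE {{
  ?person a ex:Person ;
          ex:name ?name ;
          ex:occupation "{occupation}" .
}}
LIMIT {limit}
"""


def build_request(occupation: str, limit: int = 10) -> Request:
    """Build (but do not send) a SPARQL query request for people by occupation."""
    # Escape backslashes and double quotes so the value stays inside the literal.
    safe = occupation.replace("\\", "\\\\").replace('"', '\\"')
    query = QUERY_TEMPLATE.format(occupation=safe, limit=int(limit))
    # The SPARQL 1.1 Protocol passes the query text in the "query" parameter.
    url = ENDPOINT + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialization of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


req = build_request("teacher", limit=5)
print(req.full_url)
```

Passing the resulting request to `urllib.request.urlopen` would execute the query against a real service; the sketch stops short of that because the address above is a placeholder.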



Our research is part of the Severi project, funded mainly by Business Finland.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Semantic Computing Research Group (SeCo), Aalto University, Helsinki, Finland
  2. HELDIG – Helsinki Centre for Digital Humanities, University of Helsinki, Helsinki, Finland
