Named Entity Linking in a Complex Domain: Case Second World War History

  • Conference paper
Language, Data, and Knowledge (LDK 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10318)

Abstract

This paper discusses the challenges of applying named entity linking in a rich, complex domain: specifically, the linking of (1) military units, (2) places, and (3) people in the context of interlinked Second World War data. Multiple sub-scenarios are discussed in detail through concrete evaluations, analyzing the problems faced and the solutions developed. A key contribution of this work is to highlight the heterogeneity of problems and approaches needed even inside a single domain, depending on both the source data and the target authority.

Notes

  1. http://www.ldf.fi/dataset/warsa.

  2. http://sotasampo.fi/en/.

  3. http://kronos.narc.fi/kartta/kartta.html.

  4. http://www.maanmittauslaitos.fi/en/digituotteet/geographic-names.

  5. http://cidoc-crm.org.

  6. http://www.kansallisbiografia.fi/english/.

  7. https://github.com/jiemakel/arpa/.

  8. http://www.ling.helsinki.fi/~fkarlsso/genkau2.html.

  9. https://github.com/jiemakel/arpa/.

  10. https://github.com/SemanticComputing/python-arpa-linker, with the Warsampo configurations at https://github.com/SemanticComputing/warsa-linkers.

Acknowledgements

Our work is funded by the Open Science and Research Initiative (http://openscience.fi/) of the Finnish Ministry of Education and Culture, the Finnish Cultural Foundation, and the Academy of Finland.

Corresponding author

Correspondence to Erkki Heino.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Heino, E. et al. (2017). Named Entity Linking in a Complex Domain: Case Second World War History. In: Gracia, J., Bond, F., McCrae, J., Buitelaar, P., Chiarcos, C., Hellmann, S. (eds) Language, Data, and Knowledge. LDK 2017. Lecture Notes in Computer Science, vol 10318. Springer, Cham. https://doi.org/10.1007/978-3-319-59888-8_10

  • DOI: https://doi.org/10.1007/978-3-319-59888-8_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59887-1

  • Online ISBN: 978-3-319-59888-8

  • eBook Packages: Computer Science, Computer Science (R0)