Who is Mona L.? Identifying Mentions of Artworks in Historical Archives

  • Conference paper
Digital Libraries for Open Knowledge (TPDL 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11799)

Abstract

Named entity recognition (NER) plays an important role in many information retrieval tasks, including automatic knowledge graph construction. Most NER systems are limited to a few common named entity types, such as person, location, and organization. However, for cultural heritage resources, such as art historical archives, recognizing the titles of artworks as named entities is of high importance. In this work, we focus on identifying mentions of artworks, e.g. paintings and sculptures, in historical archives. Current state-of-the-art NER tools are unable to adequately identify artwork titles because of the particular difficulties this domain presents. The scarcity of NER training data for cultural heritage is a further hindrance. To mitigate this, we propose a semi-supervised approach that creates high-quality training data by leveraging existing cultural heritage resources. Our experimental evaluation shows a significant improvement in NER performance for artwork titles compared to a baseline approach.
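The abstract does not spell out how the training data is generated. One common way to derive annotations from an existing resource is to match a list of known artwork titles against raw text, and the sketch below illustrates only that general idea. It is not the authors' pipeline: the function name `annotate_titles`, the `ARTWORK` label, and the sample titles and sentence are all hypothetical. The output tuple follows the `(text, {"entities": [...]})` shape used for spaCy 2.x training examples.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int, str]


def annotate_titles(text: str, titles: List[str], label: str = "ARTWORK") -> List[Span]:
    """Return non-overlapping (start, end, label) spans for known titles in text.

    Hypothetical helper: plain dictionary matching, assumed for illustration.
    """
    spans: List[Span] = []
    # Try longer titles first so a long title wins over a shorter one it contains.
    for title in sorted(set(titles), key=len, reverse=True):
        # Require non-word characters on both sides to avoid matching inside words.
        pattern = r"(?<!\w)" + re.escape(title) + r"(?!\w)"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            start, end = match.span()
            # Keep the match only if it does not overlap an accepted span.
            if all(end <= s or start >= e for s, e, _ in spans):
                spans.append((start, end, label))
    return sorted(spans)


# Illustrative inputs; a real title list could come from a knowledge base export.
titles = ["Mona Lisa", "The Starry Night"]
sentence = "The Mona Lisa was exhibited next to The Starry Night."

spans = annotate_titles(sentence, titles)
training_example = (sentence, {"entities": spans})
print(training_example)
```

Exact matching of this kind tends to miss abbreviated or translated titles and to over-match titles that are ordinary words, which is one reason the resulting annotations would normally need filtering before being used for training.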

Notes

  1. Linked Open Data: http://www.w3.org/DesignIssues/LinkedData.

  2. OpenGLAM: http://openglam.org.

  3. Europeana: http://europeana.eu.

  4. SpaCy: https://spacy.io/, version 2.1.3.

  5. From the exhibition catalogue “Lukas Cranach: Gemälde, Zeichnungen, Druckgraphik” (https://digi.ub.uni-heidelberg.de/diglit/koepplin1974bd1/0084).

  6. https://query.wikidata.org/.

  7. https://github.com/HPI-Information-Systems/enno.

  8. https://wpi.art.

Acknowledgements

We thank the Wildenstein Plattner Institute (see Note 8) for providing the corpus used in this work.

Author information

Correspondence to Nitisha Jain.

Copyright information

© 2019 Springer Nature Switzerland AG

Cite this paper

Jain, N., Krestel, R. (2019). Who is Mona L.? Identifying Mentions of Artworks in Historical Archives. In: Doucet, A., Isaac, A., Golub, K., Aalberg, T., Jatowt, A. (eds) Digital Libraries for Open Knowledge. TPDL 2019. Lecture Notes in Computer Science, vol 11799. Springer, Cham. https://doi.org/10.1007/978-3-030-30760-8_10

  • DOI: https://doi.org/10.1007/978-3-030-30760-8_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30759-2

  • Online ISBN: 978-3-030-30760-8

  • eBook Packages: Computer Science (R0)
