Minimizing Training Data for Reliable Writer Identification in Medieval Manuscripts

  • Conference paper
New Trends in Image Analysis and Processing – ICIAP 2019 (ICIAP 2019)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11808)

Abstract

Palaeography is the study of ancient documents, and one of its most important problems is identifying the people who took part in the handwriting of a given document. To this end, expert palaeographers typically analyze handwriting features such as letter heights and widths, distances between characters, and angles of inclination. To obtain more precise measurements, and thanks to the availability of high-quality digital images, palaeographers are starting to use digital tools. In this context, in previous studies we proposed a pattern recognition system for distinguishing the writers of medieval books and investigated the minimum amount of training data needed to achieve satisfactory accuracy. In this paper, we present a reject option that allows us to implement a highly reliable writer identification system trained on a reduced set of data. The experimental results, obtained on two sets of digital images from medieval Bibles, show that by rejecting only a few samples it is possible to strongly reduce the error rate.
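
As an illustration only, the general idea of a reject option can be sketched as a confidence threshold on a classifier's output: a sample is assigned to a writer only when the top class probability is high enough, and is otherwise left unclassified. The abstract does not specify the rejection rule actually used, so the function names, the threshold, and the toy probabilities below are all assumptions rather than the authors' method.

```python
from typing import Optional, Sequence, Tuple


def classify_with_reject(probs: Sequence[float], threshold: float) -> Optional[int]:
    """Return the index of the most probable writer, or None to reject.

    `probs` is a hypothetical vector of per-writer probabilities produced
    by any classifier; `threshold` is an assumed confidence cut-off.
    """
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] >= threshold else None


def error_and_reject_rates(
    all_probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    threshold: float,
) -> Tuple[float, float]:
    """Return (error rate on accepted samples, fraction of rejected samples)."""
    accepted = errors = rejected = 0
    for probs, label in zip(all_probs, labels):
        pred = classify_with_reject(probs, threshold)
        if pred is None:
            rejected += 1
        else:
            accepted += 1
            errors += int(pred != label)
    total = accepted + rejected
    error_rate = errors / accepted if accepted else 0.0
    reject_rate = rejected / total if total else 0.0
    return error_rate, reject_rate


if __name__ == "__main__":
    # Toy data (invented for illustration): five samples, two writers.
    toy_probs = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8], [0.52, 0.48], [0.7, 0.3]]
    toy_labels = [0, 1, 1, 0, 0]
    for t in (0.0, 0.6):
        err, rej = error_and_reject_rates(toy_probs, toy_labels, t)
        print(f"threshold={t:.1f}  error rate={err:.2f}  reject rate={rej:.2f}")
```

In this toy setting the only misclassified sample is also a low-confidence one, so raising the threshold removes the error at the cost of leaving some samples unclassified. That is the trade-off a reject option is meant to control.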

Author information

Corresponding author

Correspondence to Francesco Fontanella.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Cilia, N.D., De Stefano, C., Fontanella, F., Molinara, M., Scotto di Freca, A. (2019). Minimizing Training Data for Reliable Writer Identification in Medieval Manuscripts. In: Cristani, M., Prati, A., Lanz, O., Messelodi, S., Sebe, N. (eds) New Trends in Image Analysis and Processing – ICIAP 2019. ICIAP 2019. Lecture Notes in Computer Science, vol 11808. Springer, Cham. https://doi.org/10.1007/978-3-030-30754-7_20

  • DOI: https://doi.org/10.1007/978-3-030-30754-7_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30753-0

  • Online ISBN: 978-3-030-30754-7

  • eBook Packages: Computer Science, Computer Science (R0)
