A Graphic Matching Process for Searching and Retrieving Information in Digital Libraries of Manuscripts

  • Nicola Barbuti
  • Tommaso Caldarola
  • Stefano Ferilli
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 806)

Abstract

This paper outlines ICRPad, a pattern recognition system based on a graphic matching algorithm that works on images by recognizing shape contours, without requiring any segmentation. The algorithm starts from a region of interest (ROI) selected in an image, uses it as a shape model, and looks for similar patterns in one or more target images. The process was developed and tested with the aim of proposing a new approach to searching and retrieving information in digital libraries. This approach applies data science, the fourth paradigm of knowledge development in science and the basis of science informatics, to studies in data humanities. Following this approach, the algorithm is applied to generate new research hypotheses through the discovery of patterns inferred directly from large digital libraries.
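
The ROI-as-shape-model idea can be illustrated with a minimal sketch. It is not ICRPad's algorithm, which the abstract does not specify. It assumes a grayscale page stored as nested lists, a simple neighbour-difference contour detector, and a Dice overlap score; the names contour_map, crop, and find_similar, the threshold of 50, and the 0.9 cut-off are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]      # grayscale page as rows of pixel intensities
Contours = List[List[bool]]  # True where a contour pixel was detected


def contour_map(img: Image, threshold: int = 50) -> Contours:
    """Mark a pixel as contour when it differs sharply from its right or lower neighbour."""
    h, w = len(img), len(img[0])
    edges = [[False] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            right = abs(img[r][c] - img[r][c + 1]) if c + 1 < w else 0
            down = abs(img[r][c] - img[r + 1][c]) if r + 1 < h else 0
            edges[r][c] = max(right, down) > threshold
    return edges


def crop(grid: Contours, top: int, left: int, height: int, width: int) -> Contours:
    """Cut the ROI out of a contour map; the result serves as the shape model."""
    return [row[left:left + width] for row in grid[top:top + height]]


def find_similar(model: Contours, target: Contours,
                 min_score: float = 0.9) -> List[Tuple[int, int, float]]:
    """Slide the model over the target and keep windows whose Dice overlap is high enough."""
    mh, mw = len(model), len(model[0])
    th, tw = len(target), len(target[0])
    model_count = sum(cell for row in model for cell in row)
    hits = []
    for r in range(th - mh + 1):
        for c in range(tw - mw + 1):
            overlap = 0
            window_count = 0
            for i in range(mh):
                for j in range(mw):
                    t = target[r + i][c + j]
                    window_count += t
                    overlap += t and model[i][j]
            total = model_count + window_count
            score = 2 * overlap / total if total else 0.0
            if score >= min_score:
                hits.append((r, c, score))
    return hits


# Toy page: white background with two identical dark 3x3 "glyphs".
page = [[255] * 14 for _ in range(10)]
for top, left in [(2, 2), (5, 9)]:
    for r in range(top, top + 3):
        for c in range(left, left + 3):
            page[r][c] = 0

page_contours = contour_map(page)
roi_model = crop(page_contours, top=1, left=1, height=5, width=5)  # ROI around the first glyph
hits = find_similar(roi_model, page_contours)
print(hits)
```

No segmentation step is involved here: the model is compared against every window of the target contour map directly. A real system would need tolerance to scale, rotation, and ink degradation, which this exhaustive exact-overlap scan does not provide.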

Keywords

Graphic pattern · Pattern recognition · Digital libraries · Manuscripts · Graphic matching algorithm

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Nicola Barbuti (1)
  • Tommaso Caldarola (2)
  • Stefano Ferilli (3)
  1. Department of Humanities (DISUM), University of Bari Aldo Moro, Bari, Italy
  2. D.A.BI.MUS. Ltd., Spin Off of University of Bari Aldo Moro, Bari, Italy
  3. Department of Computer Science (DIB), University of Bari Aldo Moro, Bari, Italy
