Abstract
This chapter concerns applications of dense correspondences to images of a very different nature than those considered in previous chapters. Rather than images of natural or man-made scenes and objects, here, we deal with images of texts. We present a novel, dense correspondence-based approach to text image analysis instead of the more traditional approach of analysis at the character level (e.g., existing optical character recognition methods) or word level (the so called word spotting approach). We focus on the challenging domain of historical text image analysis. Such texts are handwritten and are often severely corrupted by noise and degradation, making them difficult to handle with existing methods. Our system is designed for the particular task of aligning such manuscript images to their transcripts. Our proposed alternative to performing this task manually is a system which directly matches the historical text image with a synthetic image rendered from the transcript. These matches are performed at the pixel level, by using SIFT flow applied to a novel per pixel representation. Our pipeline is robust to document degradation, variations between script styles and nonlinear image transformations. More importantly, this per pixel matching approach does not require prior learning of the particular script used in the documents being processed, and so can easily be applied to manuscripts of widely varying origins, languages, and characteristics.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Ahonen, T., Hadid, A., Pietikainen, M.: Face description with local binary patterns: application to face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 28(12), 2037–2041 (2006)
Al Azawi, M., Liwicki, M., Breuel, T.M.: WFST-based ground truth alignment for difficult historical documents with text modification and layout variations. In: IS&T/SPIE Electronic Imaging, International Society for Optics and Photonics (2013)
Asi, A., Rabaev, I., Kedem, K., El-Sana, J.: User-assisted alignment of arabic historical manuscripts. In: Proceedings of Workshop on Historical Document Imaging and Processing, pp. 22–28. ACM, New York (2011)
Barnes, C., Shechtman, E., Goldman, D.B., Finkelstein, A.: The generalized PatchMatch correspondence algorithm. In: Proceedings of ECCV (2010)
Dovgalecs, V., Burnett, A., Tranouez, P., Nicolas, S., Heutte, L.: Spot it! Finding words and patterns in historical documents. In: Proceedings of International Conference on Document Analysis and Recognition, pp. 1039–1043. IEEE, New York (2013)
Ebert, S., Larlus, D., Schiele, B.: Extracting structures in image collections for object recognition. In: Proceedings of ECCV (2010)
Fischer, A., Frinken, V., Fornés, A., Bunke, H.: Transcription alignment of Latin manuscripts using hidden Markov models. In: Proceedings of HIP (2011)
Guillaumin, M., Verbeek, J., Schmid, C., Lear, I., Kuntzmann, L.: Is that you? Metric learning approaches for face identification. In: Proceedings of ICCV (2009)
HaCohen, Y., Shechtman, E., Goldman, D.B., Lischinski, D.: Non-rigid dense correspondence with applications for image enhancement. ACM Trans. Graph. 30(4), 70:1–70:9 (2011)
Hassner, T., Rehbein, M., Stokes, P.A., Wolf, L.: Computation and palaeography: potentials and limits. Dagstuhl Manifestos 2(1), 14–35 (2013)
Hassner, T., Wolf, L., Dershowitz, N.: OCR-free transcript alignment. In: Proceedings of International Conference on Document Analysis and Recognition, pp. 1310–1314 (2013)
Heikkilä, M., Pietikäinen, M., Schmid, C.: Description of interest regions with center-symmetric local binary patterns. In: Indian Conference Computer Vision, Graphics and Image Processing (2006)
Hobby, J.D.: Matching document images with ground truth. Int. J. Doc. Anal. Recognit. 1(1), 52–61 (1998)
Holmes, M.: The UVic image markup tool project. Available: http://tapor.uvic.ca/~mholmes/image_markup (2008)
Huang, C., Srihari, S.N.: Mapping transcripts to handwritten text. In: Proceedings of the 10th International Workshop on Frontiers in Handwriting Recognition, pp. 15–20 (2006)
Jose, D., Bhardwaj, A., Govindaraju, V.: Transcript mapping for handwritten English documents. In: Yanikoglu, B.A., Berkner, K. (eds.) DRR, SPIE Proceedings, vol. 6815, SPIE (2008)
Kellokumpu, V., Zhao, G., Pietikainen, M.: Human activity recognition using a dynamic texture based method. In: Proceedings of BMVC (2008)
Korman, S., Avidan, S.: Coherency sensitive hashing. In: Proceedings of the IEEE International Conference on Computer Vision (2011)
Kornfield, E.M., Manmatha, R., Allan, J.: Text alignment with handwritten documents. In: Proceedings of Document Image Analysis for Libraries (DIAL), pp. 195–211. IEEE Computer Society, Cambridge (2004)
Kovesi, P.: Fast almost-Gaussian filtering. In: Proceedings of International Conference on Digital Image Computing: Techniques and Applications, pp. 121–125 (2010)
Kuster, M., Ludwig, C., Al-Hajj, Y., Selig, T.: Textgrid provenance tools for digital humanities ecosystems. In: Proceedings of Conference on Digital Ecosystems and Technologies Conference, pp. 317–323. IEEE, New York (2011)
Lavrenko, V., Rath, T.M., Manmatha, R.: Holistic word recognition for handwritten historical documents. In: Proceedings of Document Image Analysis for Libraries (DIAL), pp. 278–287 (2004)
Liu, C., Yuen, J., Torralba, A.: Sift flow: dense correspondence across scenes and its applications. IEEE Trans. Pattern Anal. Mach. Intell. 33(5), 978–994 (2011)
Lowe, D.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)
Ojala, T., Pietikainen, M., Harwood, D.: A comparative-study of texture measures with classification based on feature distributions. Pattern Recognition 29(1), 51–59 (1996)
Ojala, T., Pietikäinen, M., Mäenpää, T.: A generalized local binary pattern operator for multiresolution gray scale and rotation invariant texture classification. In: Proceedings of ICAPR (2001)
Ojala, T., Pietikäinen, M., Mäenpää, T.: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. Pattern Anal. Mach. Intell. 24(7), 971–987 (2002)
Rabaev, I., Biller, O., El-Sana, J., Kedem, K., Dinstein, I.: Case study in Hebrew character searching. In: Proceedings of International Conference on Document Analysis and Recognition, pp. 1080–1084. IEEE, New York (2011)
Rothfeder, J.L., Manmatha, R., Rath, T.M.: Aligning transcripts to automatically segmented handwritten manuscripts. In: Bunke, H., Spitz, A.L. (eds.) Document Analysis Systems. Lecture Notes in Computer Science, vol. 3872, pp. 84–95. Springer, Berlin (2006)
Sadeh, G., Wolf, L., Hassner, T., Dershowitz, N., Ben-Ezra, D.S., Ben-Ezra Stökl, D.: Viral transcription alignment. In: Proceedings of International Conference on Document Analysis and Recognition (2015)
Sevilla-Lara, L., Learned-Miller., E.: Distribution fields for tracking. In: Proceedings of CVPR (2012)
Shechtman, E., Irani, M.: Matching local self-similarities across images and videos. In: Proceedings of CVPR, pp. 1–8 (2007). doi:10.1109/CVPR.2007.383198
Terras, M., Cayless, H., Noel, W.: Text-image linking environment (TILE). Available: http://mith.umd.edu/tile (2009)
Tomai, C.I., Zhang, B., Govindaraju, V.: Transcript mapping for historic handwritten document images. In: Frontiers in Handwriting Recognition, pp. 413–418 (2002)
Vedaldi, A., Fulkerson, B.: Vlfeat: an open and portable library of computer vision algorithms. In: Proceedings of International Conference on Multimedia, pp. 1469–1472 (2010)
Wei, H., Gao, G.: A keyword retrieval system for historical Mongolian document images. Int. J. Doc. Anal. Recognit. 17(1), 33–45 (2014)
Wolf, L., Hassner, T., Taigman, Y.: Descriptor based methods in the wild. In: Post-ECCV Faces in Real-Life Images Workshop (2008)
Wolf, L., Hassner, T., Taigman, Y.: Effective unconstrained face recognition by combining multiple descriptors and learned background statistics. Trans. Pattern Anal. Mach. Intell. 33(10), 1978–1990 (2011)
Wolf, L., Littman, R., Mayer, N., German, T., Dershowitz, N., Shweka, R., Choueka, Y.: Identifying join candidates in the Cairo Genizah. Int. J. Comput. Vis. 94(1), 118–135 (2011)
Yin, F., Wang, Q.F., Liu, C.L.: Integrating geometric context for text alignment of handwritten Chinese documents. In: Proceedings of International Conference on Frontiers in Handwriting Recognition, pp. 7–12. IEEE, New York (2010)
Zhang, L., Chu, R., Xiang, S., Liao, S., Li, S.: Face detection based on multi-block LBP representation. In: IAPR/IEEE International Conference on Biometrics (2007)
Zhang, J., Huang, K., Yu, Y., Tan, T.: Boosted local structured HOG-LBP for object localization. In: Proceedings of CVPR, pp. 1393–1400 (2011)
Zhu, B., Nakagawa, M.: Online handwritten Japanese text recognition by improving segmentation quality. In: Proceedings of 11th International Conference on Frontiers in Handwriting Recognition, Montreal, pp. 379–384 (2008)
Zimmermann, M., Bunke, H.: Automatic segmentation of the IAM off-line database for handwritten English text. In: Proceedings of ICPR, vol. 4, pp. 35–39 (2002)
Acknowledgements
MS Kaufmann A50 by courtesy of the Oriental Collection of the Library and Information Centre of the Hungarian Academy of Sciences. This research was initiated at the Dagstuhl Perspectives Workshop 12382, “Computation and Palaeography: Potentials and Limits” and seminar 14302 on “Digital Palaeography: New Machines and Old Texts.”
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Hassner, T., Wolf, L., Dershowitz, N., Sadeh, G., Stökl Ben-Ezra, D. (2016). Dense Correspondences and Ancient Texts. In: Hassner, T., Liu, C. (eds) Dense Image Correspondences for Computer Vision. Springer, Cham. https://doi.org/10.1007/978-3-319-23048-1_12
Download citation
DOI: https://doi.org/10.1007/978-3-319-23048-1_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23047-4
Online ISBN: 978-3-319-23048-1
eBook Packages: EngineeringEngineering (R0)