Abstract

This chapter concerns applications of dense correspondences to images of a very different nature than those considered in previous chapters. Rather than images of natural or man-made scenes and objects, here we deal with images of texts. We present a novel, dense correspondence-based approach to text image analysis, in place of the more traditional analysis at the character level (e.g., existing optical character recognition methods) or at the word level (the so-called word spotting approach). We focus on the challenging domain of historical text image analysis. Such texts are handwritten and are often severely corrupted by noise and degradation, making them difficult to handle with existing methods. Our system is designed for the particular task of aligning such manuscript images to their transcripts. As an alternative to performing this task manually, we propose a system which directly matches the historical text image with a synthetic image rendered from the transcript. These matches are computed at the pixel level, by applying SIFT flow to a novel per-pixel representation. Our pipeline is robust to document degradation, variations between script styles, and nonlinear image transformations. More importantly, this per-pixel matching approach does not require prior learning of the particular script used in the documents being processed, and so can easily be applied to manuscripts of widely varying origins, languages, and characteristics.
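
To make the idea of pixel-level matching between a manuscript image and a rendered transcript concrete, the following is a minimal sketch under strong assumptions, not the chapter's pipeline. It replaces SIFT flow with a brute-force nearest-neighbour search over raw intensity patches (no smoothness term, no SIFT descriptors), assumes both images have the same size, and uses hypothetical names (`patch`, `transfer_labels`) and toy data invented for illustration. Each manuscript pixel receives the character label of its best-matching pixel in the synthetic image.

```python
def patch(img, y, x, radius=1):
    """Flat (2r+1)x(2r+1) neighbourhood of (y, x); coordinates are clamped
    at the image border. This stands in for a per-pixel descriptor."""
    h, w = len(img), len(img[0])
    return [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)]


def transfer_labels(manuscript, synthetic, synthetic_labels, search=3, radius=1):
    """For every manuscript pixel, find the synthetic pixel with the most
    similar patch inside a (2*search+1)^2 window and copy its label.
    Assumption: both images have identical dimensions."""
    h, w = len(manuscript), len(manuscript[0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            target = patch(manuscript, y, x, radius)
            best_cost, best = float("inf"), (y, x)
            for yy in range(max(0, y - search), min(h, y + search + 1)):
                for xx in range(max(0, x - search), min(w, x + search + 1)):
                    cand = patch(synthetic, yy, xx, radius)
                    # L1 distance between the two per-pixel descriptors
                    cost = sum(abs(a - b) for a, b in zip(target, cand))
                    if cost < best_cost:
                        best_cost, best = cost, (yy, xx)
            out[y][x] = synthetic_labels[best[0]][best[1]]
    return out


# Toy data: one vertical stroke, rendered at column 2 in the synthetic image
# and "written" two pixels to the right (column 4) in the manuscript image.
H, W = 7, 9
synthetic = [[1 if (1 <= y <= 5 and x == 2) else 0 for x in range(W)]
             for y in range(H)]
manuscript = [[1 if (1 <= y <= 5 and x == 4) else 0 for x in range(W)]
              for y in range(H)]
# Label map of the synthetic image: stroke pixels belong to character "a".
labels = [["a" if synthetic[y][x] else "" for x in range(W)] for y in range(H)]

result = transfer_labels(manuscript, synthetic, labels)
```

In this toy case the displaced stroke in the manuscript picks up the label of the rendered stroke, while background pixels stay unlabeled. Matching each pixel independently is fragile on real, degraded pages; a method such as SIFT flow additionally encourages neighbouring pixels to move together.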

Acknowledgements

MS Kaufmann A50 by courtesy of the Oriental Collection of the Library and Information Centre of the Hungarian Academy of Sciences. This research was initiated at the Dagstuhl Perspectives Workshop 12382, “Computation and Palaeography: Potentials and Limits” and seminar 14302 on “Digital Palaeography: New Machines and Old Texts.”

Corresponding author

Correspondence to Tal Hassner.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Hassner, T., Wolf, L., Dershowitz, N., Sadeh, G., Stökl Ben-Ezra, D. (2016). Dense Correspondences and Ancient Texts. In: Hassner, T., Liu, C. (eds) Dense Image Correspondences for Computer Vision. Springer, Cham. https://doi.org/10.1007/978-3-319-23048-1_12

  • DOI: https://doi.org/10.1007/978-3-319-23048-1_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23047-4

  • Online ISBN: 978-3-319-23048-1

  • eBook Packages: Engineering, Engineering (R0)
