A Hybrid Binarization Technique for Document Images

  • Chapter
Learning Structure and Schemas from Documents

Part of the book series: Studies in Computational Intelligence (SCI, volume 375)

Abstract

This chapter presents a binarization technique designed specifically for historical document images. Existing binarization techniques either search for a single appropriate global threshold or adapt a local threshold to each image area in order to remove smears, stains, uneven illumination, and similar degradations. The hybrid approach presented here first applies a global thresholding technique and then identifies the image areas that are most likely to still contain noise. Each of these areas is re-processed separately to improve the quality of the binarization. Evaluation results comparing the technique with existing ones indicate that the proposed approach is effective and combines the advantages of global and local thresholding. Finally, directions for future research are outlined.
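
The two-stage idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say which global method is used, how noisy areas are detected, or how they are re-processed, so everything specific below is an assumption made for illustration: Otsu's method serves as the global threshold, the image is split into fixed-size tiles, a tile is treated as "likely still noisy" when more than a given fraction of its pixels came out as ink, and such tiles are re-thresholded with Otsu's method computed on the tile alone. The names `hybrid_binarize`, `tile`, and `max_ink_ratio` are hypothetical and do not come from the chapter.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the Otsu threshold t for a uint8 image; pixels <= t form the dark class."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                   # probability of the dark class up to level t
    mu = np.cumsum(prob * np.arange(256))     # cumulative mean up to level t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    valid = denom > 1e-12                     # skip levels where one class is empty
    sigma_b = np.zeros(256)
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))            # level maximizing between-class variance


def hybrid_binarize(gray, tile=32, max_ink_ratio=0.5):
    """Global pass followed by a local pass on suspicious tiles (illustrative only).

    Assumptions, not taken from the chapter: fixed square tiles, an ink-ratio
    test to flag noisy tiles, and per-tile Otsu for re-processing.
    Returns a boolean array where True marks ink (foreground).
    """
    # Stage 1: one global threshold for the whole page.
    ink = gray <= otsu_threshold(gray)

    # Stage 2: re-process tiles that are implausibly dark after stage 1.
    height, width = gray.shape
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            block = ink[y:y + tile, x:x + tile]
            if block.mean() > max_ink_ratio:
                sub = gray[y:y + tile, x:x + tile]
                ink[y:y + tile, x:x + tile] = sub <= otsu_threshold(sub)
    return ink
```

Calling `hybrid_binarize(gray)` on a 2-D `uint8` grayscale array returns a boolean ink mask. The ink-ratio test is only one possible heuristic: it flags a tile that a stain has turned mostly black, but it would also flag a legitimately dark region such as a solid illustration, so both `tile` and `max_ink_ratio` would need tuning for real material.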

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Sokratis, V., Kavallieratou, E., Paredes, R., Sotiropoulos, K. (2011). A Hybrid Binarization Technique for Document Images. In: Biba, M., Xhafa, F. (eds) Learning Structure and Schemas from Documents. Studies in Computational Intelligence, vol 375. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22913-8_8

  • DOI: https://doi.org/10.1007/978-3-642-22913-8_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22912-1

  • Online ISBN: 978-3-642-22913-8

  • eBook Packages: Engineering, Engineering (R0)
