A Comparison of Some Morphological Filters for Improving OCR Performance

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9082)

Abstract

Studying discrete space representations has recently led to the development of novel morphological operators. To date, no study has evaluated the performance of these operators on a specific application. This article compares the ability of several morphological operators, both old and new, to improve OCR performance when used as preprocessing filters. We design an experiment using the Tesseract OCR engine on binary images degraded with a realistic, document-specific noise model. We assess the performance of several morphological filters acting in complex, graph, and vertex spaces, including area filters. The experiment reveals the good overall performance of complex and graph filters. Mean squared error (MSE) measurements were also taken to evaluate the denoising capability of these filters, and they again confirm the performance of both complex and graph filtering.
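To make the evaluation idea concrete, here is a minimal Python sketch of one filter family named in the abstract (a binary area opening) together with an MSE measurement against the clean image. It is an illustration under stated assumptions, not the authors' implementation: the function names `area_opening` and `mse`, the 4-connectivity, the tiny test image, and the hand-placed noise pixel are all hypothetical, and the single speck is only a stand-in for the document noise model used in the paper. The OCR step with Tesseract is left out.

```python
from collections import deque


def mse(a, b):
    """Mean squared error between two equally sized 2-D images (lists of rows)."""
    total = sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def area_opening(img, min_area):
    """Remove 4-connected foreground (1) components with fewer than min_area pixels.

    Assumption: foreground is 1, background is 0, connectivity is 4.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search collects one connected component.
            comp, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                comp.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Components below the area threshold are erased from the output.
            if len(comp) < min_area:
                for cy, cx in comp:
                    out[cy][cx] = 0
    return out


# Hypothetical 4x5 binary "document" with one solid block of foreground.
clean = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
noisy = [row[:] for row in clean]
noisy[0][4] = 1  # hand-placed isolated speck, a stand-in for real degradation

filtered = area_opening(noisy, min_area=2)
print(mse(clean, noisy), mse(clean, filtered))  # prints "0.05 0.0"
```

In this toy case the filter removes the speck and restores the clean image exactly, so the MSE drops from 0.05 to 0. On realistic degradations the threshold `min_area` trades noise removal against the loss of small glyph parts such as dots and diacritics, which is why an OCR-based measure is useful alongside MSE.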

Author information

Corresponding author

Correspondence to Laurent Mennillo.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Mennillo, L., Cousty, J., Najman, L. (2015). A Comparison of Some Morphological Filters for Improving OCR Performance. In: Benediktsson, J., Chanussot, J., Najman, L., Talbot, H. (eds) Mathematical Morphology and Its Applications to Signal and Image Processing. ISMM 2015. Lecture Notes in Computer Science, vol 9082. Springer, Cham. https://doi.org/10.1007/978-3-319-18720-4_12

  • DOI: https://doi.org/10.1007/978-3-319-18720-4_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18719-8

  • Online ISBN: 978-3-319-18720-4

  • eBook Packages: Computer Science, Computer Science (R0)
