Color Components Clustering and Power Law Transform for the Effective Implementation of Character Recognition in Color Images

  • Conference paper
Progress in Intelligent Computing Techniques: Theory, Practice, and Applications

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 518)

Abstract

This paper addresses the recognition of words in color images. The main challenge with such images is distortion caused by low illumination, which makes binarization difficult. A novel method based on clustering of color components and the Power Law Transform is proposed for binarizing camera-captured color images containing text. The proposed technique improves binarization and thereby increases the recognition rate on camera-captured color images. Similar color components are clustered together, and Canny edge detection is then applied to each resulting image. A union operation is performed on the edge images of the different color components. To differentiate text from background, a separate rectangular box is fitted to every edge, and the Power Law Transform is applied to each edge. The value of gamma is fixed at 1.5, which was found to be optimal, for all operations. Extensive experiments on the datasets show that the proposed algorithm performs well. The best recognition rates previously reported for the ICDAR Robust Reading Challenge are 62.5% with TH-OCR after pre-processing and 64.3% with ABBYY Fine Reader after pre-processing. The proposed algorithm achieves a recognition rate of 79.2% using ABBYY Fine Reader. The method can also be applied to born-digital images, for which it achieves a recognition rate of 72.5%, compared with the 63.4% previously reported in the literature using ABBYY Fine Reader.
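Two of the steps named in the abstract, the union of per-component edge maps and the Power Law Transform with gamma = 1.5, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the normalization of intensities to [0, 1] before exponentiation, the 8-bit range, and the toy input values are all assumptions made for illustration, and the clustering and Canny stages are omitted.

```python
def power_law_transform(gray, gamma=1.5, max_value=255):
    """Apply s = max_value * (r / max_value) ** gamma to every pixel.

    `gray` is a list of rows of integer intensities in [0, max_value].
    Normalizing to [0, 1] first is an assumption, not taken from the paper.
    """
    return [
        [round(max_value * (pixel / max_value) ** gamma) for pixel in row]
        for row in gray
    ]


def union_edge_maps(edge_maps):
    """Pixel-wise OR of equally sized binary edge maps (values 0 or 1)."""
    first = edge_maps[0]
    height, width = len(first), len(first[0])
    return [
        [int(any(m[y][x] for m in edge_maps)) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical 2x3 grayscale patch; gamma = 1.5 is the value fixed in the abstract.
patch = [[0, 64, 128], [192, 255, 32]]
enhanced = power_law_transform(patch, gamma=1.5)

# Hypothetical binary edge maps, one per color component.
red_edges = [[1, 0, 0], [0, 0, 0]]
green_edges = [[0, 0, 1], [0, 0, 0]]
blue_edges = [[0, 0, 0], [0, 1, 0]]
combined = union_edge_maps([red_edges, green_edges, blue_edges])
```

With gamma greater than 1, the transform leaves pure black and pure white unchanged and darkens every intermediate intensity, which stretches the contrast in the brighter range. The union keeps an edge pixel if any color component detected it.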

Author information

Corresponding author

Correspondence to Ravichandran Giritharan.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Giritharan, R., Ramakrishnan, A.G. (2018). Color Components Clustering and Power Law Transform for the Effective Implementation of Character Recognition in Color Images. In: Sa, P., Sahoo, M., Murugappan, M., Wu, Y., Majhi, B. (eds) Progress in Intelligent Computing Techniques: Theory, Practice, and Applications. Advances in Intelligent Systems and Computing, vol 518. Springer, Singapore. https://doi.org/10.1007/978-981-10-3373-5_12

  • DOI: https://doi.org/10.1007/978-981-10-3373-5_12

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-3372-8

  • Online ISBN: 978-981-10-3373-5

  • eBook Packages: Engineering, Engineering (R0)
