Color Components Clustering and Power Law Transform for the Effective Implementation of Character Recognition in Color Images

Giritharan, Ravichandran; Ramakrishnan, A. G.

doi:10.1007/978-981-10-3373-5_12


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 518)


Abstract

This chapter addresses the recognition of words in color images. The main challenge with such images is distortion caused by low illumination, which makes binarization difficult. A novel method combining the clustering of color components with the Power Law Transform is proposed for binarizing camera-captured color images that contain text. The proposed technique improves binarization and thereby raises the recognition rate for camera-captured color images. Similar color components are clustered together, and Canny edge detection is then applied to each image. A union operation is performed on the images of the different color components. To differentiate the text from the background, a separate rectangular box is fitted to every edge, and the Power Law Transform is applied to each edge. Gamma is fixed at its optimum value of 1.5 for these operations. Exhaustive experiments on the datasets show that the proposed algorithm performs well. The best previously reported recognition rates in the ICDAR Robust Reading Challenge are 62.5% with TH-OCR and 64.3% with ABBYY Fine Reader, both after pre-processing. The proposed algorithm achieves a recognition rate of 79.2% using ABBYY Fine Reader. The method can also be applied to born-digital images, for which it achieves a recognition rate of 72.5%, compared with the 63.4% previously reported in the literature using ABBYY Fine Reader.
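Two of the steps named in the abstract, the Power Law Transform and the union of per-component edge maps, can be illustrated with a minimal sketch. The abstract does not give the authors' implementation, so the code below assumes the textbook form of the transform, s = 255 * (r / 255) ** gamma on 8-bit intensities, and a plain pixel-wise OR for the union. The function names and the list-of-rows image representation are hypothetical choices made for illustration. Only the value gamma = 1.5 comes from the abstract.

```python
from typing import List

# Hypothetical representation: an image is a list of rows of integers.
Image = List[List[int]]


def power_law_transform(image: Image, gamma: float = 1.5, max_value: int = 255) -> Image:
    """Apply s = max_value * (r / max_value) ** gamma to every pixel.

    Assumption: this is the standard power-law (gamma) transform; the
    abstract only states that gamma is fixed at 1.5.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    # Precompute a lookup table so each pixel costs one index operation.
    lut = [round(max_value * (r / max_value) ** gamma) for r in range(max_value + 1)]
    return [[lut[pixel] for pixel in row] for row in image]


def union_edge_maps(edge_maps: List[Image]) -> Image:
    """Pixel-wise logical OR of equally sized binary (0/1) edge maps.

    Assumption: the "union operation" in the abstract is a per-pixel OR
    over the edge maps of the individual color components.
    """
    if not edge_maps:
        raise ValueError("at least one edge map is required")
    rows, cols = len(edge_maps[0]), len(edge_maps[0][0])
    return [
        [int(any(m[y][x] for m in edge_maps)) for x in range(cols)]
        for y in range(rows)
    ]


if __name__ == "__main__":
    gray = [[0, 64, 128, 255]]
    print(power_law_transform(gray))  # mid-tones are darkened, end points stay fixed

    edges_a = [[1, 0, 0], [0, 0, 0]]
    edges_b = [[0, 0, 1], [0, 1, 0]]
    print(union_edge_maps([edges_a, edges_b]))
```

With gamma greater than 1, the transform leaves pure black and pure white unchanged and darkens the intermediate intensities. How this interacts with the rectangular boxes and the subsequent binarization is not specified in the abstract, so the sketch stops short of those steps.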



Author information

Authors and Affiliations

E.G.S. Pillay Engineering College, Nagapattinam, Tamil Nadu, India
Ravichandran Giritharan
Medical Intelligence and Language Engineering Laboratory, Indian Institute of Science, Bengaluru, Karnataka, India
A. G. Ramakrishnan


Corresponding author

Correspondence to Ravichandran Giritharan.

Editor information

Editors and Affiliations

Dept. of Computer Science & Engineering, National Institute of Technology, Rourkela, Odisha, India
Pankaj Kumar Sa
Dept. of Computer Science & Engineering, National Institute of Technology, Rourkela, Odisha, India
Manmath Narayan Sahoo
School of Mechatronics Engineering, Universiti Malaysia Perlis (UniMAP), Arau, Perlis, Malaysia
M. Murugappan
Lecturer, The University of Exeter, Exeter, Devon, United Kingdom
Yulei Wu
Dept. of Computer Science & Engineering, National Institute of Technology, Rourkela, Odisha, India
Banshidhar Majhi


About this paper

Cite this paper

Giritharan, R., Ramakrishnan, A.G. (2018). Color Components Clustering and Power Law Transform for the Effective Implementation of Character Recognition in Color Images. In: Sa, P., Sahoo, M., Murugappan, M., Wu, Y., Majhi, B. (eds) Progress in Intelligent Computing Techniques: Theory, Practice, and Applications. Advances in Intelligent Systems and Computing, vol 518. Springer, Singapore. https://doi.org/10.1007/978-981-10-3373-5_12


DOI: https://doi.org/10.1007/978-981-10-3373-5_12
Published: 13 July 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3372-8
Online ISBN: 978-981-10-3373-5
eBook Packages: Engineering, Engineering (R0)
