Color Components Clustering and Power Law Transform for the Effective Implementation of Character Recognition in Color Images
This paper addresses the recognition of words in color images. The main challenge with such images is distortion caused by low illumination, which makes binarization difficult. A novel method that combines clustering of color components with the Power Law Transform is proposed for binarizing camera-captured color images that contain text. The proposed technique improves binarization and thereby increases the recognition rate on camera-captured color images. Similar color components are clustered together, and Canny edge detection is then applied to each image. A union operation is performed on the images of the different color components. To differentiate text from background, a separate rectangular box is placed around every edge, and the Power Law Transform is applied to each edge. The optimum value of gamma is fixed at 1.5 for all operations. Extensive experiments on the datasets show that the proposed algorithm performs well. The best-performing algorithms in the ICDAR Robust Reading Challenge achieve recognition rates of 62.5% with TH-OCR and 64.3% with ABBYY Fine Reader, both after pre-processing. The proposed algorithm achieves a recognition rate of 79.2% using ABBYY Fine Reader. The proposed method can also be applied to born-digital images, for which it achieves a recognition rate of 72.5%, compared with 63.4% reported earlier in the literature using ABBYY Fine Reader.
Keywords: Color component clustering, Canny edge detection, Power law transform, Binarization, Word recognition
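Two of the steps named in the abstract, the Power Law Transform with gamma fixed at 1.5 and the union of per-component edge images, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the scaling of 8-bit intensities to [0, 1] before exponentiation, and the representation of edge images as boolean arrays are all assumptions made for illustration. The color clustering and Canny edge detection steps are not shown.

```python
import numpy as np

GAMMA = 1.5  # value the abstract reports as optimal


def power_law_transform(image, gamma=GAMMA):
    """Apply s = r ** gamma to an 8-bit image.

    Assumption: intensities are scaled to [0, 1] before exponentiation
    and rescaled to [0, 255] afterwards; the abstract does not specify this.
    """
    normalized = np.asarray(image, dtype=np.float64) / 255.0
    transformed = np.power(normalized, gamma)
    return np.round(transformed * 255.0).astype(np.uint8)


def union_of_edge_maps(edge_maps):
    """Pixel-wise OR of boolean edge maps, one per color component.

    Assumption: each map is a same-shaped array whose nonzero entries
    mark edge pixels (for example, the output of an edge detector).
    """
    return np.logical_or.reduce([np.asarray(m, dtype=bool) for m in edge_maps])


# Hypothetical 2x2 grayscale patch and two hypothetical edge maps.
patch = np.array([[0, 64], [128, 255]], dtype=np.uint8)
enhanced = power_law_transform(patch)

edges_a = np.array([[True, False], [False, False]])
edges_b = np.array([[False, False], [False, True]])
combined = union_of_edge_maps([edges_a, edges_b])
```

With gamma greater than 1, black and white are left unchanged while mid-range intensities are pushed toward darker values, which stretches contrast in the brighter part of the range.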