Extraction and Identification of Manipuri and Mizo Texts from Scene and Document Images
Abstract
The content inside an image is exceptionally compelling. Text within an image is of special interest because, compared with other semantic content, it can be extracted relatively effectively. Text detection is the task of detecting and localizing the portions of an image that contain text. Manipuri and Mizo are the lingua francas of Manipur and Mizoram, respectively, two neighboring northeastern states of India. Manipuri is currently written in both the Meetei Mayek script and the Bengali script, while Mizo is written in the Roman script with a circumflex accent added to the vowels. In this work, we report on text detection in natural scene images and document images in Manipuri and Mizo. We compare two approaches to text detection: Maximally Stable Extremal Regions (MSER) coupled with the Stroke Width Transform (SWT), and the Efficient and Accurate Scene Text Detector (EAST). The detected text regions in both languages are passed to Optical Character Recognition (OCR), followed by a post-OCR spelling correction step. In our text detection experiments, EAST outperformed the MSER-with-SWT approach.
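The abstract does not say how the post-OCR spelling correction is carried out. As a rough illustration only, one common approach is to replace each OCR token with its nearest lexicon entry by edit distance. The sketch below assumes that approach. The function names, the `max_distance` threshold, and the toy lexicon of place names are all hypothetical and are not taken from the paper.

```python
import unicodedata


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def correct_word(word: str, lexicon: list[str], max_distance: int = 2) -> str:
    """Return the closest lexicon entry, or the word itself if none is close.

    Assumption: a plain nearest-neighbour lookup; the paper's actual
    correction method is not described in the abstract.
    """
    # NFC keeps an accented vowel such as "â" as a single code point, so a
    # precomposed and a combining-accent spelling compare as equal.
    word = unicodedata.normalize("NFC", word)
    entries = [unicodedata.normalize("NFC", w) for w in lexicon]
    if word in entries:
        return word
    # Break ties alphabetically so the result is deterministic.
    best = min(entries, key=lambda w: (edit_distance(word, w), w))
    return best if edit_distance(word, best) <= max_distance else word


# Hypothetical toy lexicon, for illustration only.
LEXICON = ["Aizawl", "Imphal", "Manipur", "Mizoram"]

if __name__ == "__main__":
    for token in ["Mizorarn", "Aizaw1", "xyz"]:
        print(token, "->", correct_word(token, LEXICON))
```

A distance threshold keeps tokens that are far from every lexicon entry unchanged, which avoids forcing out-of-vocabulary words onto an unrelated entry. A real system would need a full Manipuri or Mizo word list and probably a frequency-aware ranking.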
Keywords
Text detection, SWT, MSER, EAST, OCR, Manipuri, Mizo
Acknowledgments
This work is supported by the Scheme for Promotion of Academic and Research Collaboration (SPARC), Project Code: P995, No: SPARC/2018-2019/119/SL(IN), under MHRD, Govt. of India.