Headline Based Text Extraction from Outdoor Images
This article presents a scheme for extracting Bangla/Devnagari text from outdoor images. We first segment a color image using the fuzzy c-means algorithm. In Bangla and Devnagari script, characters may or may not be attached to the headline (the horizontal stroke that runs along the top of a word). Hence, after segmentation, headlines are detected in each connected component using morphological operations. The components attached or close to the detected headlines are then separated out. Finally, shape- and position-based filtering rules distinguish text from non-text components. Experiments on a dataset of 100 outdoor images containing Bangla and/or Devnagari text show satisfactory performance.
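The first two stages described above (fuzzy c-means color segmentation, then morphological headline detection) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the fuzzifier `m`, the iteration count, and the `min_run_fraction` threshold are all assumptions made for illustration, and the abstract does not state which structuring element or thresholds were used. Headline detection is approximated here by a morphological opening with a horizontal line, which keeps only horizontal foreground runs at least as long as the line.

```python
def memberships_for(pixels, centers, m):
    """Fuzzy memberships u[i][k] of pixel i in cluster k (each row sums to 1)."""
    power = 2.0 / (m - 1.0)
    result = []
    for p in pixels:
        # Euclidean distance to every centre, clamped to avoid division by zero.
        dists = [max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-9)
                 for c in centers]
        result.append([1.0 / sum((dk / dj) ** power for dj in dists)
                       for dk in dists])
    return result


def fuzzy_c_means(pixels, centers, m=2.0, iters=20):
    """Standard fuzzy c-means updates on colour vectors.

    pixels  : list of (r, g, b) tuples
    centers : initial cluster centres, supplied by the caller (assumed input)
    """
    centers = [tuple(float(v) for v in c) for c in centers]
    dims = len(pixels[0])
    for _ in range(iters):
        u = memberships_for(pixels, centers, m)
        updated = []
        for k in range(len(centers)):
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            updated.append(tuple(
                sum(w * p[d] for w, p in zip(weights, pixels)) / total
                for d in range(dims)))
        centers = updated
    return centers, memberships_for(pixels, centers, m)


def detect_headline(component, min_run_fraction=0.6):
    """Opening of a binary component with a 1 x L horizontal line.

    component : list of rows of 0/1 for one connected component
    L = min_run_fraction * width (hypothetical threshold, not from the paper).
    Returns a mask marking the pixels that survive the opening.
    """
    width = len(component[0])
    length = max(1, int(min_run_fraction * width))
    mask = [[0] * width for _ in component]
    for y, row in enumerate(component):
        x = 0
        while x < width:
            if not row[x]:
                x += 1
                continue
            start = x
            while x < width and row[x]:
                x += 1
            if x - start >= length:  # the run can contain the line element
                for i in range(start, x):
                    mask[y][i] = 1
    return mask


def headline_rows(mask):
    """Row indices that contain at least one surviving headline pixel."""
    return [y for y, row in enumerate(mask) if any(row)]


# Toy component: a full-width top stroke with short strokes hanging below it.
glyph = [
    [1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 1, 0],
]
print(headline_rows(detect_headline(glyph)))  # [0]
```

In a real pipeline each pixel would be assigned to its highest-membership cluster, connected components would be extracted per cluster, and `detect_headline` would be applied to each component; those steps, and the later shape- and position-based filtering, are omitted here.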