A Technique for Segmentation of Gurmukhi Text
This paper describes a technique for segmenting text in machine-printed Gurmukhi script documents. Segmentation of Gurmukhi script is difficult mainly because of characteristics of the script itself: characters are connected along the headline, two or more characters in a word can have intersecting minimum bounding rectangles, some characters consist of multiple components, and touching characters occur even in clean documents. The segmentation problems unique to the Gurmukhi script, such as horizontally overlapping text segments and touching characters in various zonal positions within a word, are discussed in detail and a solution is proposed.
Keywords: text segmentation, Gurmukhi script
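The abstract does not spell out the proposed technique, so the following is only an illustrative sketch of why headline connectivity complicates segmentation. It is not the authors' method. It uses a generic projection-profile approach: the headline is taken to be the band of rows with the highest ink count, and characters are separated at empty columns once that band is ignored. The function names, the `frac` threshold, and the toy word image are all assumptions made for this example.

```python
# Illustrative projection-profile sketch (an assumption, not the paper's algorithm).
# A word image is a list of rows; each row is a list of 0 (background) / 1 (ink).

def row_profile(img):
    """Horizontal projection: ink count of every row."""
    return [sum(row) for row in img]

def find_headline(img, frac=0.8):
    """Return (top, bottom), the inclusive row band around the densest row.

    Neighbouring rows join the band while their ink count is at least
    frac * peak. The value of frac is a hypothetical tuning parameter.
    """
    profile = row_profile(img)
    peak = max(profile)
    top = bottom = profile.index(peak)
    while top > 0 and profile[top - 1] >= frac * peak:
        top -= 1
    while bottom < len(profile) - 1 and profile[bottom + 1] >= frac * peak:
        bottom += 1
    return top, bottom

def split_columns(img, top, bottom):
    """Vertical projection that skips the headline band.

    Returns inclusive (start, end) column runs that contain ink.
    """
    width = len(img[0])
    cols = [
        sum(img[r][c] for r in range(len(img)) if not top <= r <= bottom)
        for c in range(width)
    ]
    runs, start = [], None
    for c, count in enumerate(cols):
        if count and start is None:
            start = c
        elif not count and start is not None:
            runs.append((start, c - 1))
            start = None
    if start is not None:
        runs.append((start, width - 1))
    return runs

# Toy word: two strokes joined only by a headline in row 1.
word = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, 1, 0, 0],
]
headline = find_headline(word)
segments = split_columns(word, *headline)
```

With the headline included, the toy word has ink in every column and cannot be split by a vertical projection; with the headline band skipped, the two strokes separate. A simple column split like this fails in exactly the cases the abstract highlights, namely characters with intersecting bounding rectangles and touching characters, which is why those cases need dedicated handling.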