Abstract
This paper presents the design of a full-fledged OCR system for printed Kannada text. Machine recognition of Kannada characters is difficult because of the similarity in the shapes of different characters, the complexity of the script and the non-uniqueness in the representation of diacritics. The document image is subjected to line segmentation, word segmentation and zone detection. From the zonal information, base characters, vowel modifiers and consonant conjuncts are separated. A knowledge-based approach is employed for recognising the base characters. Various features are employed for recognising the characters, including the coefficients of the Discrete Cosine Transform, the Discrete Wavelet Transform and the Karhunen-Loève Transform. These features are fed to different classifiers. Structural features are used in the subsequent levels to discriminate between confused characters; their use increases the recognition rate from 93% to 98%. Apart from the classical pattern classification technique of nearest neighbour, Artificial Neural Network (ANN) based classifiers such as Back Propagation and Radial Basis Function (RBF) networks have also been studied. The ANN classifiers are trained in supervised mode using the transform features. The highest recognition rate of 99% is obtained with the RBF network, using second-level approximation coefficients of Haar wavelets as the features, on presegmented base characters.
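As a rough illustration of two ingredients named in the abstract, the sketch below computes second-level Haar approximation coefficients of a character image and classifies them with a nearest-neighbour rule. It is not the authors' implementation: the 8x8 glyph size, the toy glyphs, the labels and all function names are assumptions made for the example, and nearest neighbour stands in for the RBF network that gave the best reported result.

```python
from math import dist


def haar_approx(image):
    """One level of the 2-D Haar approximation (low-pass) band.

    Each output value is the sum of a 2x2 block divided by 2, i.e. the
    orthonormal Haar low-pass filter applied along rows and then columns.
    Assumes the image has even height and width.
    """
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 2.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def haar_level2_features(image):
    """Flatten the second-level approximation coefficients into a vector."""
    level2 = haar_approx(haar_approx(image))
    return [value for row in level2 for value in row]


def nearest_neighbour(features, training_set):
    """Return the label of the training sample closest in Euclidean distance.

    training_set is a list of (feature_vector, label) pairs.
    """
    _, best_label = min(training_set, key=lambda pair: dist(features, pair[0]))
    return best_label


# Hypothetical 8x8 binary glyphs standing in for size-normalised characters.
LEFT_BAR = [[1] * 4 + [0] * 4 for _ in range(8)]
TOP_BAR = [[1] * 8 for _ in range(4)] + [[0] * 8 for _ in range(4)]

TRAINING_SET = [
    (haar_level2_features(LEFT_BAR), "left"),
    (haar_level2_features(TOP_BAR), "top"),
]

# A noisy copy of LEFT_BAR with one background pixel flipped.
NOISY_LEFT = [row[:] for row in LEFT_BAR]
NOISY_LEFT[0][7] = 1

PREDICTED = nearest_neighbour(haar_level2_features(NOISY_LEFT), TRAINING_SET)
```

Two levels of approximation reduce an 8x8 glyph to four coefficients, so a single flipped pixel shifts the feature vector only slightly and the noisy glyph is still matched to its clean template. A real system would use larger normalised character images and many training samples per class.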
Keywords
- Radial Basis Function
- Discrete Cosine Transform
- Discrete Wavelet Transform
- Near Neighbour
- Base Character
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vijay Kumar, B., Ramakrishnan, A.G. (2002). Machine Recognition of Printed Kannada Text. In: Lopresti, D., Hu, J., Kashi, R. (eds) Document Analysis Systems V. DAS 2002. Lecture Notes in Computer Science, vol 2423. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45869-7_4
DOI: https://doi.org/10.1007/3-540-45869-7_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44068-0
Online ISBN: 978-3-540-45869-2
eBook Packages: Springer Book Archive