Abstract
Script identification from document images is an essential step before a script-specific OCR engine can be chosen in a multilingual, multi-script country such as India. The problem becomes harder when the document images are handwritten. Several techniques have been developed for the handwritten script identification (HSI) problem, and work is ongoing. However, dimensionality reduction of the feature set for script identification has not been addressed in the literature to date. This paper presents a statistical performance analysis of different attribute selection techniques in a multi-classifier environment for the HSI problem on Indic scripts. A greedy attribute selection (GAS) technique for the HSI problem is also proposed. The results are encouraging given the complexity of handwritten Indic scripts.
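The abstract does not describe how the proposed GAS technique works, so the following is only a minimal sketch of generic greedy forward attribute selection, not the authors' method. It assumes a wrapper-style setup in which a scoring function rates a candidate feature subset; the toy data, the leave-one-out nearest-centroid scorer, and all function names are hypothetical and chosen purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def greedy_forward_selection(
    n_features: int,
    score: Callable[[List[int]], float],
    min_gain: float = 0.0,
) -> Tuple[List[int], float]:
    """Add one feature at a time, keeping the one that raises the score most.

    Stops as soon as the best candidate improves the score by min_gain or less.
    """
    selected: List[int] = []
    best_score = float("-inf")
    remaining = set(range(n_features))
    while remaining:
        # Score every subset formed by adding one remaining feature.
        candidates = {f: score(selected + [f]) for f in sorted(remaining)}
        # Ties go to the lowest feature index (max returns the first maximum).
        best_f = max(sorted(candidates), key=candidates.get)
        if candidates[best_f] - best_score <= min_gain:
            break
        selected.append(best_f)
        remaining.remove(best_f)
        best_score = candidates[best_f]
    return selected, best_score


def loo_centroid_accuracy(
    X: Sequence[Sequence[float]], y: Sequence[int], features: List[int]
) -> float:
    """Leave-one-out accuracy of a nearest-centroid classifier on a feature subset."""
    if not features:
        return 0.0
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        # Accumulate per-class sums over all samples except the held-out one.
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            vec = [row[f] for f in features]
            prev = sums.get(label, [0.0] * len(features))
            sums[label] = [s + v for s, v in zip(prev, vec)]
            counts[label] = counts.get(label, 0) + 1
        query = [X[i][f] for f in features]

        def sq_dist(label: int) -> float:
            # Squared Euclidean distance from the query to the class centroid.
            return sum(
                (q - s / counts[label]) ** 2 for q, s in zip(query, sums[label])
            )

        predicted = min(sorted(sums), key=sq_dist)
        correct += int(predicted == y[i])
    return correct / len(X)


# Hypothetical toy data: column 0 separates the two classes, columns 1 and 2 do not.
X_TOY = [
    [0.1, 5.0, 0.2],
    [0.2, 1.0, 0.1],
    [0.0, 3.0, 0.9],
    [0.9, 4.0, 0.8],
    [1.0, 2.0, 0.9],
    [0.8, 0.5, 0.3],
]
Y_TOY = [0, 0, 0, 1, 1, 1]

toy_selected, toy_score = greedy_forward_selection(
    3, lambda feats: loo_centroid_accuracy(X_TOY, Y_TOY, feats)
)
print(toy_selected, toy_score)
```

In a multi-classifier comparison of the kind the abstract mentions, the scoring function could be swapped for the cross-validated accuracy of each classifier in turn, which is why the sketch takes the scorer as a parameter.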
Copyright information
© 2016 Springer India
About this paper
Cite this paper
Obaidullah, S.M., Halder, C., Das, N., Roy, K. (2016). Handwritten Indic Script Identification from Document Images—A Statistical Comparison of Different Attribute Selection Techniques in Multi-classifier Environment. In: Satapathy, S., Raju, K., Mandal, J., Bhateja, V. (eds) Proceedings of the Second International Conference on Computer and Communication Technologies. Advances in Intelligent Systems and Computing, vol 381. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2526-3_51
DOI: https://doi.org/10.1007/978-81-322-2526-3_51
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2525-6
Online ISBN: 978-81-322-2526-3
eBook Packages: Engineering, Engineering (R0)