Abstract
A new large Urdu handwriting database was designed at the Centre for Pattern Recognition and Machine Intelligence (CENPARMI). It includes isolated digits, numeral strings with and without decimal points, five special symbols, 44 isolated characters, 57 Urdu words (mostly finance-related), and Urdu dates in different patterns. It is the first database for Urdu off-line handwriting recognition, and it involves a large number of native Urdu speakers from different regions of the world. The database is provided in three formats: true color, gray level, and binary. Experiments on Urdu digit recognition have been conducted, achieving an accuracy of 98.61%. Methodologies for image pre-processing, gradient feature extraction, and classification using SVM are described, and a detailed error analysis of the recognition results is presented.
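The abstract names gradient feature extraction but does not give its details, so the following is only a minimal sketch of one common form of that idea, not the authors' implementation. The Sobel operator, the eight direction bins, the 2x2 zoning, the sum-to-one normalization, and the function name `gradient_direction_features` are all assumptions made for illustration. The pre-processing and SVM stages are not shown.

```python
import math


def gradient_direction_features(image, zones=2, bins=8):
    """Return a zones*zones*bins vector of gradient-direction histograms.

    image: list of equal-length rows of gray levels (ints or floats).
    All parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    height, width = len(image), len(image[0])
    features = [0.0] * (zones * zones * bins)

    # Skip the one-pixel border so the 3x3 Sobel window stays inside the image.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal and vertical Sobel responses.
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region: no direction to record

            # Quantize the gradient angle into one of `bins` directions.
            angle = math.atan2(gy, gx) % (2 * math.pi)
            direction = int(angle / (2 * math.pi) * bins) % bins

            # Accumulate gradient magnitude in the histogram of the pixel's zone.
            zone_row = y * zones // height
            zone_col = x * zones // width
            features[(zone_row * zones + zone_col) * bins + direction] += magnitude

    # Normalize so the vector sums to 1 (left as all zeros for a flat image).
    total = sum(features)
    return [value / total for value in features] if total else features
```

With the defaults, each image yields a 32-dimensional vector. In a pipeline like the one the abstract outlines, such vectors would be computed from pre-processed digit images and passed to an SVM classifier.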
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sagheer, M.W., He, C.L., Nobile, N., Suen, C.Y. (2009). A New Large Urdu Database for Off-Line Handwriting Recognition. In: Foggia, P., Sansone, C., Vento, M. (eds) Image Analysis and Processing – ICIAP 2009. ICIAP 2009. Lecture Notes in Computer Science, vol 5716. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04146-4_58
DOI: https://doi.org/10.1007/978-3-642-04146-4_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04145-7
Online ISBN: 978-3-642-04146-4
eBook Packages: Computer Science, Computer Science (R0)