Abstract
The market for handwriting recognition applications is growing rapidly, driven by continuous advances in OCR technology. This paper summarizes our recent efforts on offline handwritten Chinese script recognition using a segmentation-driven approach. We address two essential problems: isolated character recognition and the establishment of a probabilistic segmentation model. To improve isolated character recognition accuracy, we propose a heteroscedastic linear discriminant analysis algorithm to extract more discriminative information from the original character features, and implement a minimum classification error learning scheme to optimize the classifier parameters. In the segmentation stage, information from three sources, namely geometric layout, character recognition confidence, and a semantic model, is integrated into a probabilistic framework to give the best script interpretation. Experimental results on postal address and bank check recognition demonstrate the effectiveness of the proposed algorithms: a correct recognition rate of more than 80% is achieved on 1,000 handwritten Chinese address items, and the recognition reliability of bank checks is substantially improved by combining the courtesy amount and legal amount recognition results. Some preliminary research on Arabic script recognition is also presented.
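The idea of integrating geometric layout, recognition confidence, and a semantic model into one probabilistic score can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the text line has already been over-segmented into primitive pieces, that the three sources are treated as independent so their log-probabilities simply add, and that the semantic model is a character bigram. The names `best_segmentation`, `candidates`, `bigram_logp`, and `max_span` are hypothetical.

```python
import math

def best_segmentation(n_pieces, candidates, bigram_logp, max_span=3):
    """Viterbi-style search over a segmentation lattice (illustrative only).

    n_pieces:    number of primitive pieces from over-segmentation.
    candidates:  dict mapping (start, end) -> list of
                 (label, geometric_logp, recognition_logp) for the hypothesis
                 that pieces start..end-1 form one character.
    bigram_logp: function (prev_label, label) -> log P(label | prev_label);
                 prev_label is None at the start of the line.
    max_span:    assumed maximum number of pieces merged into one character.
    """
    # best[end][label] = (score, back_pointer) for the best path that ends at
    # piece boundary `end` with `label` as its last character.
    best = [dict() for _ in range(n_pieces + 1)]
    best[0][None] = (0.0, None)

    for end in range(1, n_pieces + 1):
        for start in range(max(0, end - max_span), end):
            for label, geo, rec in candidates.get((start, end), []):
                for prev_label, (prev_score, _) in best[start].items():
                    # Sum of the three information sources in the log domain.
                    score = prev_score + geo + rec + bigram_logp(prev_label, label)
                    if label not in best[end] or score > best[end][label][0]:
                        best[end][label] = (score, (start, prev_label))

    if not best[n_pieces]:
        return None, -math.inf  # no hypothesis covers the whole line

    # Backtrack from the best final label.
    label = max(best[n_pieces], key=lambda k: best[n_pieces][k][0])
    total = best[n_pieces][label][0]
    pos, path = n_pieces, []
    while pos > 0:
        _, (start, prev_label) = best[pos][label]
        path.append((start, pos, label))
        pos, label = start, prev_label
    path.reverse()
    return path, total


# Toy lattice with made-up log-scores; labels are placeholders.
toy_candidates = {
    (0, 1): [("A", -0.1, -0.2)],
    (1, 2): [("B", -0.1, -0.3)],
    (2, 3): [("C", -0.1, -0.2)],
    (1, 3): [("D", -0.5, -0.1)],
    (0, 2): [("E", -0.4, -2.0)],
}

def toy_bigram(prev, cur):
    # Hypothetical bigram: two transitions are likely, all others unlikely.
    return -0.1 if (prev, cur) in {(None, "A"), ("A", "D")} else -1.0

path, total = best_segmentation(3, toy_candidates, toy_bigram)
print(path, total)
```

In this toy lattice the bigram term changes which segmentation wins, which is the point of bringing linguistic context into the segmentation decision rather than applying it only as post-processing.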
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ding, X., Liu, H. (2008). Segmentation-Driven Offline Handwritten Chinese and Arabic Script Recognition. In: Doermann, D., Jaeger, S. (eds) Arabic and Chinese Handwriting Recognition. SACH 2006. Lecture Notes in Computer Science, vol 4768. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78199-8_12
DOI: https://doi.org/10.1007/978-3-540-78199-8_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78198-1
Online ISBN: 978-3-540-78199-8
eBook Packages: Computer Science, Computer Science (R0)