Abstract
Two methods, Symbolic Indirect Correlation (SIC) and Style Constrained Classification (SCC), are proposed for recognizing handwritten Arabic and Chinese words and phrases. SIC reassembles variable-length segments of an unknown query that match similar segments of labeled reference words. Recognition is based on the correspondence between the order of the feature vectors and of the lexical transcript in both the query and the references. SIC implicitly incorporates language context in the form of letter n-grams. SCC is based on the notion that the style (distortion or noise) of a character is a good predictor of the distortions arising in other characters, even of a different class, from the same source. It is adaptive in the sense that, with a long-enough field, its accuracy converges to that of a style-specific classifier trained on the writer of the unknown query. Neither SIC nor SCC requires the query words to appear among the references.
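The idea behind SCC can be illustrated with a minimal sketch. It is not the chapter's implementation: the one-dimensional feature, the two classes, the two style names, the Gaussian class-conditional densities, and every numeric value below are hypothetical, and equal class priors are assumed. The sketch labels each pattern in isolation (averaging over styles) and then labels the whole field jointly under the constraint that all of its patterns share a single, unknown style.

```python
import math
from itertools import product

# Hypothetical toy model: MEANS[style][label] is the mean of a 1-D feature
# for that class when written in that style. All values are illustrative.
MEANS = {
    "upright": {"A": 0.0, "B": 2.0},
    "slanted": {"A": 1.0, "B": 3.0},
}
STYLE_PRIOR = {"upright": 0.5, "slanted": 0.5}
SIGMA = 0.5  # assumed common standard deviation


def gauss(x, mu, sigma=SIGMA):
    """Gaussian density of x for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def classify_singlet(x):
    """Baseline: label one pattern alone, marginalizing over the style."""
    def score(label):
        return sum(STYLE_PRIOR[s] * gauss(x, MEANS[s][label]) for s in MEANS)
    return max(("A", "B"), key=score)


def classify_field(xs):
    """Style-constrained: label all patterns jointly, assuming one shared style."""
    def score(labels):
        # Sum over styles of P(style) * product of per-pattern likelihoods.
        return sum(
            STYLE_PRIOR[s]
            * math.prod(gauss(x, MEANS[s][c]) for x, c in zip(xs, labels))
            for s in MEANS
        )
    # Exhaustive search over label sequences; only feasible for short fields.
    return max(product("AB", repeat=len(xs)), key=score)


field = [1.0, 3.1, 1.6]
singlet_labels = [classify_singlet(x) for x in field]
field_labels = classify_field(field)
print(singlet_labels, field_labels)
```

In this toy field the third pattern (1.6) is labeled B when classified alone, but A when the first two patterns make the "slanted" style the likely source. This is the sense in which the style of one character predicts the distortion of another.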
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lopresti, D., Nagy, G., Seth, S., Zhang, X. (2008). Multi-character Field Recognition for Arabic and Chinese Handwriting. In: Doermann, D., Jaeger, S. (eds) Arabic and Chinese Handwriting Recognition. SACH 2006. Lecture Notes in Computer Science, vol 4768. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78199-8_13
DOI: https://doi.org/10.1007/978-3-540-78199-8_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78198-1
Online ISBN: 978-3-540-78199-8
eBook Packages: Computer Science (R0)