Abstract
Three different methods for the synthetic generation of handwritten text are introduced. These methods are experimentally evaluated in the context of a cursive handwriting recognition task, using an HMM-based recognizer. In the experiments, the performance of a traditional recognizer, which is trained on data produced by human writers, is compared to a system that is trained on synthetic data only. Under the most elaborate synthetic handwriting generation model, a level of performance comparable to, or even slightly better than, the system trained on the writing of humans was observed.
© 2003 Springer-Verlag Berlin Heidelberg
Helmers, M., Bunke, H. (2003). Generation and Use of Synthetic Training Data in Cursive Handwriting Recognition. In: Perales, F.J., Campilho, A.J.C., de la Blanca, N.P., Sanfeliu, A. (eds) Pattern Recognition and Image Analysis. IbPRIA 2003. Lecture Notes in Computer Science, vol 2652. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-44871-6_39
DOI: https://doi.org/10.1007/978-3-540-44871-6_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40217-6
Online ISBN: 978-3-540-44871-6