Generation and Use of Synthetic Training Data in Cursive Handwriting Recognition

  • Conference paper
Pattern Recognition and Image Analysis (IbPRIA 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2652)

Abstract

Three different methods for the synthetic generation of handwritten text are introduced. These methods are experimentally evaluated in the context of a cursive handwriting recognition task, using an HMM-based recognizer. In the experiments, the performance of a traditional recognizer, which is trained on data produced by human writers, is compared to that of a system trained on synthetic data only. Under the most elaborate synthetic handwriting generation model, the synthetically trained system reached a level of performance comparable to, or even slightly better than, that of the system trained on human handwriting.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Helmers, M., Bunke, H. (2003). Generation and Use of Synthetic Training Data in Cursive Handwriting Recognition. In: Perales, F.J., Campilho, A.J.C., de la Blanca, N.P., Sanfeliu, A. (eds) Pattern Recognition and Image Analysis. IbPRIA 2003. Lecture Notes in Computer Science, vol 2652. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-44871-6_39

  • DOI: https://doi.org/10.1007/978-3-540-44871-6_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40217-6

  • Online ISBN: 978-3-540-44871-6

  • eBook Packages: Springer Book Archive
