Computer Assisted Transcription of Text Images

  • Alejandro Héctor Toselli
  • Enrique Vidal
  • Francisco Casacuberta

Abstract

Building on the interactive–predictive transcription framework outlined in the previous chapter, this chapter presents an interactive approach for the efficient transcription of handwritten text images, along with its more ergonomic, multimodal variants. Rather than aiming at full automation, all these approaches aim to assist the expert efficiently during the transcription process itself. To this end, an interactive scenario is defined in which an automatic handwriting recognition system and a human transcriber (the user) cooperate to produce the final transcription of the text images.

Additionally, the basic off-line and on-line handwritten text recognition (HTR) systems embedded in the CATTI (computer assisted transcription of text images) approaches are explained in some detail. The explanation focuses mainly on preprocessing, feature extraction, and specific aspects of the modeling and decoding (search) process, complementing those already introduced in Sect. 2.2.

Moreover, this chapter shows how user-interaction feedback directly improves system accuracy, while multimodality increases system ergonomics and user acceptability. Multimodal interaction is approached in such a way that the main and the feedback data streams help each other to optimize overall performance and usability. These claims are supported by experimental results on three cursive handwriting tasks, which suggest that these approaches can save considerable user effort with respect to both purely manual work and non-interactive post-editing.

Keywords

Language Model · Text Line · Viterbi Algorithm · Handwritten Text · Handwritten Document

Copyright information

© Springer-Verlag London Limited 2011
