Computer Assisted Transcription: General Framework

  • Alejandro Héctor Toselli
  • Enrique Vidal
  • Francisco Casacuberta


This chapter describes the common foundations of the computer assisted transcription approaches presented in the three subsequent chapters: Chaps. 3, 4 and 5. It also gives a general overview of the features shared by the state-of-the-art systems we have employed for handwritten text and speech recognition.

A specific mathematical formulation and modeling, suitable for the interactive transcription of handwritten text images and speech signals, are derived from a particular instantiation of the general interactive–predictive framework introduced in Sect. 1.3.3. On this basis, and by adopting the passive left-to-right interaction protocol described in Sect. 1.4.2, the two basic computer assisted handwriting and speech transcription approaches are developed (detailed in Chaps. 3 and 4, respectively), along with the evaluation measures used to assess their performance.
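As a rough illustration of what a left-to-right interaction protocol involves, the following minimal Python sketch simulates a user who reads each system hypothesis from the left, corrects the first wrong word, and lets the system re-predict the remaining suffix. It is not the chapter's formulation: the function names, the `Recognizer` type and the toy recognizer are hypothetical, and counting one correction per wrong word is an assumed, simplified effort measure.

```python
from typing import Callable, List

# Hypothetical interface: given the validated prefix, return a proposed suffix.
Recognizer = Callable[[List[str]], List[str]]


def simulate_left_to_right(reference: List[str], predict_suffix: Recognizer) -> int:
    """Return how many corrections a simulated user makes to reach `reference`."""
    prefix: List[str] = []  # words the user has validated so far
    corrections = 0
    while True:
        hypothesis = prefix + predict_suffix(prefix)
        # The user reads left to right and stops at the first wrong word.
        first_error = len(prefix)
        while (first_error < len(reference)
               and first_error < len(hypothesis)
               and hypothesis[first_error] == reference[first_error]):
            first_error += 1
        if first_error == len(reference):
            # All reference words are in place; surplus trailing words, if any,
            # cost one extra correction (assumed: a single truncation action).
            return corrections + (1 if len(hypothesis) > len(reference) else 0)
        # The user types the correct word, which extends the validated prefix.
        prefix = reference[:first_error + 1]
        corrections += 1


def replay_first_pass(first_pass: List[str]) -> Recognizer:
    """Toy recognizer (assumption): always re-proposes the tail of one fixed guess."""
    return lambda prefix: first_pass[len(prefix):]


reference = "the cat sat on the mat".split()
toy = replay_first_pass("the cat sat on a mat".split())
effort = simulate_left_to_right(reference, toy)
print(effort, effort / len(reference))  # corrections, and corrections per reference word
```

A real system would replace the toy recognizer with a search conditioned on the validated prefix, so that each correction can also improve the words that follow it.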


Keywords: Language Model, Speech Signal, Automatic Speech Recognition, Viterbi Algorithm, Word Sequence



Copyright information

© Springer-Verlag London Limited 2011
