Active Interaction and Learning in Handwritten Text Transcription

  • Alejandro Héctor Toselli
  • Enrique Vidal
  • Francisco Casacuberta


Computer-assisted systems are increasingly used in a variety of real-world tasks, though their application to handwritten text transcription in old manuscripts remains largely unexplored. The basic idea explored in this chapter is to follow a sequential, line-by-line transcription of the whole manuscript in which a continuously retrained system interacts with the user to efficiently transcribe each new line. Because user interaction is expensive in terms of time and cost, our top priority is to make the most of these interactions while reducing them as much as possible.

To this end, we study three different frameworks: (a) improving a recognition system from newly recognized transcriptions via adaptation, using semi-supervised learning techniques; (b) studying how best to adapt from limited user supervision, which is related to active learning; and (c) developing a simple error estimate that lets the user adjust the error in a computer-assisted transcription task. In addition, we test these approaches in the sequential transcription of two old text documents.
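The line-by-line interaction described above can be sketched roughly as follows. This is only an illustrative outline, not the chapter's actual method: the names `supervise_line`, `transcribe_sequentially`, `recognize`, `retrain`, and `max_supervised` are hypothetical, the per-word confidence scores are assumed to come from the recognizer, and the hypothesis is assumed to be already aligned word-by-word with the correct transcription (a simplification that a real system would not have).

```python
from typing import Callable, List, Sequence, Tuple

# A recognized word paired with an assumed confidence score in [0, 1].
ScoredWord = Tuple[str, float]


def supervise_line(
    hypothesis: Sequence[ScoredWord],
    reference: Sequence[str],
    max_supervised: int,
) -> Tuple[List[str], int]:
    """Simulate a user who checks only the least-confident words.

    Assumes `hypothesis` and `reference` have the same length and are
    aligned word by word (an illustrative simplification).
    Returns the partially supervised transcription and the number of
    corrections the user had to make.
    """
    # Rank word positions from least to most confident.
    order = sorted(range(len(hypothesis)), key=lambda i: hypothesis[i][1])
    checked = set(order[:max_supervised])

    words: List[str] = []
    corrections = 0
    for i, (word, _confidence) in enumerate(hypothesis):
        if i in checked and word != reference[i]:
            words.append(reference[i])  # user fixes a supervised error
            corrections += 1
        else:
            words.append(word)  # unsupervised words are kept as recognized
    return words, corrections


def transcribe_sequentially(
    lines: Sequence[Tuple[object, Sequence[str]]],
    recognize: Callable[[object], Sequence[ScoredWord]],
    retrain: Callable[[object, List[str]], None],
    max_supervised: int,
) -> List[List[str]]:
    """Transcribe lines in order, adapting the system after each one."""
    transcriptions: List[List[str]] = []
    for image, reference in lines:
        hypothesis = recognize(image)
        words, _ = supervise_line(hypothesis, reference, max_supervised)
        transcriptions.append(words)
        # Adapt on the partially supervised result (semi-supervised step).
        retrain(image, words)
    return transcriptions
```

In this sketch, `max_supervised` plays the role of the knob that trades user effort against residual error: raising it asks the user to check more words per line and leaves fewer errors uncorrected.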


Keywords: Text Line, Recognition Error, Word Error Rate, Handwritten Text, Word Graph



Copyright information

© Springer-Verlag London Limited 2011
