Collaborator Effort Optimisation in Multimodal Crowdsourcing for Transcribing Historical Manuscripts

  • Conference paper
Advances in Speech and Language Technologies for Iberian Languages (IberSPEECH 2016)

Abstract

Crowdsourcing is a powerful tool for massive transcription at a relatively low cost: the transcription effort is distributed among a set of collaborators, so the supervision effort required of professional transcribers may be dramatically reduced. Nevertheless, collaborators are a scarce resource, which makes optimisation very important in order to get the maximum benefit from their efforts. In this work, the optimisation of the workload on the collaborators' side is studied in a multimodal crowdsourcing platform where speech dictation of handwritten text lines is used as the transcription source. The experiments explore how this optimisation makes it possible to obtain similar results while reducing both the number of collaborators and the number of text lines that they have to read.

Notes

  1. https://www.mturk.com/mturk/

Acknowledgments

Work partially supported by projects SmartWays - RTC-2014-1466-4 (MINECO) and CoMUN-HaT - TIN2015-70924-C2-1-R (MINECO/FEDER).

Author information

Corresponding author

Correspondence to Emilio Granell.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Granell, E., Martínez-Hinarejos, CD. (2016). Collaborator Effort Optimisation in Multimodal Crowdsourcing for Transcribing Historical Manuscripts. In: Abad, A., et al. Advances in Speech and Language Technologies for Iberian Languages. IberSPEECH 2016. Lecture Notes in Computer Science, vol 10077. Springer, Cham. https://doi.org/10.1007/978-3-319-49169-1_23

  • DOI: https://doi.org/10.1007/978-3-319-49169-1_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49168-4

  • Online ISBN: 978-3-319-49169-1

  • eBook Packages: Computer Science (R0)