Collaborator Effort Optimisation in Multimodal Crowdsourcing for Transcribing Historical Manuscripts

Granell, Emilio; Martínez-Hinarejos, Carlos-D.

doi:10.1007/978-3-319-49169-1_23

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10077)

Included in the following conference series:

International Conference on Advances in Speech and Language Technologies for Iberian Languages

Abstract

Crowdsourcing is a powerful tool for massive transcription at a relatively low cost: the transcription effort is distributed among a set of collaborators, so the supervision effort required from professional transcribers may be dramatically reduced. Nevertheless, collaborators are a scarce resource, which makes optimisation very important in order to get the maximum benefit from their efforts. In this work, the optimisation of the workload on the collaborators' side is studied in a multimodal crowdsourcing platform where speech dictation of handwritten text lines is used as the transcription source. The experiments explore how this optimisation makes it possible to obtain similar results while reducing both the number of collaborators and the number of text lines that they have to read.

Notes

1. https://www.mturk.com/mturk/

Acknowledgments

Work partially supported by projects SmartWays - RTC-2014-1466-4 (MINECO) and CoMUN-HaT - TIN2015-70924-C2-1-R (MINECO/FEDER).

Author information

Authors and Affiliations

Pattern Recognition and Human Language Technology Research Center, Universitat Politècnica de València, Camino Vera s/n, 46022, Valencia, Spain
Emilio Granell & Carlos-D. Martínez-Hinarejos

Corresponding author

Correspondence to Emilio Granell.

Editor information

Editors and Affiliations

INESC-ID/IST, Universidade de Lisboa, Lisbon, Portugal
Alberto Abad
I3A/University of Zaragoza, Zaragoza, Spain
Alfonso Ortega
DETI/IEETA, University of Aveiro, Aveiro, Portugal
António Teixeira
AtlantTIC Research Center, Universidad de Vigo, Vigo, Spain
Carmen García Mateo
Universitat Politècnica de València, Valencia, Spain
Carlos D. Martínez Hinarejos
University of Coimbra, Coimbra, Portugal
Fernando Perdigão
INESC-ID/ISCTE-IUL, Lisbon, Portugal
Fernando Batista
INESC-ID/IST, Universidade de Lisboa, Lisbon, Portugal
Nuno Mamede

About this paper

Cite this paper

Granell, E., Martínez-Hinarejos, C.D. (2016). Collaborator Effort Optimisation in Multimodal Crowdsourcing for Transcribing Historical Manuscripts. In: Abad, A., et al. Advances in Speech and Language Technologies for Iberian Languages. IberSPEECH 2016. Lecture Notes in Computer Science, vol 10077. Springer, Cham. https://doi.org/10.1007/978-3-319-49169-1_23

DOI: https://doi.org/10.1007/978-3-319-49169-1_23
Published: 04 November 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49168-4
Online ISBN: 978-3-319-49169-1
eBook Packages: Computer Science, Computer Science (R0)
