
Crowdsourcing for Language Resource Development: Criticisms About Amazon Mechanical Turk Overpowering Use

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8387)

Abstract

This article is a position paper about Amazon Mechanical Turk, whose use has grown steadily in language processing over the past few years. According to the mainstream opinion expressed in articles in the field, this type of online working platform makes it possible to quickly develop all sorts of quality language resources, at a very low price, produced by people doing the work as a hobby. We demonstrate here that the situation is far from that ideal. Our goals are manifold: (1) to inform researchers so that they can make their own choices, (2) to develop alternatives with the help of funding agencies and scientific associations, (3) to propose practical and organizational solutions to improve language resource development, while limiting the risks of ethical and legal issues without compromising on price or quality, and (4) to introduce an Ethics and Big Data Charter for the documentation of language resources.


Notes

  1. Microworking refers to the fact that tasks are cut into small pieces and their execution is paid for. Crowdsourcing refers to the fact that the job is outsourced via the web and done by many people (paid or not).

  2. For instance, we learn that Indian Turkers represented 5% of the workforce in 2008, 36% in December 2009 [2] and 50% in May 2010 (http://blog.crowdflower.com/2010/05/amazon-mechanical-turk-survey/), and that they have produced over 60% of the activity on mturk [4].

  3. $1.25/hr according to [6], $1.38/hr according to [7].

  4. For instance http://mechanicalturk.typepad.com or http://turkers.proboards.com.

  5. http://turkopticon.differenceengines.com

  6. Some of the problems reported, such as the interface problems, are not specific to mturk but are generic to many crowdsourcing systems.

  7. Interestingly, it seems that mturk recently decided to no longer accept non-US Turkers, for quality and fraud reasons: http://turkrequesters.blogspot.fr/2013/01/the-reasons-why-amazon-mechanical-turk.html.

  8. http://www.anawiki.org

  9. http://www.samasource.org/haiti/

  10. Centre National de la Recherche Scientifique/French national agency for scientific research.

  11. Association pour le Traitement Automatique des Langues/French Natural Language Processing Association, http://www.atala.org.

  12. Association Française de Communication Parlée/French Spoken Communication Association, http://www.afcp-parole.org.

  13. Association de la Maîtrise et de la Valorisation des contenus/Association for mastering and empowering content, http://www.aproged.org.

  14. http://wiki.ethique-big-data.org

  15. http://wiki.ethique-big-data.org/chartes/charteethiqueenV2.pdf

  16. http://www.aclweb.org/

  17. http://www.isca-speech.org/

  18. http://www.elra.info/

References

  1. Fort, K., Adda, G., Cohen, K.B.: Amazon Mechanical Turk: gold mine or coal mine? Comput. Linguist. (Editorial) 37(2), 413–420 (2011)

  2. Ross, J., Irani, L., Silberman, M.S., Zaldivar, A., Tomlinson, B.: Who are the crowdworkers? Shifting demographics in Mechanical Turk. In: Proceedings of the 28th International Conference Extended Abstracts on Human Factors in Computing Systems, CHI EA '10. ACM, New York (2010)

  3. Ipeirotis, P.: Demographics of Mechanical Turk. CeDER Working Paper CeDER-10-01, March 2010. http://hdl.handle.net/2451/29585 (2010)

  4. Biewald, L.: Better crowdsourcing through automated methods for quality control. In: SIGIR 2010 Workshop on Crowdsourcing for Search Evaluation, January 2010 (2010)

  5. Silberman, M.S., Ross, J., Irani, L., Tomlinson, B.: Sellers' problems in human computation markets. In: Proceedings of the ACM SIGKDD Workshop on Human Computation, HCOMP '10, pp. 18–21 (2010)

  6. Ross, J., Zaldivar, A., Irani, L., Tomlinson, B.: Who are the Turkers? Worker demographics in Amazon Mechanical Turk. Social Code Report 2009-01. http://www.ics.uci.edu/jwross/pubs/SocialCode-2009-01.pdf (2009)

  7. Chilton, L.B., Horton, J.J., Miller, R.C., Azenkot, S.: Task search in a human computation market. In: Proceedings of the ACM SIGKDD Workshop on Human Computation, HCOMP '10, pp. 1–9 (2010)

  8. Adda, G., Mariani, J.: Language resources and Amazon Mechanical Turk: legal, ethical and other issues. In: LISLR 2010, "Legal Issues for Sharing Language Resources" Workshop, LREC 2010, Valletta, Malta, May 2010 (2010)

  9. Novotney, S., Callison-Burch, C.: Cheap, fast and good enough: automatic speech recognition with non-expert transcription. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, HLT '10, Los Angeles, California, USA, pp. 207–215 (2010)

  10. Callison-Burch, C., Dredze, M.: Creating speech and language data with Amazon's Mechanical Turk. In: Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, CSLDAMT '10, Los Angeles, California, USA (2010)

  11. Kaisser, M., Lowe, J.B.: Creating a research collection of question answer sentence pairs with Amazon's Mechanical Turk. In: Proceedings of the International Language Resources and Evaluation Conference (LREC), Marrakech, Morocco (2008)

  12. Xu, F., Klakow, D.: Paragraph acquisition and selection for list question using Amazon's Mechanical Turk. In: Proceedings of the International Language Resources and Evaluation Conference (LREC), Valletta, Malta, May 2010, pp. 2340–2345 (2010)

  13. Marge, M., Banerjee, S., Rudnicky, A.I.: Using the Amazon Mechanical Turk for transcription of spoken language. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas, USA, 14–19 March 2010, pp. 5270–5273 (2010)

  14. Cook, P., Stevenson, S.: Automatically identifying changes in the semantic orientation of words. In: Proceedings of the International Language Resources and Evaluation Conference (LREC), Valletta, Malta, May 2010 (2010)

  15. Bhardwaj, V., Passonneau, R., Salleb-Aouissi, A., Ide, N.: Anveshan: a tool for analysis of multiple annotators' labeling behavior. In: Proceedings of the Fourth Linguistic Annotation Workshop (LAW IV), Uppsala, Sweden (2010)

  16. Snow, R., O'Connor, B., Jurafsky, D., Ng, A.Y.: Cheap and fast - but is it good? Evaluating non-expert annotations for natural language tasks. In: Proceedings of EMNLP 2008, pp. 254–263 (2008)

  17. Gillick, D., Liu, Y.: Non-expert evaluation of summarization systems is risky. In: Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, CSLDAMT '10, Los Angeles, California, USA (2010)

  18. Tratz, S., Hovy, E.: A taxonomy, dataset, and classifier for automatic noun compound interpretation. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, Uppsala, Sweden, July 2010, pp. 678–687 (2010)

  19. Wais, P., Lingamneni, S., Cook, D., Fennell, J., Goldenberg, B., Lubarov, D., Marin, D., Simons, H.: Towards building a high-quality workforce with Mechanical Turk. In: Proceedings of Computational Social Science and the Wisdom of Crowds (NIPS), December 2010 (2010)

  20. Kochhar, S., Mazzocchi, S., Paritosh, P.: The anatomy of a large-scale human computation engine. In: Proceedings of the Human Computation Workshop at the 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2010, Washington, D.C. (2010)

  21. Goldwater, S., Griffiths, T.: A fully Bayesian approach to unsupervised part-of-speech tagging. In: Proceedings of ACL, Prague, Czech Republic (2007)

  22. Hänig, C.: Improvements in unsupervised co-occurrence based parsing. In: Proceedings of the Fourteenth Conference on Computational Natural Language Learning, CoNLL '10, Uppsala, Sweden, pp. 1–8 (2010)

  23. Abney, S.: Semisupervised Learning for Computational Linguistics, 1st edn. Chapman & Hall/CRC, New York (2007)

  24. Yarowsky, D.: Unsupervised word sense disambiguation rivaling supervised methods. In: Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, Cambridge, MA, USA, pp. 189–196 (1995)

  25. Blum, A., Mitchell, T.: Combining labeled and unlabeled data with co-training. In: COLT: Proceedings of the Workshop on Computational Learning Theory. Morgan Kaufmann Publishers (1998)

  26. Cohn, D.A., Ghahramani, Z., Jordan, M.I.: Active learning with statistical models. In: Tesauro, G., Touretzky, D., Leen, T. (eds.) Advances in Neural Information Processing Systems, vol. 7, pp. 705–712. The MIT Press, Cambridge (1995)

  27. Smith, N., Eisner, J.: Contrastive estimation: training log-linear models on unlabeled data. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), Ann Arbor, Michigan, USA, pp. 354–362 (2005)

  28. Sagot, B.: Automatic acquisition of a Slovak lexicon from a raw corpus. In: Matoušek, V., Mautner, P., Pavelka, T. (eds.) TSD 2005. LNCS (LNAI), vol. 3658, pp. 156–163. Springer, Heidelberg (2005)

  29. Watson, R., Briscoe, T., Carroll, J.: Semi-supervised training of a statistical parser from unlabeled partially-bracketed data. In: Proceedings of the 10th International Conference on Parsing Technologies, IWPT '07, Prague, Czech Republic (2007)

  30. Fort, K., Sagot, B.: Influence of pre-annotation on POS-tagged corpus development. In: Proceedings of the Fourth ACL Linguistic Annotation Workshop, Uppsala, Sweden (2010)

  31. Erk, K., Kowalski, A., Pado, S.: The SALSA annotation tool. In: Duchier, D., Kruijff, G.J.M. (eds.) Proceedings of the Workshop on Prospects and Advances in the Syntax/Semantics Interface, Nancy, France (2003)

  32. Yetisgen-Yildiz, M., Solti, I., Xia, F., Halgrim, S.R.: Preliminary experience with Amazon's Mechanical Turk for annotating medical named entities. In: Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, CSLDAMT '10, Los Angeles, California, USA, pp. 180–183 (2010)

  33. Finin, T., Murnane, W., Karandikar, A., Keller, N., Martineau, J., Dredze, M.: Annotating named entities in Twitter data with crowdsourcing. In: Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, CSLDAMT '10, Los Angeles, California, USA (2010)

  34. Lawson, N., Eustice, K., Perkowitz, M., Yetisgen-Yildiz, M.: Annotating large email datasets for named entity recognition with Mechanical Turk. In: Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, CSLDAMT '10, Los Angeles, California, USA, pp. 71–79 (2010)

  35. Nothman, J., Curran, J.R., Murphy, T.: Transforming Wikipedia into named entity training data. In: Proceedings of the Australian Language Technology Workshop (2008)

  36. Balasuriya, D., Ringland, N., Nothman, J., Murphy, T., Curran, J.R.: Named entity recognition in Wikipedia. In: People's Web '09: Proceedings of the 2009 Workshop on The People's Web Meets NLP, Suntec, Singapore, pp. 10–18 (2009)

  37. Stürenberg, M., Goecke, D., Diewald, N., Cramer, I., Mehler, A.: Web-based annotation of anaphoric relations and lexical chains. In: Proceedings of the ACL Linguistic Annotation Workshop (LAW), Prague, Czech Republic (2007)

  38. von Ahn, L.: Games with a purpose. IEEE Comput. Mag. 39, 92–94 (2006)

  39. Chamberlain, J., Poesio, M., Kruschwitz, U.: Phrase Detectives: a web-based collaborative annotation game. In: Proceedings of the International Conference on Semantic Systems (I-Semantics '08), Graz, Austria (2008)

  40. Hughes, T., Nakajima, K., Ha, L., Vasu, A., Moreno, P., LeBeau, M.: Building transcribed speech corpora quickly and cheaply for many languages. In: Proceedings of Interspeech, Makuhari, Chiba, Japan, September 2010, pp. 1914–1917 (2010)

  41. Couillault, A., Fort, K.: Charte Éthique et Big Data : parce que mon corpus le vaut bien ! [Ethics and Big Data Charter: because my corpus is worth it!] In: Linguistique, Langues et Parole : Statuts, Usages et Mésusages, Strasbourg, France, July 2013, 4 p. (2013)


Acknowledgments

This work was partly realized as part of the Quæro Programme, funded by OSEO, the French State agency for innovation, as well as part of the French ANR project EDYLEX (ANR-09-CORD-008) and of the Network of Excellence "Multilingual Europe Technology Alliance" (META-NET), co-funded by the 7th Framework Programme of the European Commission through the T4ME contract (grant agreement no. 249119).

We would like to thank the authors (http://wiki.ethique-big-data.org/index.php?title=Ethique_Big_Data:Accueil#Les_auteurs) of the Ethics and Big Data Charter for their dedicated time and effort.

Author information

Corresponding author

Correspondence to Karën Fort.


Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Fort, K., Adda, G., Sagot, B., Mariani, J., Couillault, A. (2014). Crowdsourcing for Language Resource Development: Criticisms About Amazon Mechanical Turk Overpowering Use. In: Vetulani, Z., Mariani, J. (eds) Human Language Technology Challenges for Computer Science and Linguistics. LTC 2011. Lecture Notes in Computer Science, vol. 8387. Springer, Cham. https://doi.org/10.1007/978-3-319-08958-4_25


  • DOI: https://doi.org/10.1007/978-3-319-08958-4_25


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08957-7

  • Online ISBN: 978-3-319-08958-4

