
Crowdsourcing for Language Resource Development: Criticisms About Amazon Mechanical Turk Overpowering Use

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8387)

Abstract

This article is a position paper about Amazon Mechanical Turk, whose use has grown steadily in language processing over the past few years. According to the mainstream opinion expressed in articles in the field, this type of online working platform makes it possible to quickly develop all sorts of quality language resources, at a very low price, produced by people doing the work as a hobby. We demonstrate here that the situation is far from that ideal. Our goals are manifold: (1) to inform researchers so that they can make their own choices, (2) to develop alternatives with the help of funding agencies and scientific associations, (3) to propose practical and organizational solutions to improve language resource development, while limiting the risks of ethical and legal issues without compromising on price or quality, and (4) to introduce an Ethics and Big Data Charter for the documentation of language resources.


Notes

  1. Microworking refers to the fact that tasks are cut into small pieces and their execution is paid for. Crowdsourcing refers to the fact that the job is outsourced via the web and done by many people (paid or not).

  2. For instance, we learn that Indian Turkers represented 5% of the workforce in 2008, 36% in December 2009 [2] and 50% in May 2010 (http://blog.crowdflower.com/2010/05/amazon-mechanical-turk-survey/), and that they have produced over 60% of the activity on mturk [4].

  3. $1.25/hr according to [6], $1.38/hr according to [7].

  4. For instance http://mechanicalturk.typepad.com or http://turkers.proboards.com.

  5. http://turkopticon.differenceengines.com

  6. Some of the problems reported, such as the interface problems, are not specific to mturk but are generic to many crowdsourcing systems.

  7. Interestingly, it seems that mturk recently decided to no longer accept non-US Turkers, for quality and fraud reasons: http://turkrequesters.blogspot.fr/2013/01/the-reasons-why-amazon-mechanical-turk.html.

  8. http://www.anawiki.org

  9. http://www.samasource.org/haiti/

  10. Centre National de la Recherche Scientifique/French national agency for scientific research.

  11. Association pour le Traitement Automatique des Langues/French Natural Language Processing Association, http://www.atala.org.

  12. Association Française de Communication Parlée/French Spoken Communication Association, http://www.afcp-parole.org.

  13. Association de la Maîtrise et de la Valorisation des contenus/Association for mastering and empowering content, http://www.aproged.org.

  14. http://wiki.ethique-big-data.org

  15. http://wiki.ethique-big-data.org/chartes/charteethiqueenV2.pdf

  16. http://www.aclweb.org/

  17. http://www.isca-speech.org/

  18. http://www.elra.info/

References

  1. Fort, K., Adda, G., Cohen, K.B.: Amazon Mechanical Turk: gold mine or coal mine? Comput. Linguist. (Editorial) 37(2), 413–420 (2011)

  2. Ross, J., Irani, L., Silberman, M.S., Zaldivar, A., Tomlinson, B.: Who are the crowdworkers? Shifting demographics in Mechanical Turk. In: Proceedings of the 28th International Conference Extended Abstracts on Human Factors in Computing Systems, CHI EA '10. ACM, New York (2010)

  3. Ipeirotis, P.: Demographics of Mechanical Turk. CeDER Working Paper CeDER-10-01, March 2010. http://hdl.handle.net/2451/29585 (2010)

  4. Biewald, L.: Better crowdsourcing through automated methods for quality control. In: SIGIR 2010 Workshop on Crowdsourcing for Search Evaluation, January 2010 (2010)

  5. Silberman, M.S., Ross, J., Irani, L., Tomlinson, B.: Sellers' problems in human computation markets. In: Proceedings of the ACM SIGKDD Workshop on Human Computation, HCOMP '10, pp. 18–21 (2010)

  6. Ross, J., Zaldivar, A., Irani, L., Tomlinson, B.: Who are the Turkers? Worker demographics in Amazon Mechanical Turk. Social Code Report 2009-01. http://www.ics.uci.edu/jwross/pubs/SocialCode-2009-01.pdf (2009)

  7. Chilton, L.B., Horton, J.J., Miller, R.C., Azenkot, S.: Task search in a human computation market. In: Proceedings of the ACM SIGKDD Workshop on Human Computation, HCOMP '10, pp. 1–9 (2010)

  8. Adda, G., Mariani, J.: Language resources and Amazon Mechanical Turk: legal, ethical and other issues. In: LISLR 2010, "Legal Issues for Sharing Language Resources" Workshop, LREC 2010, Valletta, Malta, May 2010 (2010)

  9. Novotney, S., Callison-Burch, C.: Cheap, fast and good enough: automatic speech recognition with non-expert transcription. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, HLT '10, Los Angeles, California, USA, pp. 207–215 (2010)

  10. Callison-Burch, C., Dredze, M.: Creating speech and language data with Amazon's Mechanical Turk. In: Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, CSLDAMT '10, Los Angeles, California, USA (2010)

  11. Kaisser, M., Lowe, J.B.: Creating a research collection of question answer sentence pairs with Amazon's Mechanical Turk. In: Proceedings of the International Language Resources and Evaluation Conference (LREC), Marrakech, Morocco (2008)

  12. Xu, F., Klakow, D.: Paragraph acquisition and selection for list question using Amazon's Mechanical Turk. In: Proceedings of the International Language Resources and Evaluation Conference (LREC), Valletta, Malta, May 2010, pp. 2340–2345 (2010)

  13. Marge, M., Banerjee, S., Rudnicky, A.I.: Using the Amazon Mechanical Turk for transcription of spoken language. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas, USA, 14–19 March 2010, pp. 5270–5273 (2010)

  14. Cook, P., Stevenson, S.: Automatically identifying changes in the semantic orientation of words. In: Proceedings of the International Language Resources and Evaluation Conference (LREC), Valletta, Malta, May 2010 (2010)

  15. Bhardwaj, V., Passonneau, R., Salleb-Aouissi, A., Ide, N.: Anveshan: a tool for analysis of multiple annotators' labeling behavior. In: Proceedings of the Fourth Linguistic Annotation Workshop (LAW IV), Uppsala, Sweden (2010)

  16. Snow, R., O'Connor, B., Jurafsky, D., Ng, A.Y.: Cheap and fast - but is it good? Evaluating non-expert annotations for natural language tasks. In: Proceedings of EMNLP 2008, pp. 254–263 (2008)

  17. Gillick, D., Liu, Y.: Non-expert evaluation of summarization systems is risky. In: Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, CSLDAMT '10, Los Angeles, California, USA (2010)

  18. Tratz, S., Hovy, E.: A taxonomy, dataset, and classifier for automatic noun compound interpretation. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, Uppsala, Sweden, July 2010, pp. 678–687 (2010)

  19. Wais, P., Lingamneni, S., Cook, D., Fennell, J., Goldenberg, B., Lubarov, D., Marin, D., Simons, H.: Towards building a high-quality workforce with Mechanical Turk. In: Proceedings of Computational Social Science and the Wisdom of Crowds (NIPS), December 2010 (2010)

  20. Kochhar, S., Mazzocchi, S., Paritosh, P.: The anatomy of a large-scale human computation engine. In: Proceedings of the Human Computation Workshop at the 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2010, Washington, D.C. (2010)

  21. Goldwater, S., Griffiths, T.: A fully Bayesian approach to unsupervised part-of-speech tagging. In: Proceedings of ACL, Prague, Czech Republic (2007)

  22. Hänig, C.: Improvements in unsupervised co-occurrence based parsing. In: Proceedings of the Fourteenth Conference on Computational Natural Language Learning, CoNLL '10, Uppsala, Sweden, pp. 1–8 (2010)

  23. Abney, S.: Semisupervised Learning for Computational Linguistics, 1st edn. Chapman & Hall/CRC, New York (2007)

  24. Yarowsky, D.: Unsupervised word sense disambiguation rivaling supervised methods. In: Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, Cambridge, MA, USA, pp. 189–196 (1995)

  25. Blum, A., Mitchell, T.: Combining labeled and unlabeled data with co-training. In: COLT: Proceedings of the Workshop on Computational Learning Theory. Morgan Kaufmann Publishers (1998)

  26. Cohn, D.A., Ghahramani, Z., Jordan, M.I.: Active learning with statistical models. In: Tesauro, G., Touretzky, D., Leen, T. (eds.) Advances in Neural Information Processing Systems, vol. 7, pp. 705–712. The MIT Press, Cambridge (1995)

  27. Smith, N., Eisner, J.: Contrastive estimation: training log-linear models on unlabeled data. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), Ann Arbor, Michigan, USA, pp. 354–362 (2005)

  28. Sagot, B.: Automatic acquisition of a Slovak lexicon from a raw corpus. In: Matoušek, V., Mautner, P., Pavelka, T. (eds.) TSD 2005. LNCS (LNAI), vol. 3658, pp. 156–163. Springer, Heidelberg (2005)

  29. Watson, R., Briscoe, T., Carroll, J.: Semi-supervised training of a statistical parser from unlabeled partially-bracketed data. In: Proceedings of the 10th International Conference on Parsing Technologies, IWPT '07, Prague, Czech Republic (2007)

  30. Fort, K., Sagot, B.: Influence of pre-annotation on POS-tagged corpus development. In: Proceedings of the Fourth ACL Linguistic Annotation Workshop, Uppsala, Sweden (2010)

  31. Erk, K., Kowalski, A., Pado, S.: The SALSA annotation tool. In: Duchier, D., Kruijff, G.J.M. (eds.) Proceedings of the Workshop on Prospects and Advances in the Syntax/Semantics Interface, Nancy, France (2003)

  32. Yetisgen-Yildiz, M., Solti, I., Xia, F., Halgrim, S.R.: Preliminary experience with Amazon's Mechanical Turk for annotating medical named entities. In: Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, CSLDAMT '10, Los Angeles, California, USA, pp. 180–183 (2010)

  33. Finin, T., Murnane, W., Karandikar, A., Keller, N., Martineau, J., Dredze, M.: Annotating named entities in Twitter data with crowdsourcing. In: Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, CSLDAMT '10, Los Angeles, California, USA (2010)

  34. Lawson, N., Eustice, K., Perkowitz, M., Yetisgen-Yildiz, M.: Annotating large email datasets for named entity recognition with Mechanical Turk. In: Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, CSLDAMT '10, Los Angeles, California, USA, pp. 71–79 (2010)

  35. Nothman, J., Curran, J.R., Murphy, T.: Transforming Wikipedia into named entity training data. In: Proceedings of the Australian Language Technology Workshop (2008)

  36. Balasuriya, D., Ringland, N., Nothman, J., Murphy, T., Curran, J.R.: Named entity recognition in Wikipedia. In: People's Web '09: Proceedings of the 2009 Workshop on The People's Web Meets NLP, Suntec, Singapore, pp. 10–18 (2009)

  37. Stürenberg, M., Goecke, D., Diewald, N., Cramer, I., Mehler, A.: Web-based annotation of anaphoric relations and lexical chains. In: Proceedings of the ACL Linguistic Annotation Workshop (LAW), Prague, Czech Republic (2007)

  38. von Ahn, L.: Games with a purpose. IEEE Comput. Mag. 39, 92–94 (2006)

  39. Chamberlain, J., Poesio, M., Kruschwitz, U.: Phrase Detectives: a web-based collaborative annotation game. In: Proceedings of the International Conference on Semantic Systems (I-Semantics '08), Graz, Austria (2008)

  40. Hughes, T., Nakajima, K., Ha, L., Vasu, A., Moreno, P., LeBeau, M.: Building transcribed speech corpora quickly and cheaply for many languages. In: Proceedings of Interspeech, Makuhari, Chiba, Japan, September 2010, pp. 1914–1917 (2010)

  41. Couillault, A., Fort, K.: Charte Éthique et Big Data : parce que mon corpus le vaut bien ! [Ethics and Big Data Charter: because my corpus is worth it!] In: Linguistique, Langues et Parole : Statuts, Usages et Mésusages, Strasbourg, France, July 2013, 4 p. (2013)


Acknowledgments

This work was partly realized as part of the Quæro Programme, funded by OSEO, the French State agency for innovation, as well as part of the French ANR project EDYLEX (ANR-09-CORD-008) and of the Network of Excellence "Multilingual Europe Technology Alliance" (META-NET), co-funded by the 7th Framework Programme of the European Commission through the T4ME contract (grant agreement no. 249119).

We would like to thank the authors (http://wiki.ethique-big-data.org/index.php?title=Ethique_Big_Data:Accueil#Les_auteurs) of the Ethics and Big Data Charter for their dedicated time and effort.

Author information

Corresponding author

Correspondence to Karën Fort.


Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Fort, K., Adda, G., Sagot, B., Mariani, J., Couillault, A. (2014). Crowdsourcing for Language Resource Development: Criticisms About Amazon Mechanical Turk Overpowering Use. In: Vetulani, Z., Mariani, J. (eds) Human Language Technology Challenges for Computer Science and Linguistics. LTC 2011. Lecture Notes in Computer Science, vol. 8387. Springer, Cham. https://doi.org/10.1007/978-3-319-08958-4_25


  • DOI: https://doi.org/10.1007/978-3-319-08958-4_25


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08957-7

  • Online ISBN: 978-3-319-08958-4

