Crowdsourcing

Abstract

Most annotated corpora in wide use in computational linguistics were created using traditional annotation methods, but such methods may not be appropriate for smaller-scale annotation and tend to be too expensive for very large-scale annotation. This chapter covers crowdsourcing, the use of web collaboration for annotation. Both microtask crowdsourcing and games-with-a-purpose are discussed, as well as their use in computational linguistics.

Notes

  1.

    The alternative term human computation is arguably more popular in other fields, but crowdsourcing is more popular in Computational Linguistics.

  2.

    A more formal and systematic definition of crowdsourcing has been provided in [17]:

    Crowdsourcing is a type of participative online activity in which an individual, an institution, a non-profit organization, or company proposes to a group of individuals of varying knowledge, heterogeneity, and number, via a flexible open call, the voluntary undertaking of a task. The undertaking of the task, of variable complexity and modularity, and in which the crowd should participate bringing their work, money, knowledge and/or experience, always entails mutual benefit. The user will receive the satisfaction of a given type of need, be it economic, social recognition, self-esteem, or the development of individual skills, while the crowdsourcer will obtain and utilize to their advantage that what the user has brought to the venture, whose form will depend on the type of activity undertaken.

  3.

    A number of alternative classification schemes for crowdsourcing have been proposed; see, e.g., [41, 50]. We return to the Wang et al. study below.

  4.

    The creation of the Oxford English Dictionary in the nineteenth century, which involved the collaboration of thousands of volunteers proposing candidate words and senses, is perhaps the best known example of the use of this approach in the pre-Web era.

  5.

    http://meta.wikimedia.org/wiki/List_of_Wikipedias.

  6.

    http://fold.it/portal.

  7.

    http://www.galaxyzoo.org/.

  8.

    http://phylo.cs.mcgill.ca/.

  9.

    http://www.openmind.org.

  10.

    http://conceptnet5.media.mit.edu.

  11.

    https://www.mturk.com/.

  12.

    The term turkers is also commonly used for workers on Amazon Mechanical Turk, but it is often perceived as having a negative connotation.

  13.

    http://crowdflower.com.

  14.

    http://samasource.org.

  15.

    von Ahn’s games used to be available from www.gwap.com, but the site is now dormant. ESP is still occasionally available at http://www.espgame.org.

  16.

    http://en.wikipedia.org/wiki/Google_Image_Labeler.

  17.

    OntoTube used to be online at http://ontogame.sti2.at/games.

  18.

    Tagatune used to be available at http://www.gwap.com/gwap/gamesPreview/tagatune or from Facebook. The site now appears to be dormant.

  19.

    Verbosity used to be accessible at http://www.gwap.com/gwap/gamesPreview/verbosity.

  20.

    http://ontogame.sti2.at/games.

  21.

    http://ai.stanford.edu/~dvickrey/wordgame/.

  22.

    http://ai.stanford.edu/~dvickrey/wordgame/.

  23.

    As of May 2015, Amazon Mechanical Turk requires payment with a US-based credit card; hence most researchers outside the USA use CrowdFlower, which does not have such a restriction.

  24.

    http://www.phrasedetectives.com.

  25.

    http://apps.facebook.com/phrasedetectives.

  26.

    https://www.modul.ac.at/about/departments/new-media-technology/projects/sentiment-quiz/.

  27.

    http://www.give-challenge.org.

  28.

    http://galoap.codeplex.com.

  29.

    http://gmb.let.rug.nl/.

  30.

    http://www.wordrobe.org.

  31.

    http://www.sensei-conversation.eu/.

References

  1. Aker, A., El-Haj, M., Albakour, M., Kruschwitz, U.: Assessing crowdsourcing quality through objective tasks. In: Proceedings of LREC (2012)

  2. Attardi, G., the Galoap Team: Phratris. Demo presented at INSEMTIVES 2010 (2010)

  3. Basile, V., Bos, J., Evang, K., Venhuizen, N.: Developing a large semantically annotated corpus. In: Proceedings of LREC, pp. 3196–3200. Istanbul, Turkey (2012)

  4. Bhardwaj, V., Passonneau, R., Salleb-Alouissi, A., Ide, N.: Anveshan: a tool for analysis of multiple annotators’ labelling behavior. In: Proceedings of the 4th LAW (2010)

  5. Biemann, C.: Creating a system for lexical substitutions from scratch using crowdsourcing. Lang. Resour. Eval. 47(1), 97–122 (2013)

  6. Burchardt, A., Erk, K., Frank, A., Kowalski, A., Pado, S., Pinkal, M.: Framenet for the semantic analysis of German: Annotation, representation and automation. In: Boas, H.C. (ed.) Multilingual FrameNets in Computational Lexicography: Methods and Applications. Mouton De Gruyter (2009)

  7. Callison-Burch, C.: Fast, cheap, and creative: evaluating translation quality using amazon’s mechanical turk. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Vol. 1, pp. 286–295. Association for Computational Linguistics (2009)

  8. Carpenter, B.: Multilevel Bayesian models of categorical data annotation (2008). Available at http://lingpipe.files.wordpress.com/2008/11/carp-bayesian-multilevel-annotation.pdf

  9. Caverlee, P.: Exploitation in human computation systems. In: Michelucci, P. (ed.) Handbook of Human Computation. Springer (2013)

  10. Chamberlain, J., Poesio, M., Kruschwitz, U.: Phrase detectives: a web-based collaborative annotation game. In: Proceedings of the International Conference on Semantic Systems (I-Semantics’08). Graz (2008)

  11. Chamberlain, J., Kruschwitz, U., Poesio, M.: Facebook phrase detectives: social networks meet games-with-a-purpose (2012). In preparation

  12. Chamberlain, J., Kruschwitz, U., Poesio, M.: Methods for engaging and evaluating users of human computation systems. In: Handbook of Human Computation. Springer (2013)

  13. Chklovski, T., Gil, Y.: Improving the design of intelligent acquisition interfaces for collecting world knowledge from web contributors. In: Proceedings of the 3rd International Conference on Knowledge Capture, pp. 35–42 (2005)

  14. Chklovski, T.: Collecting paraphrase corpora from volunteer contributors. In: Proceedings of K-CAP ’05, pp. 115–120. ACM, New York, USA (2005). http://doi.acm.org/10.1145/1088622.1088644

  15. Dawid, A.P., Skene, A.M.: Maximum likelihood estimation of observer error-rates using the EM algorithm. Appl. Stat. 28(1), 20–28 (1979)

  16. El-Haj, M., Kruschwitz, U., Fox, C.: Using mechanical turk to create a corpus of arabic summaries. In: Proceedings of LREC Workshop on Semitic Languages, pp. 36–39. Malta (2010)

  17. Estellés-Arolas, E., González-Ladrón-de Guevara, F.: Towards an integrated crowdsourcing definition. J. Inf. Sci. 38(2), 189–200 (2012)

  18. Felstiner, A.: Labor standards. In: Michelucci, P. (ed.) Handbook of Human Computation. Springer (2013)

  19. Finin, T., Murnane, W., Karandikar, A., Keller, N., Martineau, J., Dredze, M.: Annotating named entities in twitter data with crowdsourcing. In: Proceedings of CSLDAMT ’10 - NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk, pp. 80–88 (2010)

  20. Fort, K., Adda, G., Cohen, K.B.: Amazon mechanical turk: gold mine or coal mine? Comput. Linguist. 37, 413–420 (2011). Editorial

  21. Gurevych, I., Zesch, T.: Collective intelligence and language resources: introduction to the special issue on collaboratively constructed language resources. Lang. Resour. Eval. 47(1), 1–7 (2013)

  22. Havasi, C., Speer, R., Alonso, J.: Conceptnet 3: a flexible, multilingual semantic network for common sense knowledge. In: Proceedings of RANLP (2007)

  23. Hladká, B., Mírovskỳ, J., Schlesinger, P.: Play the language: play coreference. In: Proceedings of the Joint conference of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing, pp. 209–212. Association for Computational Linguistics (2009)

  24. Hovy, E., Marcus, M., Palmer, M., Ramshaw, L., Weischedel, R.: Ontonotes: the 90% solution. In: Proceedings Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 57–60 (2006)

  25. Hovy, D., Berg-Kirkpatrick, T., Vaswani, A., Hovy, E.: Learning whom to trust with MACE. In: Proceedings of NAACL, pp. 1120–1130 (2013)

  26. Howe, J.: Crowdsourcing: Why the Power of the Crowd is Driving the Future of Business. Crown Publishing Group, New York (2008)

  27. Kamp, H., Reyle, U.: From Discourse to Logic. D. Reidel, Dordrecht (1993)

  28. Koller, A., Striegnitz, K., Gargett, A., Byron, D., Cassell, J., Dale, R., Moore, J., Oberlander, J.: Report on the second nlg challenge on generating instructions in virtual environments (give-2). In: Proceedings of the 6th International Natural Language Generation Conference. Dublin (2010)

  29. Mainzer, J.E.: Labeling parts of speech using untrained annotators on mechanical turk. Master’s thesis, Ohio State University (2011)

  30. Mason, W., Watts, D.J.: Financial incentives and the “performance of crowds”. Spec. Interes. Group Knowl. Discov. Data Min. Explor. Newsl. 11, 100–108 (2010)

  31. McGraw, I., Lee, C., Hetherington, I.L., Seneff, S., Glass, J.: Collecting voices from the cloud. In: Proceedings of LREC (2010)

  32. Michelucci, P. (ed.): Handbook of Human Computation. Springer (2013)

  33. Mihalcea, R., Strapparava, C.: The lie detector: explorations in the automatic recognition of deceptive language. In: Proceedings of ACL/IJCNLP, pp. 309–312 (2009)

  34. Mohammad, S.M., Turney, P.D.: Emotions evoked by common words and phrases: using mechanical turk to create an emotion lexicon. In: Proceedings of CAAGET ’10 - the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, pp. 26–34 (2010)

  35. Munro, R., Bethard, S., Kuperman, V., Lai, V.T., Melnick, R., Potts, C., Schnoebelen, T., Tily, H.: Crowdsourcing and language studies: the new generation of linguistic data. In: Proceedings of CSLDAMT ’10 - NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk, pp. 122–130 (2010)

  36. Oleson, D., Sorokin, A., Laughlin, G., Hester, V., Le, J., Biewald, L.: Programmatic gold: Targeted and scalable quality assurance in crowdsourcing. In: Proceedings of the AAAI Workshop on Human Computation, pp. 43–48 (2011)

  37. Passonneau, R.J., Carpenter, B.: The benefits of a model of annotation. Trans. ACL 2, 311–326 (2014)

  38. Passonneau, R.J., Bhardwaj, V., Salleb-Aouissi, A., Ide, N.: Multiplicity and word sense: evaluating and learning from multiply labeled word sense annotations. Lang. Resour. Eval. 46(2), 219–252 (2012). doi:10.1007/s10579-012-9188-x

  39. Poesio, M., Artstein, R.: Anaphoric annotation in the arrau corpus. In: Proceedings of the sixth International Conference on Language Resources and Evaluation. Marrakesh (2008)

  40. Poesio, M., Chamberlain, J., Kruschwitz, U., Robaldo, L., Ducceschi, L.: Phrase detectives: utilizing collective intelligence for internet-scale language resource creation. ACM Trans. Intell. Interact. Syst. 3(1) (2013)

  41. Quinn, A.J., Bederson, B.B.: A taxonomy of distributed human computation. Technical report, University of Maryland, College Park (2009)

  42. Rafelsberger, W., Scharl, A.: Games with a purpose for social networking platforms. In: Proceedings of the 20th Association for Computing Machinery (ACM) conference on Hypertext and hypermedia, pp. 193–198. ACM (2009)

  43. Raykar, V.C., Yu, S., Zhao, L.H., Valadez, G.H., Florin, C., Bogoni, L., Moy, L.: Learning from crowds. J. Mach. Learn. Res. 11, 1297–1322 (2010)

  44. Singh, P.: The public acquisition of commonsense knowledge. In: Proceedings of the AAAI Spring Symposium on Acquiring (and Using) Linguistic (and World) Knowledge for Information Access. Palo Alto, CA (2002)

  45. Snow, R., O’Connor, B., Jurafsky, D., Ng, A.Y.: Cheap and fast—but is it good?: evaluating non-expert annotations for natural language tasks. In: EMNLP ’08: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 254–263. Association for Computational Linguistics, Morristown, NJ, USA (2008)

  46. Uebersax, J.S., Grove, W.M.: A latent trait finite mixture model for the analysis of rating agreement. Biom. 49, 832–835 (1993)

  47. Venhuizen, N., Basile, V., Evang, K., Bos, J.: Gamification for word sense labeling. In: Proceedings of the 10th IWCS, pp. 397–403. Potsdam, Germany (2013)

  48. von Ahn, L., Dabbish, L.: Labeling images with a computer game. In: Proceedings of the Conference on Human Factors in Computing Systems, pp. 319–326. ACM (2004)

  49. von Ahn, L.: Games with a purpose. Comput. 39(6), 92–94 (2006)

  50. Wang, A., Hoang, C.D.V., Kan, M.Y.: Perspectives on crowdsourcing annotation for natural language processing. Lang. Res. Eval. 47(1), 9–31 (2013)

  51. Whitehill, J., Ruvolo, P., Wu, T., Bergsma, J., Movellan, J.: Whose vote should count more: optimal integration of labels from labelers of unknown expertise. Adv. Neural Inf. Process. Syst. 22, 2035–2043 (2009)

Acknowledgements

This work was in part supported by the SENSEI project (see note 31). The development of Phrase Detectives was in part supported by EPSRC. Jon Chamberlain is currently supported by EPSRC.

Author information

Corresponding author

Correspondence to Massimo Poesio.

Copyright information

© 2017 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Poesio, M., Chamberlain, J., Kruschwitz, U. (2017). Crowdsourcing. In: Ide, N., Pustejovsky, J. (eds) Handbook of Linguistic Annotation. Springer, Dordrecht. https://doi.org/10.1007/978-94-024-0881-2_10

  • DOI: https://doi.org/10.1007/978-94-024-0881-2_10

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-024-0879-9

  • Online ISBN: 978-94-024-0881-2
