High-Throughput Crowdsourcing Mechanisms for Complex Tasks

Sautter, Guido; Böhm, Klemens

doi:10.1007/978-3-642-24704-0_27

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6984)

Included in the following conference series:

International Conference on Social Informatics

Abstract

Crowdsourcing is popular for large-scale data processing endeavors that require human input. However, working with a large community of users raises new challenges. In particular, both possible misjudgment and dishonesty threaten the quality of the results. Common countermeasures are based on redundancy, giving rise to a tradeoff between result quality and throughput. Ideally, measures should (1) maintain high throughput and (2) ensure high result quality at the same time. Existing work on crowdsourcing mostly focuses on result quality, paying little attention to throughput or even to that tradeoff. One reason is that the number of tasks (individual atomic units of work) is usually small. A further problem is that the tasks users work on are small as well. In consequence, existing result-improvement mechanisms do not scale to the number or complexity of tasks that arise, for instance, in proofreading and processing of digitized legacy literature. This paper proposes novel result-improvement mechanisms that (1) are independent of the size and complexity of tasks and (2) allow trading result quality for throughput to a significant extent. Both mathematical analyses and extensive simulations show the effectiveness of the proposed mechanisms.
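
The tradeoff between redundancy and throughput mentioned in the abstract can be illustrated with a minimal sketch. It is not the mechanism proposed in the paper; it models plain majority voting under assumptions that are not from the source: each task has a binary answer, each of k workers answers independently, and each is correct with the same probability p. The function names and the value p = 0.8 are hypothetical and chosen only for illustration.

```python
from math import comb


def majority_vote_accuracy(p: float, k: int) -> float:
    """Probability that a strict majority of k workers is correct.

    Assumes (for illustration only) binary answers and independent
    workers who are each correct with probability p. k must be odd
    so that ties cannot occur.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd to avoid ties")
    needed = k // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(k, i) * p**i * (1 - p) ** (k - i)
        for i in range(needed, k + 1)
    )


def relative_throughput(k: int) -> float:
    """Tasks completed per unit of worker effort, relative to k = 1."""
    return 1.0 / k


if __name__ == "__main__":
    # Hypothetical per-worker accuracy of 0.8.
    for k in (1, 3, 5):
        accuracy = majority_vote_accuracy(0.8, k)
        print(k, round(accuracy, 5), round(relative_throughput(k), 3))
```

Under these assumptions, going from one to five workers per task raises accuracy from 0.8 to about 0.942 while cutting throughput to one fifth, which is the kind of cost that redundancy-based countermeasures incur.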

Author information

Authors and Affiliations

KIT, Am Fasanengarten 5, 76128, Karlsruhe, Germany
Guido Sautter & Klemens Böhm

Editor information

Editors and Affiliations

School of Computer Engineering, Nanyang Technological University (NTU), Block N4, Nanyang Avenue, 639798, Singapore
Anwitaman Datta
University of Massachusetts Amherst, Thompson Hall, 200 Hicks Way, 01003, Amherst, MA, USA
Stuart Shulman
Singapore Management University, Singapore
Baihua Zheng
Graduate Institute of Networking and Multimedia, Department of Computer Science and Information Engineering, National Taiwan University, Roosevelt Rd., 10617, Taipei, Taiwan
Shou-De Lin
School of Computer Engineering, Nanyang Technological University, Block N4, Nanyang Avenue, 639798, Singapore
Aixin Sun
School of Information Systems, Singapore Management University, 80 Stamford Rd, 178902, Singapore
Ee-Peng Lim

About this paper

Cite this paper

Sautter, G., Böhm, K. (2011). High-Throughput Crowdsourcing Mechanisms for Complex Tasks. In: Datta, A., Shulman, S., Zheng, B., Lin, SD., Sun, A., Lim, EP. (eds) Social Informatics. SocInfo 2011. Lecture Notes in Computer Science, vol 6984. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24704-0_27

DOI: https://doi.org/10.1007/978-3-642-24704-0_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24703-3
Online ISBN: 978-3-642-24704-0
eBook Packages: Computer Science, Computer Science (R0)
