Abstract
Living the economic dream of globalization in the form of a location- and time-independent worldwide labor market, crowdsourcing companies today offer affordable digital solutions to business problems. At the same time, they provide highly accessible economic opportunities to workers who often live in low- and middle-income countries. Crowdsourcing can thus be understood as a flexible social solution that indiscriminately reaches out to poor, yet diligent workers: a win-win situation for employers and crowd workers. On the other hand, its virtual nature opens the door to exploitation by fraudulent workers, which in turn compromises the overall quality of the obtained results and increases the cost of continuous quality assurance, e.g., through gold questions or majority votes. The central question discussed in this paper is how to distinguish basically honest workers, who may simply lack educational skills, from plainly unethical workers. We show how current quality control measures misjudge, and subsequently discriminate against, honest workers with lower skill levels. In contrast, our techniques use statistical models that compute each worker's skill level and each task's difficulty to clearly delineate each worker's success zone and to detect irrational response patterns, which usually imply fraud. Our evaluation shows that about 50% of misjudged workers can be correctly recognized as honest, retained, and subsequently redirected to easier tasks.
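The abstract does not spell out the underlying statistical model, but its notion of worker skill, task difficulty, and a per-worker success zone suggests an item-response-theory formulation in the spirit of the Rasch model. The sketch below is a hypothetical illustration of that idea, not the authors' implementation: the function names, the outfit-style misfit statistic, and all parameter values are assumptions chosen for demonstration.

import math

def p_correct(ability, difficulty):
    # Rasch-type model: probability that a worker with the given ability
    # solves a task of the given difficulty correctly (both on a logit scale).
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def outfit_mean_square(responses, ability, difficulties):
    # Mean squared standardized residual over a worker's responses.
    # Values well above 1 indicate a response pattern inconsistent with the
    # worker's estimated skill (e.g. failing easy tasks while solving hard
    # ones), the kind of irrational pattern the abstract associates with fraud.
    total = 0.0
    for correct, difficulty in zip(responses, difficulties):
        p = p_correct(ability, difficulty)
        residual = (1.0 if correct else 0.0) - p
        total += residual ** 2 / (p * (1.0 - p))
    return total / len(responses)

# Illustrative data: task difficulties and two response patterns for a
# low-skilled worker (assumed ability = -0.5).
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
honest_low_skill = [True, True, True, False, False]   # consistent with low skill
erratic = [False, True, False, True, False]           # inconsistent, suspicious

print(outfit_mean_square(honest_low_skill, -0.5, difficulties))  # low misfit
print(outfit_mean_square(erratic, -0.5, difficulties))           # much higher misfit

Under this reading, an honest low-skilled worker fails only the tasks above their success zone and produces a low misfit score, whereas a fraudulent worker's answers are uncorrelated with task difficulty and stand out as irrational.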
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
El Maarry, K., Balke, W.-T. (2015). Retaining Rough Diamonds: Towards a Fairer Elimination of Low-Skilled Workers. In: Renz, M., Shahabi, C., Zhou, X., Cheema, M. (eds.) Database Systems for Advanced Applications. DASFAA 2015. Lecture Notes in Computer Science, vol. 9050. Springer, Cham. https://doi.org/10.1007/978-3-319-18123-3_11
DOI: https://doi.org/10.1007/978-3-319-18123-3_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18122-6
Online ISBN: 978-3-319-18123-3