Abstract
Crowdsourcing has become a popular and practical way to gather low-cost labels from human workers as training data for machine learning applications. However, the quality of crowdsourced data remains a concern, largely because contributors can be unreliable for many reasons. It has therefore become common to assign each task to several workers and then combine the gathered contributions in order to obtain high-quality results. In this work, we propose a new answer-combination approach within an evidential framework to cope with uncertainty. Indeed, we assume that answers may be partial, that is, imprecise or even incomplete. Moreover, the approach includes an important step that clusters workers with the k-means algorithm to determine their types, so that they can be effectively integrated into the answer-aggregation step. Experiments on simulated datasets show that our approach improves outcome quality.
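The pipeline sketched in the abstract (cluster workers by reliability, then combine their possibly imprecise answers under belief functions) can be illustrated with a minimal sketch. The worker accuracies, the binary frame of discernment, and the use of Shafer discounting before Dempster's rule are illustrative assumptions, not the paper's exact model; the paper's own clustering and aggregation details are not reproduced here.

```python
from itertools import product

# Frame of discernment: two candidate answers. An imprecise answer puts
# mass on the whole frame instead of a singleton.
OMEGA = frozenset({"a", "b"})

def discount(m, alpha):
    # Shafer discounting: keep a fraction alpha of each mass and
    # transfer the rest (1 - alpha) to the whole frame OMEGA.
    out = {fs: alpha * v for fs, v in m.items()}
    out[OMEGA] = out.get(OMEGA, 0.0) + (1 - alpha)
    return out

def dempster(m1, m2):
    # Dempster's rule of combination: intersect focal sets, accumulate
    # products, and renormalize by the non-conflicting mass.
    combined, conflict = {}, 0.0
    for (f1, v1), (f2, v2) in product(m1.items(), m2.items()):
        inter = f1 & f2
        if inter:
            combined[inter] = combined.get(inter, 0.0) + v1 * v2
        else:
            conflict += v1 * v2
    return {fs: v / (1.0 - conflict) for fs, v in combined.items()}

def kmeans_1d(xs, k=2, iters=20):
    # Tiny 1-D k-means over worker accuracy scores, to separate
    # reliable from unreliable workers.
    cents = [min(xs), max(xs)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            groups[min(range(k), key=lambda i: abs(x - cents[i]))].append(x)
        cents = [sum(g) / len(g) if g else cents[i]
                 for i, g in enumerate(groups)]
    return cents

# Hypothetical example: three workers with known accuracy estimates.
accuracies = [0.9, 0.85, 0.4]
centroids = sorted(kmeans_1d(accuracies))          # [unreliable, reliable]

# Worker 1 answers "a" precisely; worker 3 (unreliable) answers "b" but
# its mass is discounted before combination.
m_reliable = {frozenset({"a"}): 0.8, OMEGA: 0.2}
m_weak = discount({frozenset({"b"}): 0.9, OMEGA: 0.1}, alpha=0.3)

m_final = dempster(m_reliable, m_weak)
best = max(m_final, key=m_final.get)
```

The discounting step is what makes the clustering output actionable: workers assigned to the low-accuracy centroid contribute weakened evidence, so a reliable worker's precise answer dominates the combined mass.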
© 2019 Springer Nature Switzerland AG
Cite this paper
Abassi, L., Boukhris, I. (2019). An Evidential Imprecise Answer Aggregation Approach Based on Worker Clustering. In: Yin, H., Camacho, D., Tino, P., Tallón-Ballesteros, A., Menezes, R., Allmendinger, R. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2019. IDEAL 2019. Lecture Notes in Computer Science(), vol 11871. Springer, Cham. https://doi.org/10.1007/978-3-030-33607-3_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33606-6
Online ISBN: 978-3-030-33607-3