A worker clustering-based approach of label aggregation under the belief function theory

  • Lina Abassi
  • Imen Boukhris


Crowdsourcing platforms have attracted wide attention in the field of artificial intelligence in recent years, providing a cheap and accessible human-powered resource for gathering massive amounts of labeled data. These data are used to build supervised learning models for academic research purposes. However, despite the attractiveness of these systems, the major concern has always been the quality of the collected labels. A wide range of workers contributes to labeling the data, which yields potentially noisy and imperfect labels. In this paper, we therefore propose a new label aggregation technique that determines workers' qualities via a clustering process and then represents and combines their labels under the belief function theory to estimate the final label. This theory is well known for its strength and flexibility when dealing with imperfect information. Experimental results demonstrate that the proposed method outperforms the related work baseline and improves the quality of the results.
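The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general pipeline it names (cluster workers by quality, represent each label as a mass function, combine, decide), not the authors' actual method. Everything specific here is an assumption made for illustration: a binary label set, worker accuracy on gold questions as the clustering feature, a hand-written 1-D k-means with k = 2, the cluster centroid used as a worker's reliability, Dempster's rule as the combination operator, and the pignistic probability as the decision criterion. All function names are hypothetical.

```python
from itertools import product

OMEGA = frozenset({0, 1})  # assumed frame of discernment: a binary labeling task


def two_means_1d(values, iters=20):
    """Tiny 1-D k-means with k=2; returns (centroids, cluster index per value)."""
    centroids = [min(values), max(values)]
    assign = [0] * len(values)
    for _ in range(iters):
        assign = [0 if abs(v - centroids[0]) <= abs(v - centroids[1]) else 1
                  for v in values]
        for c in (0, 1):
            members = [v for v, a in zip(values, assign) if a == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, assign


def simple_support(label, reliability):
    """Mass function: `reliability` on {label}, the rest on OMEGA (ignorance)."""
    return {frozenset({label}): reliability, OMEGA: 1.0 - reliability}


def dempster(m1, m2):
    """Dempster's rule: conjunctive combination followed by normalization."""
    joint, conflict = {}, 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            joint[inter] = joint.get(inter, 0.0) + x * y
        else:
            conflict += x * y  # mass that would fall on the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {s: v / (1.0 - conflict) for s, v in joint.items()}


def pignistic_decision(m):
    """Return the label with the highest pignistic probability."""
    betp = {w: 0.0 for w in OMEGA}
    for s, v in m.items():
        for w in s:
            betp[w] += v / len(s)
    return max(betp, key=betp.get)


def aggregate(gold_accuracy, labels):
    """Aggregate one task's labels; gold_accuracy[i] belongs to worker i."""
    centroids, assign = two_means_1d(list(gold_accuracy))
    combined = {OMEGA: 1.0}  # vacuous mass function: total ignorance
    for cluster, label in zip(assign, labels):
        # Assumption: a worker is as reliable as its cluster centroid,
        # capped below 1 so that disagreeing workers never conflict totally.
        reliability = min(centroids[cluster], 0.99)
        combined = dempster(combined, simple_support(label, reliability))
    return pignistic_decision(combined)
```

With made-up inputs such as `aggregate([0.9, 0.95, 0.4, 0.45, 0.5], [1, 1, 0, 0, 0])`, the two workers in the high-accuracy cluster outweigh the three in the low-accuracy cluster, so the sketch returns label 1 where a plain majority vote would return 0.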


Crowdsourcing, Belief function theory, Label aggregation, Worker clustering, Gold data



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. LARODEC, Institut Supérieur de Gestion de Tunis, Université de Tunis, Tunis, Tunisia
