Abstract
Crowdsourcing platforms such as Amazon’s Mechanical Turk provide a fast and effective way to collect massive datasets for tasks in domains such as image classification and information retrieval. Quality control plays an essential role in such systems. However, existing quality control algorithms are prone to getting stuck in a bad local optimum because of ill-defined datasets. To overcome this drawback, we propose a novel self-paced quality control model that integrates a priority-based sample-picking strategy. The proposed model ensures that evident samples contribute more during the iterations. We also demonstrate empirically that the proposed self-paced learning strategy improves common quality control methods.
This work was supported by the 863 Project of China (No. 2015AA015403) and NSFC (No. 61632019).
Notes
Data are downloaded from http://i.cs.hku.hk/~ydzheng2/crowd_survey/datasets.html.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Zhang, X., Shi, H., Li, Y., Liang, W. (2017). SPGLAD: A Self-paced Learning-Based Crowdsourcing Classification Model. In: Kang, U., Lim, EP., Yu, J., Moon, YS. (eds) Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science, vol 10526. Springer, Cham. https://doi.org/10.1007/978-3-319-67274-8_17
DOI: https://doi.org/10.1007/978-3-319-67274-8_17
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67273-1
Online ISBN: 978-3-319-67274-8
eBook Packages: Computer Science, Computer Science (R0)