N2TM: A New Node to Trust Matrix Method for Spam Worker Defense in Crowdsourcing Environments
Abstract
Existing defenses against spam workers in crowdsourcing environments overlook the fact that a disguised spam worker can easily bypass them. To alleviate this problem, we propose a Node to Trust Matrix method (N2TM) that represents a worker node in a crowdsourcing network as an un-manipulable Worker Trust Matrix (WTM), which is then used to determine the worker's identity. In particular, we first present a crowdsourcing trust network consisting of requester nodes, worker nodes, and transaction-based edges. We then construct a WTM for each worker based on this trust network. A WTM consists of trust indicators that measure the extent to which a worker is trusted by different requesters in different sub-networks. Moreover, we show that a WTM is both un-manipulable and usable, two properties that are crucial for determining a worker's identity. Furthermore, we leverage deep learning techniques to predict a worker's identity with its WTM as input. Finally, extensive experiments demonstrate the superior performance of N2TM in identifying spam workers.
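The abstract does not give the exact definition of the trust indicators, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each transaction is a hypothetical `(requester, worker, approved)` record, that the partition of requesters into sub-networks is already given, and that each row of the matrix holds two illustrative indicators (mean approval rate and requester coverage). None of these choices is taken from the paper, and the sketch covers neither the un-manipulable property nor the deep learning step.

```python
from collections import defaultdict


def build_trust_network(transactions):
    """Aggregate (requester, worker, approved) records into weighted edges.

    Returns {(requester, worker): (approved_count, total_count)}.
    The record layout is an assumption made for this sketch.
    """
    edges = defaultdict(lambda: [0, 0])
    for requester, worker, approved in transactions:
        edges[(requester, worker)][0] += int(approved)
        edges[(requester, worker)][1] += 1
    return {pair: tuple(counts) for pair, counts in edges.items()}


def worker_trust_matrix(edges, worker, subnetworks):
    """Build an illustrative trust matrix for one worker.

    One row per sub-network; the columns are hypothetical indicators:
    [mean approval rate, fraction of the sub-network's requesters
    that transacted with the worker].
    """
    matrix = []
    for members in subnetworks:
        rates = [
            approved / total
            for (requester, w), (approved, total) in edges.items()
            if w == worker and requester in members
        ]
        mean_rate = sum(rates) / len(rates) if rates else 0.0
        coverage = len(rates) / len(members) if members else 0.0
        matrix.append([mean_rate, coverage])
    return matrix


# Toy data: three requesters with transactions, one without (r4).
transactions = [
    ("r1", "w1", True), ("r1", "w1", True), ("r2", "w1", False),
    ("r3", "w1", True), ("r3", "w2", False),
]
subnetworks = [{"r1", "r2"}, {"r3", "r4"}]  # assumed, pre-computed partition

edges = build_trust_network(transactions)
wtm_w1 = worker_trust_matrix(edges, "w1", subnetworks)
print(wtm_w1)
```

In a pipeline of the kind the abstract describes, a matrix like `wtm_w1` would be the input to a learned classifier that predicts whether the worker is a spam worker.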
Keywords
Crowdsourcing, Trust, Spam worker identification