Abstract
Entity identification and resolution has been an active research topic in computer science for the last three decades. The ever-increasing amount of data, together with data quality issues such as duplicate records, poses a great challenge to organizations seeking to perform business operations such as customer relationship management, marketing, and contact-center management efficiently and effectively. Recently, crowdsourcing techniques have been used to improve the accuracy of entity resolution (ER) by drawing on human intelligence to label data and prepare it for further processing by ER algorithms. However, labelling of data by humans is an error-prone process that affects entity resolution and, eventually, the overall performance of the crowd. Controlling the quality of the labelling task is therefore essential for crowdsourcing systems, yet this becomes more challenging when ground-truth data is unavailable. In this paper, we address this challenge by designing and developing a framework for evaluating the performance of an ER-in-house crowdsourcing system using cognition-based and statistical techniques. Our methodology is divided into two phases: before-hand evaluation and in-process evaluation. In before-hand evaluation, a cognitive approach is used to filter out workers whose cognitive style is inappropriate for the ER-labelling task. To this end, the analytic hierarchy process (AHP) is used to classify each of the four primary cognitive styles discussed in the literature as either suitable or not suitable for the labelling task under consideration. To control the quality of the crowd-workers' output, we extend the statistical approach proposed by Joglekar et al. and apply it during the second phase, i.e. in-process evaluation. To illustrate the effectiveness of our approach, we consider the domain of inbound contact centers and use the knowledge of Customer Service Representatives (CSRs) for the ER-labelling task.
In the proposed ER-in-house crowdsourcing system, CSRs are considered crowd-workers. A synthetic dataset is used to demonstrate the applicability of the proposed cognition-based and statistical CSR evaluation approaches.
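The AHP step described above can be sketched as follows. This is a minimal illustration of Saaty's method applied to ranking the four primary decision styles from Driver's model (decisive, flexible, hierarchic, integrative) on suitability for an ER-labelling task; the pairwise judgments in the comparison matrix are hypothetical values for demonstration, not the judgments used in the paper.

```python
import math

# Hypothetical AHP sketch: rank four cognitive styles on suitability
# for an ER-labelling task. The pairwise judgments below are invented
# for illustration only.
styles = ["decisive", "flexible", "hierarchic", "integrative"]

# comparison[i][j] = how much more suitable style i is judged to be
# than style j on Saaty's 1-9 scale; the matrix is reciprocal, i.e.
# comparison[j][i] = 1 / comparison[i][j].
comparison = [
    [1,     3,   1 / 2, 2],
    [1 / 3, 1,   1 / 4, 1 / 2],
    [2,     4,   1,     3],
    [1 / 2, 2,   1 / 3, 1],
]

# Approximate the principal eigenvector with the geometric-mean method,
# a standard closed-form approximation for AHP priority weights.
geo_means = [math.prod(row) ** (1 / len(row)) for row in comparison]
total = sum(geo_means)
weights = [g / total for g in geo_means]

# Styles ranked from most to least suitable under these judgments.
ranked = sorted(zip(styles, weights), key=lambda pair: -pair[1])
for style, w in ranked:
    print(f"{style:12s} {w:.3f}")
```

Styles whose priority weight falls below a chosen threshold would then be classified as not suitable, filtering out the corresponding workers in the before-hand evaluation phase.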
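The in-process phase builds on the idea, used by Joglekar et al., that worker accuracies can be estimated without ground truth from inter-worker agreement. The sketch below shows the core identity for three independent binary labellers: with accuracy p_i and q_i = 2p_i - 1, the pairwise agreement rate is a_ij = (1 + q_i q_j) / 2, so triangulating the three pairs recovers each p_i. The simulated worker accuracies are illustrative, not the paper's data or its full confidence-interval machinery.

```python
import math
import random

# Simulate three binary labellers with known (hidden) accuracies;
# these values are illustrative, not taken from the paper.
random.seed(42)
true_acc = [0.9, 0.8, 0.7]
n_items = 20000
truth = [random.random() < 0.5 for _ in range(n_items)]
labels = [
    [t if random.random() < acc else (not t) for t in truth]
    for acc in true_acc
]

def agreement(a, b):
    """Fraction of items on which two labellers agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

a01 = agreement(labels[0], labels[1])
a02 = agreement(labels[0], labels[2])
a12 = agreement(labels[1], labels[2])

def estimate(ab, ac, bc):
    """Accuracy of the worker shared by pairs ab and ac.

    With q_i = 2*p_i - 1, each agreement gives q_i*q_j = 2*a_ij - 1,
    so q_a = sqrt((q_a q_b)(q_a q_c) / (q_b q_c)).
    """
    q = math.sqrt((2 * ab - 1) * (2 * ac - 1) / (2 * bc - 1))
    return (q + 1) / 2  # map back to accuracy p = (q + 1) / 2

est = [
    estimate(a01, a02, a12),  # worker 0
    estimate(a01, a12, a02),  # worker 1
    estimate(a02, a12, a01),  # worker 2
]
```

On a run of this size the estimates land close to the hidden accuracies; Joglekar et al. additionally attach confidence intervals to such estimates, which is the part the paper extends for in-process CSR evaluation.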
References
Hart, M., Mwendia, K., Singh, I.: Managing knowledge about customers in inbound contact centres. In: Proceedings of the European Conference on Knowledge Management. ECKM 2009
Reichheld, F.F.: Loyalty rules!: How today’s leaders build lasting relationships. Harvard Business Press (2001)
Millard, N.: Learning from the ‘wow’ factor—how to engage customers through the design of effective affective customer experiences. BT Technology Journal 24(1), 11–16 (2006)
LaValle, S., Lesser, E., Shockley, R., Hopkins, M., Kruschwitz, N.: Big Data, Analytics and the Path From Insights to Value. MIT Sloan Management Review 52(2), 21–32 (2011)
Kim, W., Choi, B.-J., Hong, E.-K., Kim, S.-K., Lee, D.: A taxonomy of dirty data. Data mining and knowledge discovery 7(1), 81–99 (2003)
Turing, A.M.: Computing machinery and intelligence. Mind, 433–460 (1950)
Davidson, S.B., Khanna, S., Milo, T., Roy, S.: Using the crowd for top-k and group-by queries, pp. 225–236. ACM (2013)
Wang, F.-Y., Carley, K.M., Zeng, D., Mao, W.: Social computing: From social informatics to social intelligence. Intelligent Systems, IEEE 22(2), 79–83 (2007)
Szuba, T.M.: Computational collective intelligence. John Wiley & Sons, Inc. (2001)
Sarma, A.D., Parameswaran, A., Garcia-Molina, H., Halevy, A.: Finding with the crowd (2012)
Brabham, D.C.: Crowdsourcing as a model for problem solving an introduction and cases. Convergence: the international journal of research into new media technologies 14(1), 75–90 (2008)
Yi, J., Jin, R., Jain, A.K., Jain, S.: Crowdclustering with sparse pairwise labels: a matrix completion approach, pp. 1–7 (2012)
Bigham, J.P., Jayant, C., Ji, H., Little, G., Miller, A., Miller, R.C., Miller, R., Tatarowicz, A., White, B., White, S.: VizWiz: nearly real-time answers to visual questions, pp. 333–342. ACM (2010)
Pak, A., Paroubek, P.: Twitter as a corpus for sentiment analysis and opinion mining, pp. 1320–1326 (2010)
Snow, R., O’Connor, B., Jurafsky, D., Ng, A.Y.: Cheap and fast—but is it good?: evaluating non-expert annotations for natural language tasks, pp. 254–263. Association for Computational Linguistics (2008)
Kittur, A., Chi, E.H., Suh, B.: Crowdsourcing user studies with Mechanical Turk, pp. 453–456. ACM (2008)
Mason, W., Suri, S.: Conducting behavioral research on Amazon’s Mechanical Turk. Behavior research methods 44(1), 1–23 (2012)
Schmidt, L.: Crowdsourcing for human subjects research. In: Proceedings of CrowdConf (2010)
Whang, S.E., Lofgren, P., Garcia-Molina, H.: Question selection for crowd entity resolution. Proceedings of the VLDB Endowment 6(6), 349–360 (2013)
Doan, A., Franklin, M.J., Kossmann, D., Kraska, T.: Crowdsourcing applications and platforms: A data management perspective. Proceedings of the VLDB Endowment 4(12), 1508–1509 (2011)
Feng, A., Franklin, M., Kossmann, D., Kraska, T., Madden, S., Ramesh, S., Wang, A., Xin, R.: Crowddb: Query processing with the vldb crowd. Proceedings of the VLDB Endowment 4(12) (2011)
Gokhale, C., Das, S., Doan, A., Naughton, J.F., Rampalli, R., Shavlik, J., Zhu, X.: Corleone: hands-off crowdsourcing for entity matching
Jiang, L., Wang, Y., Hoffart, J., Weikum, G.: Crowdsourced entity markup, pp. 1–10 (2013)
Demartini, G., Difallah, D.E., Cudré-Mauroux, P.: ZenCrowd: leveraging probabilistic reasoning and crowdsourcing techniques for large-scale entity linking, pp. 469–478. ACM (2012)
Yang, Y., Singh, P., Yao, J., Au Yeung, C.-m., Zareian, A., Wang, X., Cai, Z., Salvadores, M., Gibbins, N., Hall, W., Shadbolt, N.: Distributed human computation framework for linked data co-reference resolution. In: Antoniou, G., Grobelnik, M., Simperl, E., Parsia, B., Plexousakis, D., De Leenheer, P., Pan, J. (eds.) ESWC 2011, Part I. LNCS, vol. 6643, pp. 32–46. Springer, Heidelberg (2011)
Mozafari, B., Sarkar, P., Franklin, M.J., Jordan, M.I., Madden, S.: Active learning for crowd-sourced databases, CoRR, abs/1209.3686 (2012)
Venetis, P., Garcia-Molina, H.: Quality control for comparison microtasks, pp. 15–21. ACM (2012)
Mason, W., Watts, D.J.: Financial incentives and the performance of crowds. ACM SigKDD Explorations Newsletter 11(2), 100–108 (2010)
Feldman, M., Bernstein, A.: Cognition-based Task Routing: Towards Highly-Effective Task-Assignments in Crowdsourcing Settings (2014)
Joglekar, M., Garcia-Molina, H., Parameswaran, A.: Evaluating the crowd with confidence. In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2013)
Khattak, F.K., Salleb-Aouissi, A.: Improving crowd labeling through expert evaluation (2012)
Su, H., Zheng, K., Huang, J., Liu, T., Wang, H., Zhou, X.: A crowd-based route recommendation system-CrowdPlanner, pp. 1178–1181. IEEE (2014)
Lease, M.: On quality control and machine learning in crowdsourcing (2011)
Driver, M.J.: Decision style: past, present, and future research. In: International Perspectives on Individual Differences, pp. 41–64 (2000)
Saberi, M., Hussain, O.K., Janjua, N.K., Chang, E.: In-house crowdsourcing-based entity resolution: dealing with common names, pp. 83–88. IEEE (2014)
Saaty, T.L.: The analytic hierarchy process: planning, priority setting, resources allocation. McGraw-Hill, New York (1980)
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Saberi, M., Hussain, O.K., Janjua, N.K., Chang, E. (2015). Cognition and Statistical-Based Crowd Evaluation Framework for ER-in-House Crowdsourcing System: Inbound Contact Center. In: Sharaf, M., Cheema, M., Qi, J. (eds) Databases Theory and Applications. ADC 2015. Lecture Notes in Computer Science(), vol 9093. Springer, Cham. https://doi.org/10.1007/978-3-319-19548-3_17
Print ISBN: 978-3-319-19547-6
Online ISBN: 978-3-319-19548-3