Abstract
Crowdsourcing platforms offer a much cheaper way to have humans perform work, but the cost is still substantial when there is a lot of work to do. A major challenge in crowdsourced data management is therefore cost control, i.e., how to reduce human cost while still maintaining good result quality.
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Li, G., Wang, J., Zheng, Y., Fan, J., Franklin, M.J. (2018). Cost Control. In: Crowdsourced Data Management. Springer, Singapore. https://doi.org/10.1007/978-981-10-7847-7_4
DOI: https://doi.org/10.1007/978-981-10-7847-7_4
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7846-0
Online ISBN: 978-981-10-7847-7
eBook Packages: Computer Science, Computer Science (R0)