Cost Control

  • Guoliang Li
  • Jiannan Wang
  • Yudian Zheng
  • Ju Fan
  • Michael J. Franklin
Chapter

Abstract

Crowdsourcing platforms provide a much cheaper way to ask humans to do work, but the total cost is still high when there is a lot of work to do. A major challenge in crowdsourced data management is therefore cost control, i.e., how to reduce human cost while still maintaining good result quality.

Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • Guoliang Li (1)
  • Jiannan Wang (2)
  • Yudian Zheng (3)
  • Ju Fan (4)
  • Michael J. Franklin (5)

  1. Department of Computer Science and Technology, Tsinghua University, Beijing, China
  2. School of Computing Science, Simon Fraser University, Burnaby, Canada
  3. Twitter Inc., San Francisco, USA
  4. DEKE Lab & School of Information, Renmin University of China, Beijing, China
  5. Department of Computer Science, University of Chicago, Chicago, USA
