Implementation, Results, and Problems of Paid Crowd-Based Geospatial Data Collection

  • Volker Walter
  • Uwe Sörgel
Original Article


In this paper, we discuss the potential and the problems of paid crowd-based geospatial data collection. First, we present a web-based program for the crowd-based collection of geodata by paid crowdworkers, implemented on the commercial platform microWorkers. We discuss our approach and show with data samples that it is, in principle, possible to produce high-quality geospatial data sets with paid crowdsourcing. However, geodata collected by the crowd can be of limited and inhomogeneous quality. Even when experts collect geodata, the result may contain incorrect objects, as we demonstrate with examples. A possible approach to handle this problem is to collect the data not just once but multiple times and to integrate the multiple representations into one common data set. We analyze how the quality measures of such multiple representations are statistically distributed. Finally, we discuss how individual results as well as multiply collected data can be integrated into one common data set.
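The multiple-representation idea can be sketched in a few lines. The following is a minimal illustration only, not the authors' implementation: the polygon coordinates are invented, the quality measure is a simple discrete Hausdorff distance between vertex sets, and the "integration" step is reduced to picking the medoid version (the one most consistent with the others) rather than a true geometric merge.

```python
import statistics

def hausdorff(a, b):
    """Symmetric discrete Hausdorff distance between two vertex lists."""
    def directed(p, q):
        return max(min(((px - qx) ** 2 + (py - qy) ** 2) ** 0.5
                       for qx, qy in q)
                   for px, py in p)
    return max(directed(a, b), directed(b, a))

# Three hypothetical crowd digitizations of the same building footprint.
versions = [
    [(0.0, 0.0), (10.0, 0.0), (10.0, 5.0), (0.0, 5.0)],
    [(0.2, -0.1), (10.1, 0.0), (10.0, 5.2), (-0.1, 5.0)],
    [(0.0, 0.1), (9.8, -0.2), (10.2, 5.0), (0.1, 4.9)],
]

# Quality score per version: mean Hausdorff distance to the other versions.
# The distribution of such scores is what a statistical analysis would study.
scores = [statistics.mean(hausdorff(v, w) for w in versions if w is not v)
          for v in versions]

# Keep the medoid version as the integrated result; a real conflation
# workflow would merge or average the geometries instead.
best = versions[scores.index(min(scores))]
```

In practice one would use a geometry library for the distance computation and a genuine conflation step for the merge; the sketch only shows how redundant collection yields per-version quality measures that can be compared and aggregated.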


Keywords: Crowdsourcing · Geospatial data collection · Data quality


(Translated from German.) In this article, we discuss the potential and the problems of paid crowd-based collection of geodata. First, we present a web-based program for the collection of geodata by paid crowdworkers, which has been implemented on the commercial platform microWorkers. Using data examples, we demonstrate that the crowd can, in principle, collect high-quality geodata. It turns out, however, that the quality of the geodata obtained this way can be very inhomogeneous. Even when geodata are collected by experts, erroneous results can occur, as we show with examples. One possible solution to this problem is to have the data collected multiple times and then integrate them into one common data set. We investigate how the quality measures of such multiple representations are statistically distributed. Finally, we discuss how the individual partial results as well as multiply collected data can be integrated into one common data set.



Copyright information

© Deutsche Gesellschaft für Photogrammetrie, Fernerkundung und Geoinformation (DGPF) e.V. 2018

Authors and Affiliations

  1. Institute for Photogrammetry, University of Stuttgart, Stuttgart, Germany
