
Implications of Data Density and Length of Collection Period for Population Estimations Using Social Media Data

  • Conference paper
Geographical Information Systems Theory, Applications and Management (GISTAM 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 936)


Abstract

Programmatic use of the public APIs offered by social media services can yield a large volume of volunteered geographic information. Geospatially enabled data from Twitter, Instagram, Panoramio, and similar services can be used to create high-resolution estimates of human movement over time, and the volume of that data is critically important. This investigation extends previous work on the effects of artificial data removal and the error it generates, using more than twice as much collected data, gathered with an enterprise cloud solution over a span of thirteen months instead of five.
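The idea of artificial data removal can be illustrated with a minimal sketch. It is not the paper's implementation: the input format (one hour-of-day integer per post), the peak normalization, the use of root mean square error as the error measure, and the helper names `hourly_curve`, `rmse`, and `removal_error` are all assumptions made for illustration, and the data is synthetic.

```python
import math
import random
from collections import Counter


def hourly_curve(posts):
    """Count posts per hour of day (0-23), scaled so the busiest hour is 1.0.

    `posts` is a list of hour-of-day integers (an assumed, simplified input).
    """
    counts = Counter(posts)
    peak = max(counts.values(), default=0)
    return [counts.get(h, 0) / peak if peak else 0.0 for h in range(24)]


def rmse(a, b):
    """Root mean square error between two equal-length curves."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def removal_error(posts, keep_fraction, seed=0):
    """Randomly keep a fraction of the posts and measure how far the
    reduced curve drifts from the curve built on all posts."""
    rng = random.Random(seed)
    kept = rng.sample(posts, int(len(posts) * keep_fraction))
    return rmse(hourly_curve(posts), hourly_curve(kept))


if __name__ == "__main__":
    # Synthetic posts clustered around mid-afternoon (hypothetical data).
    rng = random.Random(42)
    posts = [min(23, max(0, round(rng.gauss(14, 4)))) for _ in range(5000)]
    for fraction in (0.9, 0.5, 0.1, 0.01):
        print(fraction, round(removal_error(posts, fraction), 4))
```

Running the loop with progressively smaller kept fractions gives a rough feel for how the estimated activity curve changes as data is removed; a real analysis would use collected posts aggregated per spatial area rather than a single synthetic list.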



Author information


Correspondence to Samuel Lee Toepke.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Toepke, S.L. (2019). Implications of Data Density and Length of Collection Period for Population Estimations Using Social Media Data. In: Ragia, L., Laurini, R., Rocha, J. (eds) Geographical Information Systems Theory, Applications and Management. GISTAM 2017. Communications in Computer and Information Science, vol 936. Springer, Cham. https://doi.org/10.1007/978-3-030-06010-7_4


  • DOI: https://doi.org/10.1007/978-3-030-06010-7_4


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-06009-1

  • Online ISBN: 978-3-030-06010-7

  • eBook Packages: Computer Science, Computer Science (R0)
