Moon Landing or Safari? A Study of Systematic Errors and Their Causes in Geographic Linked Data

  • Conference paper
Geographic Information Science (GIScience 2016)

Abstract

While the adoption of Linked Data technologies has grown dramatically over the past few years, it has not come without its own set of growing challenges. The triplification of domain data into Linked Data has not only given places and positioning information a leading role in the dense interlinkage of data about actors, objects, and events, but has also led to massive errors in the generation, transformation, and semantic annotation of data. In a global and densely interlinked graph of data, even seemingly minor errors can have far-reaching consequences, as different datasets make statements about the same resources. In this work we present the first comprehensive study of systematic errors and their potential causes. We also discuss lessons learned and means to avoid some of the introduced pitfalls in the future.

Notes

  1. http://lodlaundromat.org/.

  2. http://www.taxonconcept.org/.

  3. http://data.nytimes.com/.

  4. http://www.rkbexplorer.com/.

  5. http://stats.lod2.eu/.

  6. A high-resolution version that gives a better impression of the coverage as well as various errors is available at http://stko.geog.ucsb.edu/pictures/lstd_map.png.

  7. SPARQL: ASK WHERE { <http://dbpedia.org/resource/Earth> <http://dbpedia.org/property/flattening> 1. } [using DBpedia 2015-04].

  8. E.g., via wget http://sws.geonames.org/6252001/nearby.rdf.

  9. http://dbpedia.org/resource/HMS_Victory.

  10. The way in which DBpedia uses cardinal directions can be easily misunderstood. The triple states that the entity south of Ventura is the city of Oxnard.

  11. http://data.nytimes.com/N2261955445337191084.

  12. The views presented in this paper belong to the authors and do not necessarily represent the views or positions of the entire working group. A current draft of the best practice report is available at: https://www.w3.org/TR/sdw-bp/.
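
The ASK query in note 7 tests whether a single triple is asserted in a dataset. As a minimal sketch of how such a check could be prepared programmatically, the following Python snippet builds the query string and encodes it as a GET request URL in the style of the SPARQL protocol. The endpoint address and the helper names `build_ask_query` and `build_request_url` are assumptions made for illustration and do not come from the paper; no request is actually sent, and a live endpoint may answer differently from the DBpedia 2015-04 release named in the note.

```python
from urllib.parse import urlencode

# Assumption: the public DBpedia SPARQL endpoint; the note names no endpoint.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_ask_query(subject: str, predicate: str, literal: str) -> str:
    """Return a SPARQL ASK query that tests whether one triple is asserted."""
    return f"ASK WHERE {{ <{subject}> <{predicate}> {literal} . }}"


def build_request_url(query: str, endpoint: str = DBPEDIA_ENDPOINT) -> str:
    """Encode the query as a GET request URL; nothing is sent over the network."""
    return f"{endpoint}?{urlencode({'query': query})}"


# The triple from note 7: is the flattening of Earth stated to be 1?
query = build_ask_query(
    "http://dbpedia.org/resource/Earth",
    "http://dbpedia.org/property/flattening",
    "1",
)
url = build_request_url(query)
print(query)
print(url)
```

Keeping query construction separate from transport means the same ASK pattern could be reused for other suspicious triples, or pointed at a local copy of a specific dataset release, by changing only the arguments.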

Acknowledgements

The authors would like to acknowledge partial support by the National Science Foundation (NSF) under award 1440202 EarthCube Building Blocks: Collaborative Proposal: GeoLink Leveraging Semantics and Linked Data for Data Sharing and Discovery in the Geosciences, NSF award 1540849 EarthCube IA: Collaborative Proposal: Cross-Domain Observational Metadata Environmental Sensing Network (X-DOMES), and the USGS award on Linked Data for the National Map.

Author information

Corresponding author

Correspondence to Krzysztof Janowicz.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Janowicz, K. et al. (2016). Moon Landing or Safari? A Study of Systematic Errors and Their Causes in Geographic Linked Data. In: Miller, J., O'Sullivan, D., Wiegand, N. (eds) Geographic Information Science. GIScience 2016. Lecture Notes in Computer Science, vol 9927. Springer, Cham. https://doi.org/10.1007/978-3-319-45738-3_18

  • DOI: https://doi.org/10.1007/978-3-319-45738-3_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45737-6

  • Online ISBN: 978-3-319-45738-3

  • eBook Packages: Computer Science, Computer Science (R0)
