Abstract
While the adoption of Linked Data technologies has grown dramatically over the past few years, it has not come without its own set of growing challenges. The triplification of domain data into Linked Data has not only given rise to a leading role of places and positioning information for the dense interlinkage of data about actors, objects, and events, but also led to massive errors in the generation, transformation, and semantic annotation of data. In a global and densely interlinked graph of data, even seemingly minor errors can have far-reaching consequences, as different datasets make statements about the same resources. In this work, we present the first comprehensive study of systematic errors and their potential causes. We also discuss lessons learned and means to avoid some of the introduced pitfalls in the future.
Notes
- 1.
- 2.
- 3.
- 4.
- 5.
- 6. A high resolution version that gives a better impression of the coverage as well as various errors is available at http://stko.geog.ucsb.edu/pictures/lstd_map.png.
- 7. SPARQL: ASK WHERE { <http://dbpedia.org/resource/Earth> <http://dbpedia.org/property/flattening> 1 . } [using DBpedia 2015-04].
- 8. E.g., via wget http://sws.geonames.org/6252001/nearby.rdf.
- 9.
- 10. The way in which DBpedia uses cardinal directions can be easily misunderstood. The triple states that the entity south of Ventura is the city of Oxnard.
- 11.
- 12. The views presented in this paper belong to the authors and do not necessarily represent the views or positions of the entire working group. A current draft of the best practice report is available at: https://www.w3.org/TR/sdw-bp/.
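The ASK query in note 7 tests whether a single triple is present in a dataset. A minimal sketch of how such a check could be issued programmatically is given here, using only the Python standard library. The endpoint URL and all function names are assumptions made for illustration and are not taken from the paper; the note refers to the DBpedia 2015-04 release, so a live endpoint may return a different answer.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the public DBpedia SPARQL endpoint. The note used DBpedia 2015-04,
# so the data served here today may differ from what the authors queried.
ENDPOINT = "https://dbpedia.org/sparql"


def build_ask_query(subject: str, predicate: str, literal: str) -> str:
    """Return a SPARQL ASK query that tests whether one triple exists."""
    return f"ASK WHERE {{ <{subject}> <{predicate}> {literal} . }}"


def build_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def parse_ask_result(body: str) -> bool:
    """Read the truth value from a SPARQL JSON result document for ASK."""
    return bool(json.loads(body)["boolean"])


def ask(endpoint: str, query: str) -> bool:
    """Send the query over the network and return the ASK result."""
    with urlopen(build_request(endpoint, query), timeout=30) as response:
        return parse_ask_result(response.read().decode("utf-8"))


# The triple from note 7: Earth, dbp:flattening, the integer 1.
QUERY = build_ask_query(
    "http://dbpedia.org/resource/Earth",
    "http://dbpedia.org/property/flattening",
    "1",
)
```

Calling `ask(ENDPOINT, QUERY)` requires network access and returns a Boolean. Keeping query construction and result parsing separate from the network call makes it possible to run the same check against an archived dump or a local triple store.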
Acknowledgements
The authors would like to acknowledge partial support by the National Science Foundation (NSF) under award 1440202 EarthCube Building Blocks: Collaborative Proposal: GeoLink Leveraging Semantics and Linked Data for Data Sharing and Discovery in the Geosciences, NSF award 1540849 EarthCube IA: Collaborative Proposal: Cross-Domain Observational Metadata Environmental Sensing Network (X-DOMES), and the USGS award on Linked Data for the National Map.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Janowicz, K. et al. (2016). Moon Landing or Safari? A Study of Systematic Errors and Their Causes in Geographic Linked Data. In: Miller, J., O'Sullivan, D., Wiegand, N. (eds) Geographic Information Science. GIScience 2016. Lecture Notes in Computer Science, vol 9927. Springer, Cham. https://doi.org/10.1007/978-3-319-45738-3_18
DOI: https://doi.org/10.1007/978-3-319-45738-3_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45737-6
Online ISBN: 978-3-319-45738-3
eBook Packages: Computer Science, Computer Science (R0)