KEYSTONE WG1: Activities and Results Overview on Representation of Structured Data Sources

  • Conference paper
Semantic Keyword-Based Search on Structured Data Sources (IKC 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10546)

Abstract

The main goal of research in the Keystone COST Action IC1302 is to manage large amounts of heterogeneous data, particularly structured data, so that users (people or software agents) receive the data they require effectively and at minimum cost. The processes that manage and organize data for efficient delivery to users also generate new data, which can be collected and exploited: data about the processes involved can serve as feedback to improve them.

Keystone is organized into four working groups: Representation of Structured Data Sources (WG1), Keyword-based Search (WG2), User Interaction and Keyword Query Interpretation (WG3), and Research Integration, Showcases, Benchmarks and Evaluations (WG4). This chapter covers the research of WG1, which addresses the profiling, assessment, representation and discovery of structured datasets. The results of WG1 influence WG2 and WG3, whereas WG4 focuses on integrating the results of all working groups and on how to exploit them.

Corresponding author

Correspondence to Raquel Trillo-Lado.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Trillo-Lado, R., Dietze, S. (2018). KEYSTONE WG1: Activities and Results Overview on Representation of Structured Data Sources. In: Szymański, J., Velegrakis, Y. (eds) Semantic Keyword-Based Search on Structured Data Sources. IKC 2017. Lecture Notes in Computer Science, vol 10546. Springer, Cham. https://doi.org/10.1007/978-3-319-74497-1_20

  • DOI: https://doi.org/10.1007/978-3-319-74497-1_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-74496-4

  • Online ISBN: 978-3-319-74497-1

  • eBook Packages: Computer Science, Computer Science (R0)
