Abstract
Data integration has been a persistent concern in Linked Open Data (LOD) research. The data integration problem (DIP) depends on many factors; primarily, the nature and type of the datasets guide the integration process. Demand for open and improved data visualization is increasing every day. Organizations, researchers, and data scientists all need better data integration techniques that can be used for analytics and prediction. The scientific community has been able to construct meaningful solutions by exploiting metadata, which is powerful when properly guided. Several existing methodologies improve system semantics using metadata. However, data integration between heterogeneous resources, for example structured and unstructured data, is still far from reality. Metadata can improve, and effectively increase, semantic search performance if it is properly reconciled with the available information or standard data. In this paper, we present a metadata reconciliation strategy for improving data integration and data classification between data sources that meet a certain standard of similarity. Data similarity can be deployed as a powerful tool for linked data operations, and reconciliation strategies can effectively improve data publishing and connection over the LOD. We also briefly define a reconciliation procedure that can semi-automate the interlinking and validation process for publishing linked data as an integrated resource.
Acknowledgments
This research has been funded by the European Commission through the Erasmus Mundus Joint Doctorate Information Technologies for Business Intelligence-Doctoral College (IT4BI-DC).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Khalid, H., Zimanyi, E., Wrembel, R. (2018). Metadata Reconciliation for Improved Data Binding and Integration. In: Kozielski, S., Mrozek, D., Kasprowski, P., Małysiak-Mrozek, B., Kostrzewa, D. (eds) Beyond Databases, Architectures and Structures. Facing the Challenges of Data Proliferation and Growing Variety. BDAS 2018. Communications in Computer and Information Science, vol 928. Springer, Cham. https://doi.org/10.1007/978-3-319-99987-6_21
DOI: https://doi.org/10.1007/978-3-319-99987-6_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99986-9
Online ISBN: 978-3-319-99987-6
eBook Packages: Computer Science, Computer Science (R0)