A multi-aspect approach to ontology matching based on Bayesian cluster ensembles

Ippolito, Andre; de Almeida Junior, Jorge Rady

doi:10.1007/s10844-019-00583-8

Published: 23 November 2019

Volume 55, pages 95–118, (2020)

Journal of Intelligent Information Systems

Abstract

With the progressive increase in the number of existing ontologies, ontology matching has become a challenging task. Ontology matching is a crucial step in the ontology integration process, and its goal is to find corresponding elements in heterogeneous ontologies. A trend of clustering-based solutions for ontology matching has evolved, based on a divide-and-conquer strategy that partitions ontologies, clusters similar partitions, and restricts the matching to ontology elements of similar partitions. Nevertheless, most of these solutions consider solely the terminological aspect, ignoring other ontology aspects that can contribute to the final matching results. In this work, we developed a novel solution for ontology matching based on a consensus clustering of multiple aspects of ontology partitions. We partitioned the ontologies by applying Community Detection techniques and applied Bayesian Cluster Ensembles (BCE) to find a consensus clustering among the terminological, topological and extensional aspects of ontology partitions. The matching results of our experimental study indicated that a BCE-based solution with three clusters best captured the contributions of the aspects, in comparison to other consensus solutions. The results corroborated the benefits of the synergy between the ontology aspects for ontology alignment. We also verified that the BCE-based solution with three clusters yielded higher matching scores than other state-of-the-art solutions. In addition, our proposed methods form a configurable framework, which allows adding other ontology aspects and other techniques.
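The divide-and-conquer idea described in the abstract can be illustrated with a small sketch. It is not the authors' implementation and it does not implement Bayesian Cluster Ensembles: as a stand-in, it uses a simple co-association (majority-vote) consensus over three base clusterings, one per aspect. The partition names, the cluster labels, and the 0.5 threshold are all hypothetical values chosen for illustration.

```python
from itertools import combinations


def co_association(labelings):
    """For each pair of items, the fraction of base clusterings that group them together."""
    n = len(labelings[0])
    scores = {}
    for i, j in combinations(range(n), 2):
        agree = sum(1 for labels in labelings if labels[i] == labels[j])
        scores[(i, j)] = agree / len(labelings)
    return scores


def consensus_clusters(labelings, threshold=0.5):
    """Merge items whose co-association exceeds the threshold (connected components)."""
    n = len(labelings[0])
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), score in co_association(labelings).items():
        if score > threshold:
            parent[find(i)] = find(j)

    groups = {}
    for item in range(n):
        groups.setdefault(find(item), []).append(item)
    return sorted(groups.values())


# Hypothetical data: three partitions from ontology A and three from ontology B,
# each clustered once per aspect (labels are made up for this example).
partitions = ["A1", "A2", "A3", "B1", "B2", "B3"]
terminological = [0, 0, 1, 0, 1, 2]
topological = [0, 1, 1, 0, 1, 2]
extensional = [0, 0, 1, 0, 2, 2]

clusters = consensus_clusters([terminological, topological, extensional])

# Restrict matching to cross-ontology partition pairs inside the same consensus cluster.
candidate_pairs = [
    (partitions[i], partitions[j])
    for group in clusters
    for i, j in combinations(group, 2)
    if partitions[i][0] != partitions[j][0]
]

print(clusters)
print(candidate_pairs)
```

In this toy setting, only partition pairs that most aspects agree on are kept as matching candidates, which is the pruning effect the clustering-based strategy aims for. A BCE-based consensus would replace `consensus_clusters` with a probabilistic model over the base labelings.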

Author information

Authors and Affiliations

Computer and Digital Systems Department, Polytechnic School of University of Sao Paulo, Avenida Professor Luciano Gualberto 158, Sao Paulo, Brazil
Andre Ippolito & Jorge Rady de Almeida Junior

Corresponding author

Correspondence to Andre Ippolito.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ippolito, A., de Almeida Junior, J.R. A multi-aspect approach to ontology matching based on Bayesian cluster ensembles. J Intell Inf Syst 55, 95–118 (2020). https://doi.org/10.1007/s10844-019-00583-8

Received: 31 October 2016
Revised: 30 September 2019
Accepted: 04 October 2019
Published: 23 November 2019
Issue Date: August 2020
DOI: https://doi.org/10.1007/s10844-019-00583-8
