A Latent Semantic Indexing-Based Approach to Determine Similar Clusters in Large-scale Schema Matching

Moawed, Seham; Algergawy, Alsayed; Sarhan, Amany; Eldosouky, Ali; Saake, Gunter

doi:10.1007/978-3-319-01863-8_29

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 241)

Abstract

Schema matching plays a central role in identifying semantic correspondences across shared-data applications, such as data integration. Because XML schemas and different kinds of ontologies are growing in size and are in widespread use, coping with large-scale schema matching has become highly challenging. Clustering-based matching is an important step towards reducing the search space and thus improving efficiency. However, the methods used to identify similar clusters depend on literal term matching. To improve on this, this paper proposes a new approach that uses Latent Semantic Indexing to capture the conceptual similarity between clusters. The experimental evaluation shows encouraging results towards building efficient large-scale matching approaches.
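
The idea of comparing clusters in a latent space rather than by literal terms can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each cluster is represented as a bag of element-name tokens, and the function name `lsi_cluster_similarity`, the rank `k`, and the toy schemas are all hypothetical. It builds a term-by-cluster matrix, truncates its SVD, and compares clusters by cosine similarity in the reduced space.

```python
import numpy as np


def lsi_cluster_similarity(clusters_a, clusters_b, k=2):
    """Cosine similarity between clusters of two schemas in a rank-k LSI space.

    Each cluster is a list of tokens (an assumption made for this sketch).
    Returns a matrix with one row per cluster in clusters_a and one column
    per cluster in clusters_b.
    """
    docs = list(clusters_a) + list(clusters_b)
    vocab = sorted({term for doc in docs for term in doc})
    index = {term: i for i, term in enumerate(vocab)}

    # Term-by-cluster count matrix: rows are terms, columns are clusters.
    A = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for term in doc:
            A[index[term], j] += 1.0

    # Truncated SVD: keep only the k largest singular values.
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    coords = (np.diag(s[:k]) @ Vt[:k, :]).T  # one row per cluster

    # Normalize rows so the dot product is the cosine similarity;
    # zero vectors are left as zeros.
    norms = np.linalg.norm(coords, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0
    unit = coords / norms

    n_a = len(clusters_a)
    return unit[:n_a] @ unit[n_a:].T


# Toy data, invented for illustration only.
schema_a = [["author", "name", "writer"], ["price", "cost", "amount"]]
schema_b = [["writer", "creator", "name"], ["cost", "fee"]]

sim = lsi_cluster_similarity(schema_a, schema_b, k=2)
print(np.round(sim, 3))
```

In a clustering-based matcher, only cluster pairs whose similarity exceeds some threshold would then be passed on to element-level matching; the threshold and the choice of `k` would need to be tuned on real schemas.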

Author information

Authors and Affiliations

Department of Computer Science, Otto-von-Guericke University, 39106, Magdeburg, Germany
Alsayed Algergawy & Gunter Saake
Department of Computer Engineering, Tanta University, Tanta, Egypt
Alsayed Algergawy & Amany Sarhan
Department of Computer Engineering, Mansoura University, Mansoura, Egypt
Seham Moawed & Ali Eldosouky

Editor information

Editors and Affiliations

Dipartimento di Informatica, Bioingegneria, Robotica e, Università di Genova, Genova, Italy
Barbara Catania
Dipartimento di Automatica e Informatica, Politecnico di Torino, Torino, Italy
Tania Cerquitelli
Dipartimento di Automatica e Informatica, Politecnico di Torino, Torino, Italy
Silvia Chiusano
Dipartimento di Informatica, Bioingegneria, Robotica e, Università di Genova, Genova, Italy
Giovanna Guerrini
Cloudera, Inc., California, USA
Mirko Kämpf
Faculty of Informatics, Technische Universität München, Garching, Germany
Alfons Kemper
Dept. of Analytical Information Systems, Saint Petersburg University, Saint Petersburg, Russia
Boris Novikov
Dipartimento di Ingegneria e Scienza dell’Informazione, Università di Trento, Povo, TN, Italy
Themis Palpanas
Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Praha, Praha, Czech Republic
Jaroslav Pokorný
Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
Athena Vakali

About this paper

Cite this paper

Moawed, S., Algergawy, A., Sarhan, A., Eldosouky, A., Saake, G. (2014). A Latent Semantic Indexing-Based Approach to Determine Similar Clusters in Large-scale Schema Matching. In: Catania, B., et al. New Trends in Databases and Information Systems. Advances in Intelligent Systems and Computing, vol 241. Springer, Cham. https://doi.org/10.1007/978-3-319-01863-8_29

DOI: https://doi.org/10.1007/978-3-319-01863-8_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01862-1
Online ISBN: 978-3-319-01863-8
eBook Packages: Engineering, Engineering (R0)
