Summarizing Relational Database Schema Based on Label Propagation

Yuan, Xiaojie; Li, Xinkun; Yu, Man; Cai, Xiangrui; Zhang, Ying; Wen, Yanlong

doi:10.1007/978-3-319-11116-2_23


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8709)

Included in the following conference series:

Asia-Pacific Web Conference


Abstract

Real enterprise databases usually comprise hundreds of tables, which makes querying them difficult for non-expert users, especially when documentation is lacking. Schema summarization improves the usability of a database by providing a succinct overview of the entire schema. In this paper, we introduce a novel three-step schema summarization method based on label propagation. First, we exploit various similarity properties of the database schema and propose a table similarity measure based on the Radial Basis Function (RBF) kernel, which captures these similarity properties comprehensively. Second, we select representative tables as labeled data and annotate the schema graph accordingly. Finally, we run a label propagation algorithm on the labeled schema graph to classify the tables of the schema and create a schema summary. Extensive evaluations demonstrate the effectiveness of our approach.
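The three steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each table has already been reduced to a numeric feature vector (the abstract does not say how the similarity properties are encoded), and the table names, feature values, seed choice, `sigma`, and iteration count are all hypothetical. The propagation loop is a generic clamped label propagation over RBF kernel weights.

```python
import math


def rbf_similarity(x, y, sigma=1.0):
    """RBF kernel exp(-||x - y||^2 / (2 * sigma^2)) between two vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def propagate_labels(features, seeds, sigma=1.0, iterations=100):
    """Spread seed labels over a fully connected similarity graph.

    features: dict mapping table name -> feature vector (assumed given).
    seeds:    dict mapping representative table name -> label.
    Returns a dict mapping every table name to its most likely label.
    """
    tables = sorted(features)
    labels = sorted(set(seeds.values()))

    # Step 1: pairwise RBF similarities, row-normalised into weights.
    weights = {}
    for t in tables:
        row = {u: rbf_similarity(features[t], features[u], sigma)
               for u in tables if u != t}
        total = sum(row.values())
        weights[t] = {u: w / total for u, w in row.items()}

    # Step 2: seeds get a one-hot label distribution, others a uniform one.
    def one_hot(label):
        return {l: 1.0 if l == label else 0.0 for l in labels}

    uniform = {l: 1.0 / len(labels) for l in labels}
    dist = {t: one_hot(seeds[t]) if t in seeds else dict(uniform)
            for t in tables}

    # Step 3: repeatedly average neighbours' distributions; seeds stay clamped.
    for _ in range(iterations):
        updated = {}
        for t in tables:
            if t in seeds:
                updated[t] = dist[t]
            else:
                updated[t] = {
                    l: sum(w * dist[u][l] for u, w in weights[t].items())
                    for l in labels
                }
        dist = updated

    return {t: max(labels, key=lambda l: dist[t][l]) for t in tables}


# Hypothetical toy schema: two well-separated groups of tables.
features = {
    "customer": (0.0, 0.0),
    "customer_account": (0.2, 0.1),
    "address": (0.1, 0.3),
    "trade": (5.0, 5.0),
    "trade_history": (5.2, 4.9),
    "settlement": (4.8, 5.3),
}
seeds = {"customer": "Customer", "trade": "Trade"}

summary = propagate_labels(features, seeds)
for table in sorted(summary):
    print(table, "->", summary[table])
```

In this toy setting each unlabeled table ends up in the group of the seed it is most similar to, so the two seed tables act as the entries of the summary and the remaining tables are clustered under them.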



Author information

Authors and Affiliations

College of Computer and Control Engineering, Nankai University, P.R.China
Xiaojie Yuan, Xinkun Li, Man Yu, Xiangrui Cai & Yanlong Wen
College of Software, Nankai University, No. 94 Weijin Road, Tianjin, P.R.China, 300071
Ying Zhang


Editor information

Editors and Affiliations

Beijing Institute of Spacecraft System Engineering, Beijing, China
Lei Chen
School of Computer Science, National University of Defense Technology, 410073, Changsha, Hunan, China
Yan Jia
RMIT University, Melbourne, Australia
Timos Sellis
School of Computer Science and Technology, Soochow University, 215006, Suzhou, China
Guanfeng Liu

About this paper

Cite this paper

Yuan, X., Li, X., Yu, M., Cai, X., Zhang, Y., Wen, Y. (2014). Summarizing Relational Database Schema Based on Label Propagation. In: Chen, L., Jia, Y., Sellis, T., Liu, G. (eds) Web Technologies and Applications. APWeb 2014. Lecture Notes in Computer Science, vol 8709. Springer, Cham. https://doi.org/10.1007/978-3-319-11116-2_23

DOI: https://doi.org/10.1007/978-3-319-11116-2_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11115-5
Online ISBN: 978-3-319-11116-2
eBook Packages: Computer Science, Computer Science (R0)
