Abstract
With rapid growth of data size and schema complexity, many data sets are structured in tables but without explicit foreign key definitions. Automatically identifying foreign keys among relations will be beneficial to query optimization, schema matching, data integration and database design as well. This paper formulates foreign key discovery as a nearest neighbor search problem and proposes a fast foreign key discovery algorithm. To reduce foreign key candidates, we detect inclusion dependencies first. Then we choose statistical features to represent an attribute and define two attributes’s distance. Finally, foreign keys are discovered by finding nearest neighbors of all primary keys. Experiment results on real and synthetic data sets show that our algorithm can discover foreign keys efficiently.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Bauckmann, J., Leser, U., Naumann, F., Tietz, V.: Efficiently detecting inclusion dependencies. In: Proceedings of ICDE, pp. 1448–1450 (2007)
Chen, Z., Narasayya, V., Chaudhuri, S.: Fast foreign-key detection in microsoft sql server powerpivot for excel. Proceedings of the VLDB Endowment 7(13) (2014)
De Marchi, F., Lopes, S., Petit, J.M.: Unary & n-ary inclusion dependency discovery in relational databases. Journal of Intelligent Information Systems 32(1) (2009)
Rostin, A., Albrecht, O., Bauckmann, J., Naumann, F., Leser, U.: A machine learning approach to foreign key discovery. In: 12th International Workshop on the Web and Databases, Providence, Rhode Island, USA (2009)
Zhang, M., Hadjieleftheriou, M., Ooi, B.C., Procopiuc, C.M., Srivastava, D.: On multi-column foreign key discovery. PVLDB 3(1), 805–814 (2010)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Yuan, X., Cai, X., Yu, M., Wang, C., Zhang, Y., Wen, Y. (2015). Efficient Foreign Key Discovery Based on Nearest Neighbor Search. In: Dong, X., Yu, X., Li, J., Sun, Y. (eds) Web-Age Information Management. WAIM 2015. Lecture Notes in Computer Science(), vol 9098. Springer, Cham. https://doi.org/10.1007/978-3-319-21042-1_37
Download citation
DOI: https://doi.org/10.1007/978-3-319-21042-1_37
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21041-4
Online ISBN: 978-3-319-21042-1
eBook Packages: Computer ScienceComputer Science (R0)