Efficient Foreign Key Discovery Based on Nearest Neighbor Search

Yuan, Xiaojie; Cai, Xiangrui; Yu, Man; Wang, Chao; Zhang, Ying; Wen, Yanlong

doi:10.1007/978-3-319-21042-1_37

Conference paper
First Online: 01 January 2015

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9098)

Abstract

With the rapid growth of data size and schema complexity, many data sets are structured as tables but lack explicit foreign key definitions. Automatically identifying foreign keys among relations benefits query optimization, schema matching, data integration, and database design. This paper formulates foreign key discovery as a nearest neighbor search problem and proposes a fast foreign key discovery algorithm. To reduce the number of foreign key candidates, we first detect inclusion dependencies. We then choose statistical features to represent each attribute and define a distance between two attributes. Finally, foreign keys are discovered by finding the nearest neighbors of all primary keys. Experimental results on real and synthetic data sets show that our algorithm discovers foreign keys efficiently.
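
The three steps named in the abstract (inclusion dependency filtering, statistical attribute features, nearest neighbor search) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say which features or distance the paper uses, so the feature set (minimum, maximum, mean, distinct ratio), the Euclidean distance, the restriction to numeric columns, the choice of a single nearest neighbor per primary key, and all function and column names below are assumptions made purely for illustration.

```python
from math import sqrt


def is_included(dep_values, ref_values):
    """Unary inclusion dependency check: every non-null value of the
    dependent column must also appear in the referenced column."""
    return {v for v in dep_values if v is not None} <= set(ref_values)


def features(values):
    """Assumed statistical features of a numeric column:
    (min, max, mean, ratio of distinct values)."""
    vals = [v for v in values if v is not None]
    return (min(vals), max(vals), sum(vals) / len(vals), len(set(vals)) / len(vals))


def distance(f, g):
    """Euclidean distance between two feature vectors (an assumed choice)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(f, g)))


def discover_foreign_keys(primary_keys, columns):
    """For each primary key, keep only the columns included in it and
    return the one whose feature vector is nearest to the key's."""
    result = {}
    for pk_name, pk_values in primary_keys.items():
        pk_features = features(pk_values)
        candidates = [
            (distance(pk_features, features(vals)), name)
            for name, vals in columns.items()
            if name != pk_name
            and any(v is not None for v in vals)
            and is_included(vals, pk_values)
        ]
        if candidates:
            result[pk_name] = min(candidates)[1]
    return result


# Hypothetical toy data: one primary key and three candidate columns.
primary_keys = {"customer.id": [1, 2, 3, 4, 5]}
columns = {
    "order.customer_id": [1, 2, 2, 4, 5, 5],  # included, similar distribution
    "order.quantity": [1, 1, 1, 1, 2],        # included by accident
    "order.amount": [10, 250, 40, 99, 7],     # not included, pruned early
}
fks = discover_foreign_keys(primary_keys, columns)
print(fks)
```

In this toy data, `order.quantity` passes the inclusion check only by coincidence; the distance step is what separates it from `order.customer_id`, whose value range and mean are much closer to those of the primary key.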

Author information

Authors and Affiliations

College of Computer and Control Engineering, Nankai University, 94 Weijin Road, Tianjin, 300071, People’s Republic of China
Xiaojie Yuan, Xiangrui Cai, Man Yu, Chao Wang, Ying Zhang & Yanlong Wen

Corresponding author

Correspondence to Ying Zhang.

Editor information

Editors and Affiliations

Google, CA, USA
Xin Luna Dong
Postdoc Apartments (Hong Lou) 4-1-4, Shandong University, Li Cheng, Jinan, China
Xiaohui Yu
Tsinghua University, Beijing, China
Jian Li
Northeastern University, Boston, USA
Yizhou Sun

About this paper

Cite this paper

Yuan, X., Cai, X., Yu, M., Wang, C., Zhang, Y., Wen, Y. (2015). Efficient Foreign Key Discovery Based on Nearest Neighbor Search. In: Dong, X., Yu, X., Li, J., Sun, Y. (eds) Web-Age Information Management. WAIM 2015. Lecture Notes in Computer Science, vol 9098. Springer, Cham. https://doi.org/10.1007/978-3-319-21042-1_37

DOI: https://doi.org/10.1007/978-3-319-21042-1_37
Published: 06 June 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21041-4
Online ISBN: 978-3-319-21042-1
eBook Packages: Computer Science, Computer Science (R0)
