Cluster Computing, Volume 22, Supplement 1, pp 731–747

Scalable top-k keyword search in relational databases

  • Yanwei Xu
Article

Abstract

Keyword search in relational databases has been widely studied in recent years because it requires users neither to master a structured query language nor to know the complex underlying database schemas. A keyword query over a large database can have a huge number of valid results. However, only the top 10 or 20 most relevant matches for the keyword query (according to some definition of “relevance”) are generally of interest. In this paper, we propose a method that efficiently computes the top-k results for keyword queries in a pipelined pattern by incorporating the ranking mechanisms into the query processing method. Four optimization methods, based on bounding the relevance scores of potential results and on reusing and sharing intermediate results, are presented to improve the efficiency of the proposed algorithms. Compared to existing top-k keyword search systems, the proposed methods can significantly reduce both the number of computed query results with low relevance scores and the number of database accesses, which results in high efficiency in computing top-k keyword query results in relational databases. Extensive experiments on two real data sets are conducted to evaluate the effectiveness and efficiency of the proposed approach.
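
The idea of bounding relevance scores to avoid computing low-scoring results can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper; it is a generic upper-bound pruning loop under stated assumptions: each candidate arrives with a valid upper bound on its relevance score, and `score` is a hypothetical stand-in for the expensive step (for example, a database access). The names `top_k_with_bounds`, `candidates`, and `score` are illustrative only.

```python
import heapq
from typing import Callable, Hashable, Iterable, List, Tuple


def top_k_with_bounds(
    candidates: Iterable[Tuple[float, Hashable]],
    k: int,
    score: Callable[[Hashable], float],
) -> Tuple[List[Tuple[Hashable, float]], int]:
    """Return the k best (item, score) pairs and the number of items scored.

    Assumption: every upper bound is >= the true score of its item.
    """
    # Visit candidates from the most to the least promising upper bound.
    ordered = sorted(candidates, key=lambda c: c[0], reverse=True)

    heap: List[Tuple[float, int, Hashable]] = []  # min-heap of the current top k
    evaluated = 0

    for idx, (bound, item) in enumerate(ordered):
        # Once k results are held and no remaining candidate can beat the
        # weakest of them, every later candidate can be skipped unscored.
        if len(heap) == k and bound <= heap[0][0]:
            break

        s = score(item)  # the expensive step in a real system
        evaluated += 1

        if len(heap) < k:
            heapq.heappush(heap, (s, idx, item))
        elif s > heap[0][0]:
            heapq.heapreplace(heap, (s, idx, item))

    best = sorted(heap, key=lambda e: e[0], reverse=True)
    return [(item, s) for s, _, item in best], evaluated
```

In this sketch the tighter the upper bounds, the earlier the loop stops, so fewer low-relevance candidates are ever scored.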

Keywords

Relational databases, Keyword search, Top-k query

Notes

Acknowledgements

This research was supported by the Natural Science Foundation of Shanghai under Grant No. 14ZR1427700 and the Shanghai Engineering Research Center for Broadband Technologies and Applications (14DZ2280100).

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Shanghai Engineering Research Center for Broadband Technologies & Applications, Shanghai, China
