Query Reuse Based Query Planning for Searches over the Deep Web

Wang, Fan; Agrawal, Gagan

doi:10.1007/978-3-642-15251-1_5

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6262)

Included in the following conference series:

International Conference on Database and Expert Systems Applications

Abstract

Nowadays, data dissemination often involves online databases that are hidden behind query forms, thus forming the deep web. Lately, there has been considerable research interest in supporting query answering over the deep web. To answer a deep web query efficiently, current approaches generate a query plan for each query independently.

However, in practice, deep web queries issued by a user over a short period of time often share similarities. If properly exploited, these similarities can help generate a more efficient query plan. In this paper, we develop a solution that generates a query plan for a deep web query based on the similarities between the given query and a set of earlier queries. Our algorithm systematically finds the reusable components of earlier query plans and then builds a new query plan that reuses them. While the resulting query plans may not be optimal, they are likely to enable more data reuse and hence speed up execution.
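
The abstract does not spell out the algorithm, so the following is only a toy sketch of the general idea of plan reuse, not the authors' method. It assumes a query plan is a list of data source accesses, each with input and output attributes, and it uses a simple greedy rule that prefers steps already present in earlier plans. The class `PlanStep`, the function `plan_with_reuse`, and the source names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanStep:
    """One access to a (hypothetical) deep web data source."""
    source: str
    inputs: frozenset   # attributes that must be known before the call
    outputs: frozenset  # attributes the source returns


def plan_with_reuse(known, wanted, sources, earlier_plans):
    """Greedy planner that prefers steps used by earlier plans.

    Returns a list of (step, reused) pairs, or None if `wanted`
    cannot be reached from `known`. This is an illustrative
    assumption, not the algorithm from the paper.
    """
    reusable = {step for plan in earlier_plans for step in plan}
    have = set(known)
    plan = []
    while not set(wanted) <= have:
        # Steps whose inputs are satisfied and that add a new attribute.
        ready = [s for s in sources
                 if s.inputs <= have and not s.outputs <= have]
        if not ready:
            return None
        # Prefer reusable steps; break ties by number of new attributes.
        best = max(ready, key=lambda s: (s in reusable, len(s.outputs - have)))
        plan.append((best, best in reusable))
        have |= best.outputs
    return plan


# Hypothetical sources and one earlier plan (gene -> protein -> function).
by_gene = PlanStep("gene_db", frozenset({"gene"}), frozenset({"protein"}))
by_gene_wide = PlanStep("genome_db", frozenset({"gene"}),
                        frozenset({"protein", "chromosome"}))
by_protein = PlanStep("protein_db", frozenset({"protein"}),
                      frozenset({"function"}))
earlier_plans = [[by_gene, by_protein]]

plan = plan_with_reuse({"gene"}, {"function", "chromosome"},
                       [by_gene, by_gene_wide, by_protein], earlier_plans)
for step, reused in plan:
    print(step.source, "(reused)" if reused else "(new)")
```

In this toy run the planner takes `gene_db` and `protein_db` from the earlier plan and only then adds `genome_db`. A two-step plan (`genome_db`, then `protein_db`) would be shorter, which illustrates the trade-off the abstract describes: the plan is not optimal, but two of its three steps can draw on earlier results.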



Author information

Authors and Affiliations

Department of Computer Science and Engineering, Ohio State University, Columbus, OH, 43210
Fan Wang & Gagan Agrawal


Editor information

Editors and Affiliations

DeustoTech Computing, University of Deusto, Avda. Universidades, 24, 48007, Bilbao, Spain
Pablo García Bringas
Institut de Recherche en Informatique de Toulouse (IRIT), Paul Sabatier University, 118, route de Narbonne, 31062, Toulouse Cedex, France
Abdelkader Hameurlain
Faculty of Computer Science, Department of Distributed Systems and Multimedia Systems, University of Vienna, Liebiggasse 4/3-4, 1010, Vienna, Austria
Gerald Quirchmayr

About this paper

Cite this paper

Wang, F., Agrawal, G. (2010). Query Reuse Based Query Planning for Searches over the Deep Web. In: Bringas, P.G., Hameurlain, A., Quirchmayr, G. (eds) Database and Expert Systems Applications. DEXA 2010. Lecture Notes in Computer Science, vol 6262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15251-1_5

DOI: https://doi.org/10.1007/978-3-642-15251-1_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15250-4
Online ISBN: 978-3-642-15251-1
eBook Packages: Computer Science (R0)
