Distributed and Parallel Databases, Volume 23, Issue 2, pp 151–188

Query optimization via contention space partitioning and cost error controlling for dynamic multidatabase systems

  • Qiang Zhu
  • Jaidev Haridas
  • Wen-Chi Hou

Abstract

A multidatabase system (MDBS) integrates information from multiple autonomous local databases. Performing global query optimization to achieve efficient query processing in such a system is challenging because of the local autonomy of the data sources. Dynamic factors in the environment make the problem even more difficult. In this paper, we present two techniques, contention space partitioning and cost error controlling, for performing global query optimization in a dynamic MDBS. Both techniques generate an execution plan with multiple versions for a query, utilizing the multistate cost models built for the dynamic environment via our previous multistate query sampling method. The first technique partitions the contention space of a dynamic multidatabase environment into a given number of subspaces and chooses a good query execution plan version for each subspace, while the second technique selects a set of execution plan versions by using a given error tolerance to control query execution costs. Experiments demonstrate that the proposed techniques are quite promising for performing global query optimization in a dynamic MDBS. Compared with related work on dynamic query optimization, our approach has the advantage of avoiding the high overhead of modifying or regenerating an execution plan for a query based on dynamic runtime information.
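
The two selection strategies named in the abstract can be illustrated with a small sketch. It is not the algorithm from the paper: it assumes a one-dimensional contention level in [0, 1], equal-width subspaces, and hypothetical linear cost functions standing in for multistate cost models. The names `partition_plans`, `error_controlled_plans`, `plan_A`, and `plan_B` are invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a multistate cost model:
# maps a contention level in [0, 1] to an estimated plan cost.
CostFn = Callable[[float], float]


def partition_plans(
    plans: Dict[str, CostFn], k: int, samples: int = 10
) -> List[Tuple[float, float, str]]:
    """Split [0, 1] into k equal subspaces (an assumption) and pick,
    for each one, the plan with the lowest total sampled cost."""
    versions = []
    for i in range(k):
        lo, hi = i / k, (i + 1) / k
        # Sample the midpoints of `samples` equal slices of the subspace.
        points = [lo + (hi - lo) * (j + 0.5) / samples for j in range(samples)]
        best = min(plans, key=lambda name: sum(plans[name](c) for c in points))
        versions.append((lo, hi, best))
    return versions


def error_controlled_plans(
    plans: Dict[str, CostFn], tolerance: float, steps: int = 100
) -> List[Tuple[float, str]]:
    """Sweep the contention level upward and start a new plan version only
    when the current one costs more than (1 + tolerance) times the cheapest
    plan at that level. Returns (starting level, plan name) pairs."""
    versions: List[Tuple[float, str]] = []
    current = None
    for i in range(steps + 1):
        c = i / steps
        best = min(plans, key=lambda name: plans[name](c))
        if current is None or plans[current](c) > (1 + tolerance) * plans[best](c):
            current = best
            versions.append((c, current))
    return versions


# Two made-up plans: plan_A is cheap when idle but degrades quickly,
# plan_B has a higher base cost but is less sensitive to contention.
plans: Dict[str, CostFn] = {
    "plan_A": lambda c: 10 + 40 * c,
    "plan_B": lambda c: 25 + 5 * c,
}

by_subspace = partition_plans(plans, k=2)
by_tolerance = error_controlled_plans(plans, tolerance=0.1)
```

In this sketch the first function fixes the number of plan versions in advance, while the second lets the error tolerance determine how many versions are kept: a looser tolerance yields fewer versions.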

Keywords

Multidatabase system; Dynamic environment; Query optimization; Multistate cost model; Execution plan; Algorithm

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  1. Department of Computer and Information Science, The University of Michigan, Dearborn, USA
  2. Department of Computer Science, Southern Illinois University, Carbondale, USA
