Abstract
Our study introduces a novel distributed query plan refinement phase into an enhanced architecture of a distributed query processing engine (DQPE). Query plan refinement generates a potentially more efficient distributed query plan using the reusable aggregate query shipping (RAQS) approach. The approach improves response time at the cost of additional pre-processing time. If this overhead cannot be compensated by the reuse of query results, RAQS is no longer favorable. Therefore, a global cost estimation model is employed to select the proper operator: RR_Agg, R_Agg, or R_Scan. To reuse the results of queries with aggregate functions in distributed query processing, a multi-level hybrid view caching (HVC) scheme is introduced. The scheme retains the advantages of partial match and aggregate query result caching. Evaluations with distributed TPC-H queries show that our solution significantly improves average response time.
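The trade-off described in the abstract, where pre-processing overhead pays off only if results are reused often enough, can be illustrated with a small sketch. This is not the paper's cost model: the `CostEstimate` fields, the amortization formula, and all numbers are assumptions made for illustration. Only the operator names RR_Agg, R_Agg, and R_Scan come from the abstract, and their cost profiles here are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostEstimate:
    """Hypothetical cost inputs for one operator, all in the same time unit."""
    setup: float      # one-off pre-processing overhead
    per_query: float  # cost of a query that cannot reuse an earlier result
    per_reuse: float  # cost of a query answered from a reused result


def amortized_cost(est: CostEstimate, queries: int, hits: int) -> float:
    """Average cost per query over a workload with `hits` reusable queries."""
    if queries <= 0:
        raise ValueError("queries must be positive")
    hits = max(0, min(hits, queries))  # clamp to a valid range
    misses = queries - hits
    total = est.setup + misses * est.per_query + hits * est.per_reuse
    return total / queries


def choose_operator(estimates: dict[str, CostEstimate],
                    queries: int, hits: int) -> str:
    """Pick the operator with the lowest amortized cost (assumed criterion)."""
    return min(estimates,
               key=lambda name: amortized_cost(estimates[name], queries, hits))


# Illustrative numbers only; they are not measurements from the paper.
EXAMPLE = {
    "R_Scan": CostEstimate(setup=0.0, per_query=10.0, per_reuse=10.0),
    "R_Agg": CostEstimate(setup=0.0, per_query=4.0, per_reuse=4.0),
    "RR_Agg": CostEstimate(setup=6.0, per_query=5.0, per_reuse=1.0),
}

if __name__ == "__main__":
    print(choose_operator(EXAMPLE, queries=10, hits=8))  # high reuse
    print(choose_operator(EXAMPLE, queries=10, hits=0))  # no reuse
```

With these made-up inputs, the reusable operator wins when most queries hit a reusable result and loses when none do, which mirrors the abstract's point that RAQS is favorable only when reuse compensates for the overhead.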
Additional information
This work is partially supported by the National Basic Research 973 Program of China under Grant No. 2005CB321807, and the National High Technology Research and Development 863 Program of China under Grant Nos. 2006AA01A106 and 2006AA04Z158.
About this article
Cite this article
Liao, HM., Pei, GS. Cache-Based Aggregate Query Shipping: An Efficient Scheme of Distributed OLAP Query Processing. J. Comput. Sci. Technol. 23, 905–915 (2008). https://doi.org/10.1007/s11390-008-9190-3
DOI: https://doi.org/10.1007/s11390-008-9190-3