Abstract
For a decade, the database community has been researching opportunities to exploit graphics processing units (GPUs) to accelerate query processing. While the GPU algorithms developed so far often outperform their CPU counterparts, it is not beneficial to keep some processing devices idle while overutilizing others. Therefore, an approach is needed that effectively distributes a workload across the available (co-)processors while providing accurate performance estimations for the query optimizer. In this paper, we extend our hybrid query-processing engine with heuristics that optimize query processing for response time and throughput simultaneously via inter-device parallelism. Our empirical evaluation reveals that the new approach doubles the throughput compared to our previous solution and state-of-the-art approaches, because it utilizes the devices nearly equally while preserving accurate performance estimations.
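The abstract does not spell out the heuristics themselves, so the following is only a minimal sketch of the general idea of distributing operators across devices using performance estimates. It assumes a simple greedy rule (assign each operator to the device with the earliest estimated finish time). The `Device` class, the `schedule` function, and all run-time numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical model of a processing device with a queue of operators."""
    name: str
    queued_time: float = 0.0  # sum of estimated run times already queued
    assigned: list = field(default_factory=list)


def schedule(operators, devices, estimate):
    """Greedily assign each operator to the device that would finish it first.

    operators: iterable of operator names, in arrival order
    devices:   list of Device objects
    estimate:  function (operator, device_name) -> estimated run time
    """
    for op in operators:
        # Estimated finish time = work already queued + this operator's estimate.
        best = min(devices, key=lambda d: d.queued_time + estimate(op, d.name))
        best.queued_time += estimate(op, best.name)
        best.assigned.append(op)
    return devices


# Illustrative estimates only (assumed, not measured): the GPU is faster
# for both operators, but its queue fills up, so the CPU receives work too.
ESTIMATES = {
    ("sort", "gpu"): 1.0, ("sort", "cpu"): 3.0,
    ("scan", "gpu"): 1.0, ("scan", "cpu"): 1.25,
}


def lookup(op, device_name):
    return ESTIMATES[(op, device_name)]


cpu, gpu = Device("cpu"), Device("gpu")
schedule(["sort", "scan", "sort", "scan"], [cpu, gpu], lookup)
print(cpu.assigned, cpu.queued_time)
print(gpu.assigned, gpu.queued_time)
```

In this toy run the GPU would be the faster choice for every operator taken in isolation, yet accounting for the queued work sends the scans to the CPU, so neither device sits idle. A scheduler that looked only at per-operator speed would place all four operators on the GPU.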
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Breß, S., Siegmund, N., Bellatreche, L., Saake, G. (2013). An Operator-Stream-Based Scheduling Engine for Effective GPU Coprocessing. In: Catania, B., Guerrini, G., Pokorný, J. (eds) Advances in Databases and Information Systems. ADBIS 2013. Lecture Notes in Computer Science, vol 8133. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40683-6_22
DOI: https://doi.org/10.1007/978-3-642-40683-6_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40682-9
Online ISBN: 978-3-642-40683-6
eBook Packages: Computer Science (R0)