Partitioning pipelines with communication costs

  • Invited Papers
  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1006))

Abstract

In this paper, we consider the problem of scheduling a database query execution graph on a parallel machine. Specifically, we consider the problem of data-partitioning pipelined operators with the objective of minimizing response time. This is a basic problem in scheduling database execution trees. Partitioning promises increased parallelism and memory availability at the price of greater communication overhead. Current partitioning methods [BB90, TWPY92, LCRY93, NSHL93] do not consider these trade-offs. We present a mathematical framework within which these alternatives can be quantified for many interesting practical scenarios. We then present an algorithm whose performance is within a factor of 2 of the optimum possible.
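
The trade-off named in the abstract can be illustrated with a deliberately simple toy model. This is only a sketch under stated assumptions, not the paper's mathematical framework or its factor-2 algorithm: it assumes one operator with total work `work`, split evenly across `k` processors, and a communication overhead that grows linearly as `comm_cost * k`. The names `response_time`, `best_degree`, `work`, `comm_cost`, and `max_procs` are all hypothetical.

```python
# Toy illustration only: NOT the cost model or the algorithm from the paper.
# Assumption: perfectly divisible work plus a linear communication overhead.

def response_time(work: float, comm_cost: float, k: int) -> float:
    """Assumed response time when one operator is partitioned k ways."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Parallel share of the work plus overhead that grows with the partition count.
    return work / k + comm_cost * k


def best_degree(work: float, comm_cost: float, max_procs: int) -> int:
    """Brute-force search for the degree of partitioning with the lowest assumed response time."""
    return min(
        range(1, max_procs + 1),
        key=lambda k: response_time(work, comm_cost, k),
    )


if __name__ == "__main__":
    for procs in (4, 16):
        k = best_degree(100.0, 1.0, procs)
        print(procs, k, response_time(100.0, 1.0, k))
```

Under this assumed model, adding processors helps only until the communication term outweighs the saved work, so the best degree of partitioning can be well below the number of processors available.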

References

  1. K.P. Belkhale and P. Banerjee. Approximate Algorithms for the Partitionable Independent Task Scheduling Problem. Proceedings of the 1990 International Conference on Parallel Processing, pp. I-72–I-75.

  2. D. DeWitt, S. Ghandeharizadeh, D. Schneider, A. Bricker, H. Hsiao and R. Rasmussen. The Gamma Database Machine Project. IEEE Transactions on Knowledge and Data Engineering, March 1990.

  3. D. DeWitt and J. Gray. The Future of High Performance Database Systems. Communications of the ACM, June 1992.

  4. S. Ganguly. Parallel Evaluation of Deductive Database Queries. PhD thesis, University of Texas, Austin, 1992.

  5. S. Ganguly, W. Hasan and R. Krishnamurthy. Query Optimization for Parallel Executions. Proceedings of the 1992 ACM SIGMOD International Conference on Management of Data.

  6. R.L. Graham. Bounds on Multiprocessing Timing Anomalies. SIAM J. Appl. Math. 17 (1969), 416–429.

  7. W. Hasan and R. Motwani. Optimization Algorithms for Exploiting the Parallelism-Communication Trade-off in Pipelined Parallelism. Proceedings of the 1994 International Conference on Very Large Databases.

  8. W. Hong and M. Stonebraker. Optimization of Parallel Query Execution Plans in XPRS. Proceedings of the First International Conference on Parallel and Distributed Database Systems, December 1991.

  9. W. Hong. Exploiting Inter-Operation Parallelism in XPRS. Proceedings of the 1992 ACM SIGMOD International Conference on Management of Data.

  10. R.S.G. Lanzelotte, P. Valduriez and M. Zait. On the Effectiveness of Optimization Search Strategies for Parallel Execution. Proceedings of the 1993 International Conference on Very Large Databases.

  11. M-L. Lo, M-S. Chen, C.V. Ravishankar and P.S. Yu. On Optimal Processor Allocation to Support Pipelined Hash Joins. Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data.

  12. H-I. Hsiao, M-S. Chen and P.S. Yu. On Parallel Execution of Multiple Pipelined Hash Joins. Proceedings of the 1994 ACM SIGMOD International Conference on Management of Data.

  13. H. Lu, M.C. Shan and K.L. Tan. Optimization of Multi-Way Join Queries for Parallel Execution. Proceedings of the 1991 International Conference on Very Large Databases.

  14. T.H. Niccum, J. Srivastava, B. Himatsingka and J-Z. Li. A Tree-Decomposition Approach to the Parallel Execution of Relational Query Plans. Technical Report, University of Minnesota at Minneapolis.

  15. H. Pirahesh, C. Mohan, J. Cheung, T.S. Liu and P. Selinger. Parallelism in Relational Database Systems: Architectural Issues and Design Approaches. Proceedings of the 1991 International Conference on Parallel and Distributed Information Systems.

  16. D. Schneider. Complex Query Processing in Multiprocessor Database Machines. PhD thesis, University of Wisconsin, Madison, 1990.

  17. P. Selinger, M.M. Astrahan, D.D. Chamberlin, R.A. Lorie and T.G. Price. Access Path Selection in a Relational Database Management System. Proceedings of the 1979 ACM SIGMOD International Conference on Management of Data.

  18. J. Srivastava and G. Elsesser. Query Optimization for Parallel Relational Databases. Preliminary version appeared in Proceedings of the 1993 International Conference on Parallel and Distributed Information Systems.

  19. E.J. Shekita, H.C. Young and K-L. Tan. Multi-Join Optimization for Symmetric Multiprocessors. Proceedings of the 1993 Conference on Very Large Databases.

  20. K-L. Tan and H. Lu. On Resource Scheduling of Multi-Join Queries in Parallel Database Systems. Information Processing Letters 48 (1993), 189–195.

  21. J. Turek, J.L. Wolf, K.R. Pattipati and P.S. Yu. Scheduling Parallelizable Tasks: Putting it All on the Shelf. Proceedings of the 1992 ACM SIGMETRICS Conference.

  22. M. Ziane, M. Zait and P. Borla-Salamet. Parallel Query Processing in DBS3. Proceedings of the 1993 International Conference on Parallel and Distributed Information Systems.

Editor information

Subhash Bhalla

Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ganguly, S., Gerasoulis, A., Wang, W. (1995). Partitioning pipelines with communication costs. In: Bhalla, S. (eds) Information Systems and Data Management. CISMOD 1995. Lecture Notes in Computer Science, vol 1006. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60584-3_40

  • DOI: https://doi.org/10.1007/3-540-60584-3_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-60584-3

  • Online ISBN: 978-3-540-47799-0

  • eBook Packages: Springer Book Archive
