Solving the Static Task Scheduling Problem for Real Machines

  • Cristina Boeres
  • Vinod E. F. Rebello
Part of the Applied Optimization book series (APOP, volume 67)

Abstract

While the task scheduling problem under the delay model has been studied extensively, relatively little research exists for more realistic communication models such as the LogP model. The task scheduling problem is known to be NP-complete even under the delay model (a simplified instance of the LogP model). This chapter describes the LogP model and the influence of its communication parameters on task scheduling. The similarities and differences between clustering algorithms under the delay and LogP models are discussed, and a design methodology for clustering-based scheduling algorithms for the LogP model is presented. Using this design methodology, a task scheduling algorithm for the allocation of arbitrary task graphs to fully connected networks of processors under the LogP model is proposed. The strategy exploits the replication and clustering of tasks to minimize the ill effects of communication on the makespan.

Keywords

Multicomputer; task scheduling; clustering; task replication; LogP model

Copyright information

© Springer Science+Business Media Dordrecht 2002

Authors and Affiliations

  • Cristina Boeres
    • 1
  • Vinod E. F. Rebello
    • 1
  1. Instituto de Computação, UFF, São Domingos, Niterói, Brazil
