Group-Wise Performance Evaluation of Processor Co-allocation in Multi-cluster Systems

  • John Ngubiri
  • Mario van Vliet
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4942)

Abstract

Performance evaluation in multi-cluster processor co-allocation, as in many other parallel job scheduling problems, is mostly done by computing the average metric value over the entire job stream. This does not give a comprehensive understanding of the relative performance of jobs grouped by their characteristics, even though it is these characteristics that determine how easy or hard a job is to schedule. Scheduler performance at the level of individual job types therefore remains poorly understood. In this paper, we study the performance of multi-cluster processor co-allocation for job groups defined by size, number of components and widest component. We examine their relative performance, their sensitivity to scheduler parameters and how their performance is affected by the heuristics used to break jobs up into components. We show that the widest component is the characteristic that most affects job schedulability. We also show that, for better performance, jobs should be broken up in such a way that the width of the widest component is minimized.
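
For illustration only, here is a minimal sketch (in Python, not taken from the paper) of a break-up rule that follows this recommendation: spread the job's processors as evenly as possible over its components, so that the widest component is as narrow as it can be. The function name and interface are hypothetical.

    def break_job(size, num_components):
        """Split a co-allocated job of `size` processors into `num_components`
        components so that the widest component is as narrow as possible.
        A balanced split achieves this: every component gets either
        floor(size / num_components) or ceil(size / num_components) processors.
        Hypothetical helper, shown only to illustrate the break-up criterion.
        """
        base, rest = divmod(size, num_components)
        return [base + 1] * rest + [base] * (num_components - rest)

    # Example: a 14-processor job split into 4 components.
    # The widest component has 4 processors, the minimum achievable width.
    print(break_job(14, 4))  # [4, 4, 3, 3]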

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • John Ngubiri (1)
  • Mario van Vliet (1)
  1. Nijmegen Institute for Informatics and Information Science, Radboud University Nijmegen, Nijmegen, The Netherlands