Grid Multicriteria Job Scheduling with Resource Reservation and Prediction Mechanisms

  • Krzysztof Kurowski
  • Jarek Nabrzyski
  • Ariel Oleksiak
  • Jan Weglarz
Chapter
Part of the International Series in Operations Research & Management Science book series (ISOR, volume 92)

Abstract

Grids link together computers, data, sensors, large-scale scientific instruments, visualization systems, networks and people. They can provide very large pools of computing resources, enable distributed collaboration, and deliver increased efficiency and on-demand computing capabilities. The complexity of Grids on the one hand, and the requirements for performance and capability on the other, call for efficient resource management and scheduling mechanisms. Such mechanisms must take into account not only the hardware and software resources, user jobs and applications, but also the policies of the resource owners. Policies usually describe cost models for resource usage, security mechanisms, and the quality of service of resource provisioning. Scheduling jobs in real Grid environments is very difficult. Because the time characteristics of jobs are often unavailable and the overall system is hard to characterize, traditional operations research (OR) techniques usually fail or achieve very weak results, so best-effort scheduling is usually the best option. There are, however, some ways to deal with the problems described above.

The main goal of this paper is to present some practical issues of scheduling Grid jobs. The methods and techniques described in the paper are used in a Grid scheduling system called GRMS (Grid Resource Management System), developed at the Poznan Supercomputing and Networking Center. GRMS is used in many Grid infrastructures worldwide.

Keywords

Grid computing; Grid resource management and scheduling; multicriteria decision support


Copyright information

© Springer Science+Business Media, LLC 2006

Authors and Affiliations

  • Krzysztof Kurowski (1)
  • Jarek Nabrzyski (1)
  • Ariel Oleksiak (1)
  • Jan Weglarz (1)

  1. Poznan Supercomputing and Networking Center, Poland
