Users Matter: A Multi-agent Systems Model of High Performance Computing Cluster Users

  • Michael J. North
  • Cynthia S. Hood
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3415)


High performance computing clusters have been a critical resource for computational science for over a decade and have more recently become integral to large-scale industrial analysis. Despite their well-specified components, the aggregate behavior of clusters is poorly understood. The difficulties arise from complicated interactions between cluster components during operation. These interactions have been studied by many researchers, some of whom have identified the need for holistic multi-scale modeling that simultaneously includes network level, operating system level, process level, and user level behaviors. Each of these levels presents its own modeling challenges, but the user level is the most complex due to the adaptability of human beings. In this vein, there are several major user modeling goals, namely descriptive modeling, predictive modeling and automated weakness discovery. This study shows how multi-agent techniques were used to simulate a large-scale computing cluster at each of these levels.


Unify Modeling Language; High Performance Computing; Cluster Performance; Manual Trace; User Matter
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Michael J. North (1)
  • Cynthia S. Hood (2)
  1. Argonne National Laboratory, Argonne
  2. Illinois Institute of Technology, Chicago, USA
