
Network-Aware Parallel Computing with Remos

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 1998)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1656)

Abstract

Networked systems provide a cost-effective platform for parallel computing, but the applications have to deal with the changing availability of computation and communication resources. Network-awareness is a recent attempt to bridge the gap between the realities of networks and the demands of applications. Network-aware applications obtain information about their execution environment and dynamically adapt to enhance their performance. Adaptation is especially important for synchronous parallel applications since a single busy communication link can become the bottleneck and degrade overall performance dramatically. This paper presents Remos, a uniform API that allows applications to obtain relevant network information, and reports on the development of parallel applications in this environment. The challenges in defining a uniform interface include network heterogeneity, diversity and variability in network traffic, and resource sharing in the network and even inside an application. The first implementation of the Remos system is hosted on an IP-based network testbed. The paper reports on our methodology for developing adaptive parallel applications for high-speed networks with Remos, and presents results that highlight the importance and effectiveness of adaptive parallel computing.
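The point about a single busy link can be made concrete with a small sketch. It is not the Remos API and is not taken from the paper: the function names, the node names, and the bandwidth figures are all hypothetical. It assumes the application already holds a table of available bandwidth between node pairs, and that every selected node exchanges data with every other in each synchronous step, so the slowest pairwise link bounds the step.

```python
from itertools import combinations

def bottleneck_bandwidth(nodes, bandwidth):
    """Return the minimum available bandwidth over all pairs of the given nodes."""
    return min(bandwidth[frozenset(pair)] for pair in combinations(nodes, 2))

def pick_nodes(candidates, k, bandwidth):
    """Choose k nodes whose slowest pairwise link is as fast as possible.

    Brute force over all subsets; adequate only for small candidate sets.
    """
    return max(combinations(candidates, k),
               key=lambda nodes: bottleneck_bandwidth(nodes, bandwidth))

# Hypothetical measurements in Mbit/s; the link between a and b is busy.
bandwidth = {
    frozenset(("a", "b")): 5,
    frozenset(("a", "c")): 90,
    frozenset(("a", "d")): 80,
    frozenset(("b", "c")): 95,
    frozenset(("b", "d")): 60,
    frozenset(("c", "d")): 70,
}

chosen = pick_nodes(["a", "b", "c", "d"], 3, bandwidth)
print(chosen, bottleneck_bandwidth(chosen, bandwidth))  # ('a', 'c', 'd') 70
```

Any node set containing both a and b is limited to 5 Mbit/s by the one busy link, even though every other link is fast. Avoiding that pair raises the bottleneck to 70 Mbit/s, which is the kind of decision a network-aware application could make from measured network information.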

Effort sponsored by the Advanced Research Projects Agency and Rome Laboratory, Air Force Materiel Command, USAF, under agreement number F30602-96-1-0287. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon.


Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lowekamp, B., Miller, N., Sutherland, D., Gross, T., Steenkiste, P., Subhlok, J. (1999). Network-Aware Parallel Computing with Remos. In: Chatterjee, S., et al. Languages and Compilers for Parallel Computing. LCPC 1998. Lecture Notes in Computer Science, vol 1656. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48319-5_7

  • DOI: https://doi.org/10.1007/3-540-48319-5_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66426-0

  • Online ISBN: 978-3-540-48319-9

  • eBook Packages: Springer Book Archive
