Toward a Comprehensive Software Based DSM System

  • Chapter
New Horizons of Parallel and Distributed Computing

Abstract

Software-based Distributed Shared Memory (DSM) systems have been the focus of considerable research effort, primarily aimed at improving performance and consistency protocols. Unfortunately, computer clusters present a number of challenges for DSM systems that cannot be solved through consistency protocols alone. These challenges relate to the ability of a DSM system to adjust to load fluctuations, to handle computers being added to or removed from the cluster, to deal with faults, and to use DSM objects larger than the available physical memory. We present here a proposal for the Synergy Distributed Shared Memory System and its integration with the virtual memory, group communication and process migration services of the Genesis Cluster Operating System.

Copyright information

© 2005 Springer Science+Business Media, Inc.

About this chapter

Cite this chapter

Hobbs, M., Silcock, J., Goscinski, A. (2005). Toward a Comprehensive Software Based DSM System. In: Guo, M., Yang, L.T. (eds) New Horizons of Parallel and Distributed Computing. Springer, Boston, MA. https://doi.org/10.1007/0-387-28967-4_13

  • DOI: https://doi.org/10.1007/0-387-28967-4_13

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-24434-1

  • Online ISBN: 978-0-387-28967-0

  • eBook Packages: Computer Science, Computer Science (R0)
