Improving Speedup and Response Times by Replicating Parallel Programs on a SNOW

  • Conference paper
Job Scheduling Strategies for Parallel Processing (JSSPP 2004)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3277)

Abstract

Idle computation cycles of a shared network of workstations are increasingly being used to run batch parallel programs. For one common paradigm, the batch program task running on an idle workstation is preempted when the owner reclaims the workstation. This owner interference has a considerable impact on the execution time of a batch program, especially in the case of large parallel programs. Replication of batch program tasks has been used to reduce the impact of owner interference. We show analytically that replication can significantly improve parallel program speedup. Perhaps surprisingly, replication can also improve efficiency for certain workloads. We present analysis to quantify the amount of speedup and efficiency improvement. Furthermore, we provide analysis to help determine whether extra available workstations should be used for increasing job parallelism or for task replication.
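
The effect described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's analytical model: it assumes, purely for illustration, that each task needs a fixed amount of work and that owner interference adds an independent, exponentially distributed delay on each workstation. The names `job_time`, `mean_job_time`, `work`, and `mean_delay` are hypothetical. A task finishes when its fastest replica finishes, and the job finishes when its slowest task does.

```python
import random


def job_time(num_tasks, replicas, work, mean_delay, rng):
    """Completion time of one job under the assumed interference model.

    Each replica takes `work` plus an exponential delay (assumption).
    A task completes with its fastest replica; the job completes with
    its slowest task.
    """
    return max(
        min(work + rng.expovariate(1.0 / mean_delay) for _ in range(replicas))
        for _ in range(num_tasks)
    )


def mean_job_time(num_tasks, replicas, work=10.0, mean_delay=5.0,
                  trials=2000, seed=0):
    """Average job completion time over `trials` simulated runs."""
    rng = random.Random(seed)
    total = sum(
        job_time(num_tasks, replicas, work, mean_delay, rng)
        for _ in range(trials)
    )
    return total / trials


def speedup_and_efficiency(num_tasks, replicas, work=10.0, **kwargs):
    """Speedup relative to an interference-free sequential run, and
    efficiency per workstation used (num_tasks * replicas)."""
    t = mean_job_time(num_tasks, replicas, work=work, **kwargs)
    speedup = (num_tasks * work) / t
    return speedup, speedup / (num_tasks * replicas)


if __name__ == "__main__":
    for r in (1, 2, 3):
        s, e = speedup_and_efficiency(16, r)
        print(f"replicas={r}: speedup={s:.2f}, efficiency={e:.3f}")
```

Under these assumptions the mean completion time drops as replicas are added, because the maximum over tasks is taken over shorter per-task times. Whether efficiency also improves depends on the workload parameters, which is the question the paper's analysis addresses.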

This work was supported by the NSF grant ACI-9733658.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ghare, G.D., Leutenegger, S.T. (2005). Improving Speedup and Response Times by Replicating Parallel Programs on a SNOW. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2004. Lecture Notes in Computer Science, vol 3277. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11407522_15

  • DOI: https://doi.org/10.1007/11407522_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25330-3

  • Online ISBN: 978-3-540-31795-1

  • eBook Packages: Computer Science, Computer Science (R0)
