Improving Speedup and Response Times by Replicating Parallel Programs on a SNOW

Ghare, Gaurav D.; Leutenegger, Scott T.

doi:10.1007/11407522_15


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3277)

Included in the following conference series:

Workshop on Job Scheduling Strategies for Parallel Processing


Abstract

Idle computation cycles of a shared network of workstations are increasingly being used to run batch parallel programs. For one common paradigm, the batch program task running on an idle workstation is preempted when the owner reclaims the workstation. This owner interference has a considerable impact on the execution time of a batch program, especially in the case of large parallel programs. Replication of batch program tasks has been used to reduce the impact of owner interference. We show analytically that replication can significantly improve parallel program speedup. Perhaps surprisingly, replication can also improve efficiency for certain workloads. We present analysis to quantify the amount of speedup and efficiency improvement. Furthermore, we provide analysis to help determine whether extra available workstations should be used for increasing job parallelism or for task replication.
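The effect described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' analytical model. It assumes a toy setting in which a job of fixed total work is split evenly into tasks, each replica of a task is slowed by an independent, exponentially distributed owner delay, a task finishes when its fastest replica finishes, and the job finishes when its slowest task finishes. The function names and the parameters `work`, `tasks`, `replicas`, and `mean_delay` are hypothetical and chosen only for illustration.

```python
import random


def job_time(work, tasks, replicas, mean_delay, rng):
    """Draw one job completion time under the toy model (an assumption)."""
    demand = work / tasks  # processing demand of each task
    task_times = []
    for _ in range(tasks):
        # Each replica suffers an independent exponential owner delay.
        replica_times = [
            demand + rng.expovariate(1.0 / mean_delay) for _ in range(replicas)
        ]
        # A task is done as soon as its fastest replica is done.
        task_times.append(min(replica_times))
    # The job is done only when its slowest task is done.
    return max(task_times)


def mean_speedup(work, tasks, replicas, mean_delay, trials=2000, seed=0):
    """Estimate speedup as sequential work divided by mean job time."""
    rng = random.Random(seed)
    total = sum(
        job_time(work, tasks, replicas, mean_delay, rng) for _ in range(trials)
    )
    return work / (total / trials)


if __name__ == "__main__":
    for r in (1, 2, 3):
        print("replicas:", r, "estimated speedup:", round(mean_speedup(100.0, 10, r, 5.0), 2))
```

Under these assumptions, raising `replicas` shortens the expected delay of each task and therefore raises the estimated speedup, at the cost of occupying `tasks * replicas` workstations. The sketch does not attempt to reproduce the paper's efficiency results or its comparison between added parallelism and added replication.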

This work was supported by the NSF grant ACI-9733658.



Author information

Authors and Affiliations

Department of Computer Science, University of Denver, Denver, CO, 80208-0189, USA
Gaurav D. Ghare & Scott T. Leutenegger


Editor information

Editors and Affiliations

Department of Computer Science, The Hebrew University of Jerusalem
Dror G. Feitelson
Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
Larry Rudolph
No affiliation listed
Uwe Schwiegelshohn


About this paper

Cite this paper

Ghare, G.D., Leutenegger, S.T. (2005). Improving Speedup and Response Times by Replicating Parallel Programs on a SNOW. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2004. Lecture Notes in Computer Science, vol 3277. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11407522_15


DOI: https://doi.org/10.1007/11407522_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25330-3
Online ISBN: 978-3-540-31795-1
eBook Packages: Computer Science, Computer Science (R0)
