Abstract
Idle computation cycles of a shared network of workstations are increasingly being used to run batch parallel programs. In one common paradigm, a batch program task running on an idle workstation is preempted when the owner reclaims the workstation. This owner interference has a considerable impact on the execution time of a batch program, especially for large parallel programs. Replication of batch program tasks has been used to reduce the impact of owner interference. We show analytically that replication can significantly improve parallel program speedup. Perhaps surprisingly, replication can also improve efficiency for certain workloads. We present an analysis that quantifies the improvement in speedup and efficiency. Furthermore, we provide an analysis to help determine whether extra available workstations should be used to increase job parallelism or to replicate tasks.
This work was supported by NSF grant ACI-9733658.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ghare, G.D., Leutenegger, S.T. (2005). Improving Speedup and Response Times by Replicating Parallel Programs on a SNOW. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2004. Lecture Notes in Computer Science, vol 3277. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11407522_15
DOI: https://doi.org/10.1007/11407522_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25330-3
Online ISBN: 978-3-540-31795-1
eBook Packages: Computer Science, Computer Science (R0)