Abstract
In the past decade, large distributed systems built on unreliable hosts, including P2P systems and volunteer computing systems, have become common. The volatile nature of their resources makes it challenging to schedule tasks with soft deadlines in such systems. In this paper we examine one of the critical problems: estimating the deadline-miss probabilities of tasks running on unreliable hosts. Through analysis of trace data gathered from an actual volunteer computing system, we derive a general property of a host's period available fraction, on which we base an efficient method of estimating deadline-miss probability. To evaluate the accuracy of this method, we conduct trace-driven simulations, whose results show that the average absolute difference between the estimated probability and the real ratio is smaller than 2%. To compare our method with two other methods, we simulate a scheduler that distributes tasks based on the estimated probability. The results show that our method performs better.
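The abstract does not spell out the estimator, so the following is only an illustrative sketch and not the authors' method. It assumes that the "period available fraction" is the fraction of a deadline-length period during which a host is available, and that a task misses its deadline when that fraction falls below the share of the period its work requires. The function names, parameters, and sample values are all hypothetical.

```python
from typing import Sequence


def deadline_miss_probability(
    available_fractions: Sequence[float],
    work_seconds: float,
    deadline_seconds: float,
) -> float:
    """Empirical estimate of the probability that a task misses its deadline.

    available_fractions: hypothetical historical samples (each in [0, 1]) of
    the fraction of a deadline-length period during which the host was up.
    """
    if deadline_seconds <= 0:
        raise ValueError("deadline_seconds must be positive")
    if not available_fractions:
        raise ValueError("need at least one historical sample")
    # Assumption: the task finishes in time only if the host is available
    # for at least this fraction of the period.
    required_fraction = work_seconds / deadline_seconds
    misses = sum(1 for f in available_fractions if f < required_fraction)
    return misses / len(available_fractions)


def mean_absolute_difference(
    estimated: Sequence[float], observed: Sequence[float]
) -> float:
    """Average absolute gap between estimated probabilities and observed
    miss ratios, the kind of accuracy metric the abstract reports."""
    if len(estimated) != len(observed) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - o) for e, o in zip(estimated, observed)) / len(estimated)


if __name__ == "__main__":
    history = [0.2, 0.5, 0.8, 0.9]  # made-up samples for one host
    print(deadline_miss_probability(history, work_seconds=60, deadline_seconds=100))
```

A scheduler in the spirit of the one simulated in the paper could call `deadline_miss_probability` for each candidate host and prefer hosts with the lowest estimate; that policy is likewise an assumption here.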
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, D., Gong, B., Zhao, G. (2012). Estimating Deadline-Miss Probabilities of Tasks in Large Distributed Systems. In: Li, R., Cao, J., Bourgeois, J. (eds) Advances in Grid and Pervasive Computing. GPC 2012. Lecture Notes in Computer Science, vol 7296. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30767-6_22
DOI: https://doi.org/10.1007/978-3-642-30767-6_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30766-9
Online ISBN: 978-3-642-30767-6
eBook Packages: Computer Science (R0)