Estimating Deadline-Miss Probabilities of Tasks in Large Distributed Systems

Wang, Dongping; Gong, Bin; Zhao, Guoling

doi:10.1007/978-3-642-30767-6_22


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7296)

Included in the following conference series:

International Conference on Grid and Pervasive Computing

Abstract

In the past decade, large distributed systems built on unreliable hosts, including P2P systems and volunteer computing systems, have become common. The volatile nature of their resources makes it challenging to schedule tasks with soft deadlines in such systems. In this paper we examine one of the critical problems: estimating the deadline-miss probabilities of tasks running on unreliable hosts. Through analysis of trace data gathered from an actual volunteer computing system, we derive a general property of a host's period available fraction, on which we base an efficient method for estimating deadline-miss probability. To evaluate the accuracy of this method, we conduct trace-driven simulations, whose results show that the average absolute difference between the estimated probability and the real deadline-miss ratio is smaller than 2%. To compare our method with two other methods, we simulate a scheduler that distributes tasks based on the estimated probability. The results show that our method performs better.
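
The abstract does not spell out the estimation method, so the following is only an illustrative sketch and not the authors' algorithm. It assumes that "period available fraction" means the share of a time window during which a host is available, that a task needing `work` seconds of compute misses its deadline when the host is available for less than `work` seconds in the window, and that the host trace is a list of non-overlapping availability intervals. All function and parameter names are hypothetical. The second function shows the accuracy measure mentioned above, the average absolute difference between estimated probabilities and observed miss ratios.

```python
from typing import List, Sequence, Tuple

# Assumption: one availability interval is (start, end) in seconds.
Interval = Tuple[float, float]


def available_time(intervals: List[Interval], start: float, end: float) -> float:
    """Total time inside [start, end) during which the host was available."""
    total = 0.0
    for a, b in intervals:
        overlap = min(b, end) - max(a, start)
        if overlap > 0:
            total += overlap
    return total


def estimate_miss_probability(
    intervals: List[Interval],
    trace_end: float,
    work: float,
    deadline: float,
    step: float,
) -> float:
    """Empirical share of deadline-length windows whose available
    fraction is below work / deadline (hypothetical estimator)."""
    required_fraction = work / deadline
    misses = 0
    windows = 0
    start = 0.0
    # Slide a window of length `deadline` across the trace.
    while start + deadline <= trace_end:
        fraction = available_time(intervals, start, start + deadline) / deadline
        if fraction < required_fraction:
            misses += 1
        windows += 1
        start += step
    if windows == 0:
        raise ValueError("trace is shorter than the deadline")
    return misses / windows


def mean_absolute_difference(
    estimated: Sequence[float], observed: Sequence[float]
) -> float:
    """Average absolute gap between estimated probabilities and observed ratios."""
    if len(estimated) != len(observed) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - o) for e, o in zip(estimated, observed)) / len(estimated)


if __name__ == "__main__":
    # Toy trace: host is up during [0, 10) and [20, 30), observed for 40 s.
    trace = [(0.0, 10.0), (20.0, 30.0)]
    print(estimate_miss_probability(trace, 40.0, work=5.0, deadline=10.0, step=10.0))
    print(mean_absolute_difference([0.5, 0.2], [0.4, 0.3]))
```

The sliding-window count ignores checkpointing, host speed differences, and correlation between windows, so it should be read as a baseline for intuition rather than a reproduction of the paper's results.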

Author information

Authors and Affiliations

Department of Computer Science and Technology, ShanDong University, Jinan, China
Dongping Wang & Bin Gong
Shandong College of Electronic Technology, Jinan, China
Guoling Zhao

Editor information

Editors and Affiliations

School of Computer Science and Technology, Huazhong University of Science and Technology, 1037 Luoyu Road, 430074, Wuhan, China
Ruixuan Li
Department of Computing, Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, China
Jiannong Cao
University of Franche-Comte, FEMTO-ST, 1 cours Leprince-Ringuet, 25200, Montbéliard, France
Julien Bourgeois

Cite this paper

Wang, D., Gong, B., Zhao, G. (2012). Estimating Deadline-Miss Probabilities of Tasks in Large Distributed Systems. In: Li, R., Cao, J., Bourgeois, J. (eds) Advances in Grid and Pervasive Computing. GPC 2012. Lecture Notes in Computer Science, vol 7296. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30767-6_22

DOI: https://doi.org/10.1007/978-3-642-30767-6_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30766-9
Online ISBN: 978-3-642-30767-6
eBook Packages: Computer Science, Computer Science (R0)
