Near-Data Prediction Based Speculative Optimization in a Distribution Environment

  • Conference paper
Cloud Computing, Smart Grid and Innovative Frontiers in Telecommunications (CloudComp 2019, SmartGift 2019)

Abstract

Apache Hadoop is an open-source software framework that supports data-intensive distributed applications and is distributed under the Apache 2.0 license. On cloud platforms, consumers no longer deal with complex software and hardware configuration and only pay for cloud services on demand, so the performance of the cloud platform becomes more important in a consumer-centric environment. Slow tasks are unevenly distributed, and the resulting straggling tasks have a great influence on the Hadoop framework. Speculative execution (SE) is a solution for handling straggling tasks: it monitors the real-time progress of tasks and copies potential stragglers to a different node, aiming to finish those backup tasks before the original ones. At present, the performance of the Hadoop system is unsatisfactory because the current SE policy misjudges stragglers and selects inappropriate backup nodes. This paper proposes an optimized SE strategy based on near-data prediction. In the first step, the strategy gathers real-time task execution information and predicts the remaining runtime of each task with a local prediction method. In the second step, it chooses a proper backup node according to the near data and the actual demand. The strategy also includes a cost-effective model intended to bring SE to its peak performance. The results show that using this strategy in Hadoop effectively improves the accuracy of selecting alternative (backup) tasks and performs better in heterogeneous Hadoop environments in various situations, which is beneficial to both consumers and the cloud platform.
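The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not specify the local prediction method, the backup node criteria, or the cost model, so every name and parameter here (`Node`, `estimate_remaining`, `pick_backup_node`, `window`, `transfer_penalty`) is a hypothetical stand-in. The sketch assumes that remaining time is extrapolated from the progress rate over the most recent samples, that nodes holding a replica of the input data are preferred, and that a backup is launched only if it is expected to finish before the original task.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    name: str
    free_slots: int
    speed: float  # hypothetical relative speed, 1.0 = baseline node


def estimate_remaining(samples: Sequence[tuple[float, float]], window: int = 3) -> float:
    """Estimate remaining seconds from (timestamp, progress) samples.

    Assumption: only the last `window` samples are used, as a simple
    stand-in for a "local" prediction; progress is a fraction in [0, 1].
    """
    recent = list(samples)[-window:]
    if len(recent) < 2:
        return float("inf")  # not enough data to estimate a rate
    (t0, p0), (t1, p1) = recent[0], recent[-1]
    if t1 <= t0:
        return float("inf")
    rate = (p1 - p0) / (t1 - t0)  # progress per second in the window
    if rate <= 0:
        return float("inf")  # task is stalled
    return (1.0 - p1) / rate


def pick_backup_node(
    nodes: Sequence[Node],
    replica_holders: set[str],
    baseline_runtime: float,
    remaining: float,
    transfer_penalty: float = 1.5,
) -> Optional[Node]:
    """Return the node with the lowest estimated backup runtime, or None.

    Assumption: a node without a local replica pays a multiplicative
    `transfer_penalty`; a backup is worthwhile only if its estimated
    runtime is below the original task's remaining time.
    """
    best, best_time = None, remaining
    for node in nodes:
        if node.free_slots <= 0:
            continue  # no capacity for a backup task
        est = baseline_runtime / node.speed
        if node.name not in replica_holders:
            est *= transfer_penalty  # input data must be fetched remotely
        if est < best_time:
            best, best_time = node, est
    return best
```

With progress samples taken every 10 seconds at 10% per sample, `estimate_remaining` extrapolates the remaining 70% of the work from the recent rate, and `pick_backup_node` returns `None` whenever no free node is expected to beat that estimate, which is the point at which this sketch declines to speculate.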

M. Sun and X. Wu: both are first authors owing to equal contribution to this paper.

Acknowledgement

This work has received funding from 5150 Spring Specialists (05492018012, 05762018039), the Major Program of the National Social Science Fund of China (Grant No. 17ZDA092), the 333 High-Level Talent Cultivation Project of Jiangsu Province (BRA2018332), the Royal Society of Edinburgh, UK, and China Natural Science Foundation Council (RSE Reference: 62967_Liu_2018_2) under their Joint International Projects funding scheme, and the Basic Research Programs (Natural Science Foundation) of Jiangsu Province (BK20191398).

Author information

Corresponding author

Correspondence to Dandan Jin.

Copyright information

© 2020 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Sun, M., Wu, X., Jin, D., Xu, X., Liu, Q., Liu, X. (2020). Near-Data Prediction Based Speculative Optimization in a Distribution Environment. In: Zhang, X., Liu, G., Qiu, M., Xiang, W., Huang, T. (eds) Cloud Computing, Smart Grid and Innovative Frontiers in Telecommunications. CloudComp SmartGift 2019 2019. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 322. Springer, Cham. https://doi.org/10.1007/978-3-030-48513-9_9

  • DOI: https://doi.org/10.1007/978-3-030-48513-9_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-48512-2

  • Online ISBN: 978-3-030-48513-9

  • eBook Packages: Computer Science, Computer Science (R0)
