The Fault Tolerance of Big Data Systems

  • Conference paper
Management of Information, Process and Cooperation (MIPaC 2016)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 686))

Abstract

When the size of the data itself becomes part of the problem, the big data era is approaching. Big data technologies are a new generation of technologies and architectures designed to extract value economically from very large volumes of a wide variety of data by enabling high-velocity capture, discovery, and/or analysis. Fault tolerance is of great importance for big data systems, which may still contain software and hardware faults after their development. This paper introduces some popular applications and case studies of big data mining. The individual components of a big data architecture are parallel and distributed, covering distributed data processing, distributed storage, and distributed memory; the paper briefly introduces the Hadoop architecture as an example. It then presents recent fault tolerance work for big data systems, including batch computing, stream computing, Spark, and software-defined networks, which reflects considerable effort to strengthen the capability of massive big data systems, and compares these approaches with one another.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (project 61303094), by the Science and Technology Commission of Shanghai Municipality (16511102400), and by the Innovation Program of Shanghai Municipal Education Commission (14YZ024).

Author information

Corresponding author

Correspondence to Xing Wu.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wu, X., Du, Z., Dai, S., Liu, Y. (2017). The Fault Tolerance of Big Data Systems. In: Cao, J., Liu, J. (eds) Management of Information, Process and Cooperation. MIPaC 2016. Communications in Computer and Information Science, vol 686. Springer, Singapore. https://doi.org/10.1007/978-981-10-3996-6_5

  • DOI: https://doi.org/10.1007/978-981-10-3996-6_5

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-3995-9

  • Online ISBN: 978-981-10-3996-6

  • eBook Packages: Computer Science, Computer Science (R0)
