Erasure-Coded Byzantine Storage with Separate Metadata

  • Elli Androulaki
  • Christian Cachin
  • Dan Dobre
  • Marko Vukolić
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8878)

Abstract

Although many distributed storage protocols have been introduced, a solution that combines the strongest properties in terms of availability, consistency, fault-tolerance, storage complexity, and concurrency has been elusive so far. Combining these properties is difficult, especially if the resulting solution is required to be efficient and incur low cost.

We present AWE, the first erasure-coded distributed implementation of a multi-writer multi-reader read/write register object that is, at the same time: (1) asynchronous, (2) wait-free, (3) atomic, (4) amnesic (i.e., nodes store a bounded number of values), and (5) Byzantine fault-tolerant (BFT), using the optimal number of nodes. AWE maintains metadata separately from bulk data, which is encoded into fragments with a k-out-of-n erasure code and stored on dedicated data nodes that support only simple reads and writes. Furthermore, AWE is the first BFT storage protocol that uses only n = 2t + k data nodes to tolerate t Byzantine faults, for any k ≥ 1. Metadata, on the other hand, is stored using an atomic snapshot object, which may be realized from 3t + 1 metadata nodes for tolerating t Byzantine faults.
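
The node counts stated above can be made concrete with a small calculation. The following is only an illustrative sketch, not part of AWE itself: the function name `awe_node_counts` and the `storage_blowup` figure are assumptions introduced here. The blowup of n/k assumes that each of the n fragments has size 1/k of the original value, as is typical for a k-out-of-n erasure code, and it ignores metadata.

```python
def awe_node_counts(t: int, k: int) -> dict:
    """Node counts from the abstract for t Byzantine faults and a k-out-of-n code.

    Hypothetical helper for illustration; the names are not from the paper.
    """
    if t < 0 or k < 1:
        raise ValueError("need t >= 0 and k >= 1")
    n = 2 * t + k  # data nodes, as stated in the abstract: n = 2t + k
    return {
        "data_nodes": n,
        "metadata_nodes": 3 * t + 1,  # for the atomic snapshot object
        # Assumption: every fragment is 1/k of the value, so n fragments cost n/k.
        "storage_blowup": n / k,
    }


if __name__ == "__main__":
    for t, k in [(1, 1), (1, 2), (2, 4)]:
        print(t, k, awe_node_counts(t, k))
```

With k = 1 the formula gives n = 2t + 1 data nodes, each holding a full copy under the stated assumption; larger k lowers the per-value storage cost at the price of more data nodes.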

AWE is efficient and uses only lightweight cryptographic hash functions. Moreover, we show that hash functions are needed by any BFT distributed storage protocol that stores the bulk data on 3t or fewer data nodes.
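
The abstract does not spell out how the hash functions are used. One common pattern in erasure-coded BFT storage, sketched here purely as an assumption, is for the writer to keep a vector of fragment hashes alongside the metadata so that a reader can discard fragments altered by faulty data nodes. The names `cross_checksum` and `filter_valid` are hypothetical, and SHA-256 is an arbitrary choice of hash function.

```python
import hashlib


def cross_checksum(fragments: list[bytes]) -> list[str]:
    """Hash every fragment; the resulting vector would be kept with the metadata."""
    return [hashlib.sha256(f).hexdigest() for f in fragments]


def filter_valid(responses: dict[int, bytes], checksum: list[str]) -> dict[int, bytes]:
    """Keep only the fragments whose hash matches the stored checksum entry.

    `responses` maps a fragment index to the bytes returned by a data node.
    """
    return {
        i: frag
        for i, frag in responses.items()
        if 0 <= i < len(checksum)
        and hashlib.sha256(frag).hexdigest() == checksum[i]
    }


if __name__ == "__main__":
    fragments = [b"frag-0", b"frag-1", b"frag-2"]
    checksum = cross_checksum(fragments)
    # Assume the data node holding fragment 1 is faulty and returns garbage.
    responses = {0: b"frag-0", 1: b"forged", 2: b"frag-2"}
    print(sorted(filter_valid(responses, checksum)))
```

A reader following this pattern would then decode the value from any k fragments that pass the check.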

Keywords

Hash Function, Garbage Collection, Data Node, Erasure Code, Bulk Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Elli Androulaki (1)
  • Christian Cachin (1)
  • Dan Dobre (2)
  • Marko Vukolić (3, 4)

  1. IBM Research - Zurich, Rüschlikon, Switzerland
  2. Work done at NEC Labs Europe, Germany
  3. Department of Computer Science, ETH Zurich, Switzerland
  4. Eurécom, Sophia Antipolis, France
