Encyclopedia of Big Data Technologies

Living Edition
Editors: Sherif Sakr, Albert Zomaya

Storage Hierarchies for Big Data

Living reference work entry
DOI: https://doi.org/10.1007/978-3-319-63962-8_175-1

Motivation

Big data applications usually rely on a combination of storage media to strike an economic balance between the capabilities of the different media types and the needs of the application.

Application requirements vary significantly in data lifetime, number of concurrent data producers and consumers, ratio of active to passive data volume, sharing between parallel processing units, and the relative balance between CPU and I/O requirements.

Storage media properties vary by several orders of magnitude in price per capacity, latency for sequential and random access patterns, aggregate and single-stream bandwidth, power requirements, endurance, and reliability. Several methods exist to adapt these capabilities further by combining multiple storage devices of the same type, but larger, economically efficient setups are constructed by combining several different storage technologies.
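
The economic argument for mixing media can be illustrated with a small back-of-the-envelope calculation. The sketch below is not taken from the entry: the `Tier` class, the `blended` helper, and all prices and latencies are hypothetical placeholders chosen only to show the arithmetic. It assumes a simple model in which the price of a hierarchy is the capacity-weighted mean of the tier prices and the mean access latency is the access-weighted mean of the tier latencies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """One storage technology; all figures are hypothetical placeholders."""
    name: str
    price_per_tb: float  # arbitrary cost units per TB
    latency_ms: float    # mean access latency in milliseconds


def blended(placement):
    """Return (price per TB, mean latency in ms) of a storage hierarchy.

    `placement` is a list of (tier, capacity_fraction, access_fraction)
    tuples. Both sets of fractions must sum to 1.
    """
    cap_total = sum(cap for _, cap, _ in placement)
    acc_total = sum(acc for _, _, acc in placement)
    if abs(cap_total - 1.0) > 1e-9 or abs(acc_total - 1.0) > 1e-9:
        raise ValueError("capacity and access fractions must each sum to 1")
    price = sum(tier.price_per_tb * cap for tier, cap, _ in placement)
    latency = sum(tier.latency_ms * acc for tier, _, acc in placement)
    return price, latency


# Hypothetical media: fast but expensive versus slow but cheap.
flash = Tier("flash", price_per_tb=100.0, latency_ms=0.1)
disk = Tier("disk", price_per_tb=25.0, latency_ms=10.0)

# Assumed workload: 10% of the data is active and receives 90% of accesses.
hybrid_price, hybrid_latency = blended([(flash, 0.1, 0.9), (disk, 0.9, 0.1)])
disk_price, disk_latency = blended([(disk, 1.0, 1.0)])

print(f"disk only: {disk_price:.2f} per TB, {disk_latency:.2f} ms")
print(f"hybrid:    {hybrid_price:.2f} per TB, {hybrid_latency:.2f} ms")
```

With these assumed figures, the hybrid setup costs about 32.5 units per TB against 25 for disk alone, while the mean latency falls from 10 ms to about 1.09 ms. The result depends entirely on how skewed the access pattern is, which is why the ratio of active to passive data matters when sizing each tier.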

In addition, larger storage deployments typically provide services to more than one application and hence aim to...



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. Information Technology Department, CERN, Geneva, Switzerland

Section editors and affiliations

  • Bingsheng He
  • Behrooz Parhami (1)
  1. Dept. of Electrical and Computer Engineering, University of California, Santa Barbara, Santa Barbara, United States