Storage Hierarchies for Big Data
Big data applications usually have to rely on a combination of storage media to achieve an economic balance between the capabilities of different media types and application needs.
Application requirements vary significantly in data lifetime, number of concurrent data producers and consumers, ratio of active to passive data volume, sharing between parallel processing units, and the relative balance between CPU and I/O requirements.
Storage media properties vary by several orders of magnitude in price per capacity, latency for sequential and random access patterns, aggregate and single-stream bandwidth, power requirements, endurance, and reliability. Several methods exist to further adapt these capabilities by combining multiple storage devices of the same type, but larger, economically efficient setups are constructed by combining several different storage technologies.
In addition, larger storage deployments typically provide services to more than one application and hence aim to...
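The economic trade-off described above can be illustrated with a minimal sketch. It assumes a simplified two-tier model in which each medium is characterized only by a cost per terabyte and an aggregate bandwidth per terabyte deployed. The `Medium` class, the `cheapest_mix` function, and all numeric values are hypothetical placeholders chosen for illustration; they are not measurements and do not come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Medium:
    """Hypothetical storage medium; all figures are illustrative placeholders."""
    name: str
    cost_per_tb: float       # currency units per TB deployed
    bandwidth_per_tb: float  # MB/s of aggregate bandwidth per TB deployed


def cheapest_mix(fast: Medium, slow: Medium,
                 capacity_tb: float, required_bandwidth: float):
    """Return (fast_tb, slow_tb, total_cost) for a two-tier deployment.

    Assumes `fast` is both more expensive and higher-bandwidth per TB than
    `slow`, so the cheapest feasible mix uses as little fast media as possible
    while still meeting the capacity and aggregate bandwidth targets.
    """
    if fast.bandwidth_per_tb <= slow.bandwidth_per_tb:
        raise ValueError("expected the fast tier to offer more bandwidth per TB")

    # Bandwidth still missing if the whole capacity were built from slow media.
    shortfall = required_bandwidth - capacity_tb * slow.bandwidth_per_tb
    # Each TB moved from slow to fast gains this much aggregate bandwidth.
    gain_per_tb = fast.bandwidth_per_tb - slow.bandwidth_per_tb
    fast_tb = max(0.0, shortfall / gain_per_tb)

    if fast_tb > capacity_tb:
        # Bandwidth-bound case: even an all-fast system of the requested
        # capacity is too slow, so deploy extra fast capacity instead.
        fast_tb = required_bandwidth / fast.bandwidth_per_tb
        slow_tb = 0.0
    else:
        slow_tb = capacity_tb - fast_tb

    total_cost = fast_tb * fast.cost_per_tb + slow_tb * slow.cost_per_tb
    return fast_tb, slow_tb, total_cost


if __name__ == "__main__":
    # Placeholder figures only, not real prices or device specifications.
    flash = Medium("flash", cost_per_tb=100.0, bandwidth_per_tb=410.0)
    disk = Medium("disk", cost_per_tb=25.0, bandwidth_per_tb=10.0)
    print(cheapest_mix(flash, disk, capacity_tb=1000.0, required_bandwidth=50000.0))
```

In this simplified model, an application with a high bandwidth demand relative to its data volume pulls the mix toward the faster medium, while a mostly passive data set can stay almost entirely on the cheaper one. Real deployments would also need to account for the other properties listed above, such as latency, endurance, power, and reliability.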