Hadoop MapReduce Performance on SSDs: The Case of Complex Network Analysis Tasks

  • Conference paper
Advances in Big Data (INNS 2016)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 529)

Abstract

This article investigates the relative performance of solid-state drives (SSDs) versus hard disk drives (HDDs) when they are used as the underlying storage for Hadoop’s MapReduce. We examine MapReduce tasks and data suitable for the analysis of complex networks, which present different execution patterns. The results partly confirm earlier studies showing that SSDs are beneficial to Hadoop; they also provide solid evidence that the processing pattern of the running application plays a significant role.

Notes

  1. The MapReduce code, along with many experiments, can be found in the technical report at http://www.inf.uth.gr/~dkatsar/Hadoop-SSD-HD-for-SNA.pdf.

Acknowledgement

This work was supported by the Project “REDUCTION: Reducing Environmental Footprint based on Multi-Modal Fleet management System for Eco-Routing and Driver Behaviour Adaptation,” funded by the EU.ICT program, Challenge ICT-2011.7.

Corresponding author

Correspondence to Dimitrios Katsaros.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Bakratsas, M., Basaras, P., Katsaros, D., Tassiulas, L. (2017). Hadoop MapReduce Performance on SSDs: The Case of Complex Network Analysis Tasks. In: Angelov, P., Manolopoulos, Y., Iliadis, L., Roy, A., Vellasco, M. (eds) Advances in Big Data. INNS 2016. Advances in Intelligent Systems and Computing, vol 529. Springer, Cham. https://doi.org/10.1007/978-3-319-47898-2_12

  • DOI: https://doi.org/10.1007/978-3-319-47898-2_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47897-5

  • Online ISBN: 978-3-319-47898-2

  • eBook Packages: Engineering, Engineering (R0)
