On the Effects of Data-Aware Allocation on Fully Distributed Storage Systems for Exascale

  • Jose A. Pascual
  • Caroline Concatto
  • Joshua Lant
  • Javier Navaridas
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10659)

Abstract

The convergence of compute- and data-centric workloads and platforms poses new challenges for making the best use of the resources of modern computing systems. In this paper we show the need to enhance system schedulers so that they differentiate between compute- and data-oriented applications and thereby minimise interference between storage and application traffic. Such interference can be especially harmful in systems that combine fully distributed storage with unified interconnects, such as our custom-made architecture ExaNeSt. We analyse several data-aware allocation strategies and find that such strategies are essential to maintain performance in distributed storage systems.

Keywords

Near-data computing, Scheduling, Resource allocation

Notes

Acknowledgement

This work was funded by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 671553.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Jose A. Pascual (1)
  • Caroline Concatto (1)
  • Joshua Lant (1)
  • Javier Navaridas (1)
  1. Computer Science School, The University of Manchester, Manchester, UK
