On the Effects of Data-Aware Allocation on Fully Distributed Storage Systems for Exascale
Abstract
The convergence of compute- and data-centric workloads and platforms poses new challenges for making the best use of the resources of modern computing systems. In this paper we show the need to enhance system schedulers so that they differentiate between compute- and data-oriented applications and thereby minimise interference between storage and application traffic. Such interference can be especially harmful in systems that combine fully distributed storage with a unified interconnect, such as our custom-made architecture ExaNeSt. We analyse several data-aware allocation strategies and find that they are essential to maintain performance in distributed storage systems.
Keywords
Near-data computing, Scheduling, Resource allocation
Acknowledgement
This work was funded by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 671553.