Evaluation Through Realistic Simulations of File Replication Strategies for Large Heterogeneous Distributed Systems

  • Anchen Chai
  • Sorina Camarasu-Pop
  • Tristan Glatard
  • Hugues Benoit-Cattin
  • Frédéric Suter
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11339)


File replication is widely used to reduce file transfer times and improve data availability in large distributed systems. Replication techniques are often evaluated through simulations. However, most simulation platform models are oversimplified, which calls into question the applicability of the findings to real systems. In this paper, we investigate how platform models influence the performance of file replication strategies on large heterogeneous distributed systems, based on common existing techniques such as prestaging and dynamic replication. The novelty of our study lies in its evaluation with a realistic simulator. We consider two platform models: a simple hierarchical model and a detailed model built from execution traces. Our results show that the conclusions depend on how the platform is modeled and on the model's capacity to capture the characteristics of the targeted production infrastructure. We also derive recommendations for implementing an optimized data management strategy in a scientific gateway for medical image analysis.


File replication · Platform model · Realistic simulation · Evaluation



This work is partially supported by the LABEX PRIMES (ANR-11-LABX-0063) of Université de Lyon, within the program “Investissements d’Avenir” (ANR-11-IDEX-0007) operated by the French National Research Agency (ANR). The authors also thank EGI and France Grilles for their support and the provided resources.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Anchen Chai (1, 2)
  • Sorina Camarasu-Pop (1)
  • Tristan Glatard (4)
  • Hugues Benoit-Cattin (1)
  • Frédéric Suter (2, 3)
  1. Université de Lyon, CREATIS CNRS UMR5220, Inserm U1044, INSA-Lyon, Lyon, France
  2. IN2P3 Computing Center, CNRS, Lyon-Villeurbanne, France
  3. Inria, Lyon, France
  4. Department of Computer Science and Software Engineering, Concordia University, Montreal, Canada
