Abstract
PVFS is one of the most popular parallel distributed file systems and is still widely used today. Its current version is version 2, known as PVFS2. PVFS2 delivers leading I/O performance, but its reliability and stability are weaker, in part because it lacks data replication. This paper presents a new data replication scheme for PVFS2. In our approach, the backup operation is performed on the servers, so creating copies of files does not affect the user experience. In addition, we optimize the PVFS2 read operation: with replicas available, we can choose which servers to read from, and thus maintain read parallelism under complex conditions, such as when a server is down or when some servers carry a noticeably higher load than others. Experimental results verify the effectiveness and efficiency of our method.
Acknowledgments
We would like to thank the anonymous reviewers for helping us refine this paper; their constructive comments and suggestions were very helpful. This work was partly funded by the National Science and Technology Major Project of the Ministry of Science and Technology of China under grant 2011ZX05035-004-004HZ. The corresponding author is Jie Tang.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Bao, N., Tang, J., Zhang, X., Wu, G. (2015). A New Data Replication Scheme for PVFS2. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science, vol 9530. Springer, Cham. https://doi.org/10.1007/978-3-319-27137-8_35
DOI: https://doi.org/10.1007/978-3-319-27137-8_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27136-1
Online ISBN: 978-3-319-27137-8
eBook Packages: Computer Science, Computer Science (R0)