Abstract
This paper describes research on using Hadoop to develop applications. It introduces the structure of Hadoop and describes the implementation of algorithms in our library. Hadoop is a top-level Apache project, written in the Java programming language, that is built and used by a global community of contributors. Yahoo! has been the largest contributor to the project and uses Hadoop extensively across its businesses.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xu, J., Liang, J. (2012). Research on Distributed File System with Hadoop. In: Lei, J., Wang, F.L., Li, M., Luo, Y. (eds) Network Computing and Information Security. NCIS 2012. Communications in Computer and Information Science, vol 345. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35211-9_19
DOI: https://doi.org/10.1007/978-3-642-35211-9_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35210-2
Online ISBN: 978-3-642-35211-9
eBook Packages: Computer Science (R0)