Research on Distributed File System with Hadoop

Xu, JunWu; Liang, JunLing

doi:10.1007/978-3-642-35211-9_19

Part of the book series: Communications in Computer and Information Science (CCIS, volume 345)

Included in the following conference series:

International Conference on Network Computing and Information Security

Abstract

This paper describes research on using Hadoop to develop applications. It introduces the structure of Hadoop and describes the implementation of algorithms in our library. Hadoop is a top-level Apache project, written in the Java programming language, that is built and used by a global community of contributors. Yahoo! has been the largest contributor to the project and uses Hadoop extensively across its businesses.

Author information

Authors and Affiliations

Hubei Provincial Key Laboratory of Intelligent Robot, Wuhan Institute of Technology, 430073, China
JunWu Xu
School of Automation, Wuhan University of Technology, 430074, China
JunLing Liang

Editor information

Editors and Affiliations

School of Computer and Information Engineering, Shanghai University of Electric Power, 200090, Shanghai, China
Jingsheng Lei
Caritas Institute of Higher Education, 18 Chui Ling Road, Tseung Kwan O, Hong Kong, China
Fu Lee Wang
School of Computer Engineering, Computer Science Division, Nanyang Technological University (NTU), 50 Nanyang Avenue, Singapore
Mo Li
Department of Computer Science and Engineering, Shanghai Jiao Tong University, 200030, Shanghai, China
Yuan Luo

About this paper

Cite this paper

Xu, J., Liang, J. (2012). Research on Distributed File System with Hadoop. In: Lei, J., Wang, F.L., Li, M., Luo, Y. (eds) Network Computing and Information Security. NCIS 2012. Communications in Computer and Information Science, vol 345. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35211-9_19

DOI: https://doi.org/10.1007/978-3-642-35211-9_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35210-2
Online ISBN: 978-3-642-35211-9
eBook Packages: Computer Science, Computer Science (R0)