Abstract
Users of social network sites (SNS) produce files every day, which creates a large demand for file storage. HDFS is well suited to meeting this demand. However, files produced by SNS users are typically small, whereas HDFS is designed to store large files, so storing SNS files in HDFS directly is inefficient. This paper proposes a novel method for storing SNS files in HDFS by merging all files of the same user into a single file. The method is evaluated in experiments on files produced by SNS users and performs better than the original HDFS.
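The merging idea can be illustrated with a minimal sketch. It is not the authors' implementation, which the abstract does not describe. It assumes that one user's small files are concatenated into a single blob and that an index records each file's offset and length. The function names `merge_user_files` and `read_file` are hypothetical, and in-memory bytes stand in for HDFS storage.

```python
import io
from typing import Dict, Tuple

# Hypothetical index type: file name -> (offset, length) inside the merged blob.
Index = Dict[str, Tuple[int, int]]


def merge_user_files(files: Dict[str, bytes]) -> Tuple[bytes, Index]:
    """Concatenate one user's small files into a single blob.

    Returns the merged bytes and an index that maps each file name
    to its (offset, length) within the merged blob.
    """
    buf = io.BytesIO()
    index: Index = {}
    for name in sorted(files):  # deterministic order for reproducible offsets
        data = files[name]
        index[name] = (buf.tell(), len(data))
        buf.write(data)
    return buf.getvalue(), index


def read_file(merged: bytes, index: Index, name: str) -> bytes:
    """Recover a single original file from the merged blob via the index."""
    offset, length = index[name]
    return merged[offset:offset + length]
```

Under these assumptions, the storage system would track one large file per user rather than many small ones, and an individual file is read back by looking up its offset in the index.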
Acknowledgement
This research was supported by the National Natural Science Foundation (No. 61170113), P.R. China.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, G., Zuo, M., Liu, X., Xia, F. (2014). Improving the Efficiency of Storing SNS Small Files in HDFS. In: Yuan, Y., Wu, X., Lu, Y. (eds) Trustworthy Computing and Services. ISCTCS 2013. Communications in Computer and Information Science, vol 426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43908-1_20
DOI: https://doi.org/10.1007/978-3-662-43908-1_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-43907-4
Online ISBN: 978-3-662-43908-1
eBook Packages: Computer Science (R0)