Abstract
In this paper, we propose a fast and accurate method, based on the Regularized Extreme Learning Machine (RELM), for predicting the block-level operation type (write or read) and the transferred size. Such prediction is a key component of sustainable, green data centers. The proposed RELM-based method delivers competitive performance at a fast learning speed. Because the hidden-layer weights of RELM are random, the two prediction tasks can be unified into one, halving the training time. Experiments on SNIA data show that block-level operation type prediction reaches an accuracy of 99.04%, while transferred size prediction attains an NRMSE of 0.0234.
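The unification of the two tasks can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tanh activation, the hidden-layer size, the regularization constant `C`, the synthetic data, and the range-normalized NRMSE are all assumptions made for illustration. The sketch shows that, once the random hidden layer is fixed, a single ridge solve yields output weights for both the operation-type target and the transferred-size target.

```python
import numpy as np

def train_relm(X, T, n_hidden=50, C=1e3, seed=0):
    """Train a regularized ELM; T may hold several target columns."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = np.tanh(X @ W + b)                       # hidden-layer output matrix
    # Ridge solution for the output weights: beta = (H^T H + I/C)^{-1} H^T T
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ T)
    return W, b, beta

def predict_relm(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# Synthetic stand-in data (hypothetical, not the SNIA data).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
op = (X[:, 0] > 0).astype(float)        # assumed encoding: 1 = write, 0 = read
size = 0.5 * X[:, 1] + 0.1 * X[:, 2]    # assumed transferred-size target

# One training pass covers both tasks: stack the targets as two columns.
T = np.column_stack([op, size])
model = train_relm(X, T)
Y = predict_relm(model, X)

op_pred = (Y[:, 0] > 0.5).astype(float)
accuracy = (op_pred == op).mean()
# NRMSE normalized by the target range (one common convention, assumed here).
nrmse = np.sqrt(np.mean((Y[:, 1] - size) ** 2)) / (size.max() - size.min())

# Training the classifier alone with the same seed gives the same first column.
_, _, beta_op = train_relm(X, op[:, None])
```

Since `H` depends only on the inputs and the random weights, the matrix `H.T @ H + I/C` is built and factored once for both target columns, which is where the saving over two separate trainings comes from in this sketch.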
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Deng, WY., Su, C.L., Ong, YS. (2015). Access Behavior Prediction in Distributed Storage System Using Regularized Extreme Learning Machine. In: Cao, J., Mao, K., Cambria, E., Man, Z., Toh, KA. (eds) Proceedings of ELM-2014 Volume 2. Proceedings in Adaptation, Learning and Optimization, vol 4. Springer, Cham. https://doi.org/10.1007/978-3-319-14066-7_33
DOI: https://doi.org/10.1007/978-3-319-14066-7_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14065-0
Online ISBN: 978-3-319-14066-7
eBook Packages: Engineering (R0)