Abstract
Storage device performance prediction is a key element of self-managed storage systems and of application planning tasks such as data assignment and configuration. Building on the bagging ensemble method, we propose an algorithm named selective bagging classification and regression tree (SBCART) to model storage device performance. In addition, we include the caching effect as a feature in workload characterization. Experiments indicate that adding the caching effect to the feature vector substantially improves prediction accuracy, and that SBCART is more precise and more stable than CART.
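The abstract does not spell out how SBCART selects its trees, so the following is only a minimal sketch of the general idea of selective bagging over regression trees, not the authors' algorithm. It assumes a simple selection rule (keep the trees with the lowest error on a held-out validation set) and a synthetic workload in which each request is described by a hypothetical request size and a cache-hit flag. All function names, parameters, and data here are illustrative assumptions.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (the CART regression impurity)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(X, y, depth=3, min_leaf=2):
    """Fit a small CART-style regression tree.

    Returns a leaf value (number) or a tuple (feature, threshold, left, right).
    """
    if depth == 0 or len(y) < 2 * min_leaf or len(set(y)) == 1:
        return mean(y)
    best = None  # (impurity, feature index, threshold)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:
        return mean(y)
    _, f, t = best
    li = [i for i, row in enumerate(X) if row[f] <= t]
    ri = [i for i, row in enumerate(X) if row[f] > t]
    return (f, t,
            fit_tree([X[i] for i in li], [y[i] for i in li], depth - 1, min_leaf),
            fit_tree([X[i] for i in ri], [y[i] for i in ri], depth - 1, min_leaf))


def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def predict_ensemble(trees, row):
    """Average the predictions of all trees in the ensemble."""
    return mean([predict_tree(t, row) for t in trees])


def mse(trees, X, y):
    """Mean squared error of an ensemble on a data set."""
    return mean([(predict_ensemble(trees, row) - yi) ** 2 for row, yi in zip(X, y)])


def selective_bagging(X_train, y_train, X_val, y_val, n_trees=15, keep=5, seed=0):
    """Train n_trees on bootstrap samples, then keep the `keep` trees with the
    lowest validation error. This selection rule is an assumption."""
    rng = random.Random(seed)
    n = len(X_train)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample with replacement
        trees.append(fit_tree([X_train[i] for i in idx], [y_train[i] for i in idx]))
    ranked = sorted(trees, key=lambda t: mse([t], X_val, y_val))
    return ranked[:keep]


def make_workload(n, rng):
    """Synthetic requests: [size_kb, cache_hit] -> response time (made-up model)."""
    X = [[rng.uniform(4, 64), float(rng.random() < 0.5)] for _ in range(n)]
    y = [0.1 * size + (0.5 if hit else 8.0) + rng.gauss(0, 0.3) for size, hit in X]
    return X, y


data_rng = random.Random(1)
X_train, y_train = make_workload(120, data_rng)
X_val, y_val = make_workload(60, data_rng)
ensemble = selective_bagging(X_train, y_train, X_val, y_val)
print("validation MSE of selected ensemble:", mse(ensemble, X_val, y_val))
print("predicted time, cache hit :", predict_ensemble(ensemble, [32.0, 1.0]))
print("predicted time, cache miss:", predict_ensemble(ensemble, [32.0, 0.0]))
```

In this toy setup the cache-hit flag dominates the response time, which is one way to see why a caching feature could matter for prediction accuracy. Published selective-ensemble methods use more elaborate selection criteria than the simple ranking assumed here.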
Copyright information
© 2010 IFIP International Federation for Information Processing
About this paper
Cite this paper
Zhang, L., Liu, G., Zhang, X., Jiang, S., Chen, E. (2010). Storage Device Performance Prediction with Selective Bagging Classification and Regression Tree. In: Ding, C., Shao, Z., Zheng, R. (eds) Network and Parallel Computing. NPC 2010. Lecture Notes in Computer Science, vol 6289. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15672-4_11
DOI: https://doi.org/10.1007/978-3-642-15672-4_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15671-7
Online ISBN: 978-3-642-15672-4
eBook Packages: Computer Science, Computer Science (R0)