Machine Learning, Volume 101, Issue 1–3, pp 137–161

Selective switching mechanism in virtual machines via support vector machines and transfer learning



Virtualization is an essential technology in data centers, allowing a single machine to serve multiple applications or users. For memory virtualization, modern virtual machine memory managers take one of two approaches: shadow paging (SP) or hardware-assisted paging (HAP). Neither mode is always preferable, so previous studies have proposed exploiting the advantages of both by dynamically switching between the two paging modes based on on-the-fly system behavior. However, the existing scheme makes the switching decision using manual rules derived for a specific architecture. This paper employs a machine learning approach that learns a decision model automatically and can therefore adapt to different systems. Experimental results show that our switching mechanism matches or outperforms either SP or HAP alone, and that a machine learning-based decision model can match the performance of the hand-tuned model. We further show that different hardware/software settings affect on-the-fly system behavior and thus demand different decision models; our scheme yields two effective decision models on two different machines. Finally, we use transfer learning to train a model efficiently for a new hardware configuration when only a limited number of training samples from the new machine are available.
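As a rough illustration of the idea in the abstract, the sketch below trains a linear support vector machine that chooses between SP and HAP, then retrains it with a few up-weighted samples from a second machine. It is not the authors' implementation. The feature names (TLB miss rate, guest page table update rate), the sample values, the label encoding, and the instance-weighting form of transfer learning are all assumptions made for this example, and a plain subgradient trainer stands in for a full SVM library.

```python
# Hypothetical sketch: a weighted linear SVM that picks a paging mode.
# Feature names, sample values, and weights are illustrative assumptions.

def train_linear_svm(samples, labels, weights=None, lam=0.01, eta=0.1, epochs=200):
    """Minimise the weighted hinge loss plus an L2 penalty by subgradient descent.

    labels: +1 means "prefer HAP", -1 means "prefer SP" (assumed encoding).
    weights: per-sample importance; larger values make a sample count more.
    """
    n, dim = len(samples), len(samples[0])
    if weights is None:
        weights = [1.0] * n
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y, c in zip(samples, labels, weights):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # L2 regularisation shrinks the weight vector on every step.
            w = [wj * (1.0 - eta * lam) for wj in w]
            if margin < 1.0:
                # Sample is inside the margin: step along the hinge subgradient.
                w = [wj + eta * c * y * xj for wj, xj in zip(w, x)]
                b += eta * c * y
    return w, b


def choose_mode(model, x):
    """Return the paging mode suggested by the sign of the decision function."""
    w, b = model
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return "HAP" if score >= 0 else "SP"


# Assumed features per sample, both scaled to [0, 1]:
# (tlb_miss_rate, guest_page_table_update_rate)
source_x = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8)]
source_y = [-1, -1, 1, 1]

# Only a few labelled samples are assumed to exist for the new machine.
target_x = [(0.7, 0.1), (0.2, 0.6)]
target_y = [-1, 1]

# Decision model for the original machine.
source_model = train_linear_svm(source_x, source_y)

# Instance-weighting transfer: reuse the source samples at reduced weight
# and let the scarce target samples count more.
transfer_model = train_linear_svm(
    source_x + target_x,
    source_y + target_y,
    weights=[0.5] * len(source_x) + [2.0] * len(target_x),
)
```

Calling `choose_mode(transfer_model, (tlb_miss_rate, update_rate))` at each sampling interval would then give the mode to switch to. The sample weights control how strongly the new machine's few measurements override what was learned on the old one.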


Memory virtualization, Support vector machines, Transfer learning



We thank the anonymous reviewers and the editors for their constructive comments. We also thank Yingwei Luo, Xiaolin Wang, Lingmei Weng, and Jiarui Zang for their comments and help on this work. This work is supported in part by NSF Career CCF0643664 and the National Science Foundation of China Grant Nos. 61232008, 61272158, and 61328201.



Copyright information

© The Author(s) 2014

Authors and Affiliations

  1. Department of Computer Science, Michigan Technological University, Houghton, USA
