Abstract
The cloud operating system (cloud OS) manages cloud resources so that they can be used effectively and efficiently. It is also responsible for providing a convenient interface to users and applications. These two goals often conflict, because a convenient abstraction usually consumes more computing resources. The cloud OS therefore has its own characteristics of resource management and task scheduling for supporting various kinds of cloud applications. The evolution of the cloud OS is in fact driven by these two often conflicting goals, and finding the right tradeoff between them brings about each phase of that evolution. In this paper, we investigate the evolution of the cloud OS from three aspects: enabling technology evolution, OS architecture evolution, and cloud ecosystem evolution. We show that finding appropriate application programming interfaces (APIs) is critical for the next phase of cloud OS evolution: when APIs are chosen, convenient interfaces must be provided without sacrificing efficiency. We present an API-driven cloud OS practice that shows the capability of APIs for developing a better cloud OS and for helping to build and run a healthy cloud ecosystem.
Cite this article
Chen, ZN., Chen, K., Jiang, JL. et al. Evolution of Cloud Operating System: From Technology to Ecosystem. J. Comput. Sci. Technol. 32, 224–241 (2017). https://doi.org/10.1007/s11390-017-1717-z
DOI: https://doi.org/10.1007/s11390-017-1717-z