Advances in Big Data and Cloud Computing, pp. 135–145
Clustered Queuing Model for Task Scheduling in Cloud Environment
Abstract
With the advent of big data and the Internet of Things (IoT), the optimal task scheduling problem for heterogeneous multicore virtual machines (VMs) in the cloud environment has garnered greater attention from researchers around the globe. The queuing model used for optimal task scheduling must be tuned according to the interactions of the tasks with the cloud resources and the availability of cloud processing entities. Queue disciplines such as First-In First-Out, Last-In First-Out, Selection In Random Order, Priority Queuing, Shortest Job First, and Shortest Remaining Processing Time are well-known disciplines applied to handle this problem. We propose a novel queue discipline based on k-means clustering, called clustered queue discipline (CQD), to tackle the above-mentioned problem. Results show that CQD performs better than FIFO and priority queue models under high resource demand. The study shows that: in all cases, approximations to the CQD policies perform better than the other disciplines; randomized policies perform fairly close to the proposed one; and the performance gain of the proposed policy over the other simulated policies increases as the mean task resource requirement increases and as the number of VMs in the system increases. It is also observed that the time complexity of the clustering and scheduling policies is not optimal and hence needs to be improved.
Keywords
Cloud computing · Load balancing · Clustering · Queuing discipline

1 Introduction
The task scheduling problem in the cloud environment is a well-known NP-hard problem in which the queuing strategy adopted to schedule tasks plays a vital role. The existing queue disciplines follow strategies that are not well suited to cloud environments. Based on the extensive study carried out, it is found that a queue discipline best suited to the cloud environment should possess the following characteristics: multiple queues from which tasks can be quickly forwarded to appropriate nodes [1]; a single entry point for tasks, multiple VM buffers, and a final exit from the system after processing; computable joint probability distributions; and fair queuing with minimum task waiting time and maximum utilization of virtual machines (VMs).
The primary contribution of this paper is a novel queuing model proposed to further improve task scheduling performance in cloud data centers. The steps to derive the joint probability distribution based on the proposed CQD model are given in detail. The model is further compared with other disciplines, such as FIFO and priority queuing, to reveal the improved performance obtained by adopting CQD. Probability distribution measures are rarely computed for such complex real-time queuing systems [2].
2 Literature Study
A few of the well-known service disciplines in queuing models are First-In First-Out (FIFO), Last-In First-Out (LIFO), Selection In Random Order (SIRO), Priority, Shortest Job First (SJF), and Shortest Remaining Processing Time (SRPT). In cloud systems, the priority of jobs, job lengths, job interdependencies, processing capacity, and the current load on the VMs are the factors that decide the next task to be scheduled [3], whereas the above disciplines do not consider these factors.
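As a toy illustration of why the choice of discipline matters (the task lengths below are invented for illustration, not taken from the paper), consider the mean waiting time on a single VM under FIFO versus SJF ordering:

```python
# Hypothetical example: mean waiting time when the same task set is
# served back-to-back on one VM in FIFO order versus SJF order.
def mean_wait(service_times):
    """Mean waiting time when tasks run back-to-back in the given order."""
    wait, elapsed = 0.0, 0.0
    for s in service_times:
        wait += elapsed      # this task waits for everything scheduled before it
        elapsed += s
    return wait / len(service_times)

tasks = [8.0, 1.0, 2.0, 5.0]       # arbitrary task lengths
fifo = mean_wait(tasks)            # arrival order
sjf = mean_wait(sorted(tasks))     # Shortest Job First
print(fifo, sjf)                   # 7.0 3.0
```

SJF minimizes mean waiting time here, but like FIFO it ignores interdependencies and VM load, which is the gap the proposed discipline targets.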
In [4], the authors modeled the cloud center as a classic open network; they obtained the distribution of response time under the assumption of exponential inter-arrival and service time distributions. The response time distribution revealed the relationship between the maximal number of tasks and minimal resource requirements against the required level of service. Generally, the literature in this area elaborately discusses M/G/m queuing systems, as outlined in [5, 6, 7]. The response time distributions and queue-length analysis for M/G/m systems are insufficient for the cloud environment, as it requires two stages of processing in which workload characterization and characterized task allocation are a must. As solutions for the distribution of response time and queue length in M/G/m systems cannot be obtained in closed form, suitable approximations were sought in these papers. However, most of the above models lead to proper estimates of mean response time only under the constraint that few VMs are present [8]. Approximation errors are particularly large when the offered load is small and the number of servers m is large [9]. Hence, these results are not directly applicable to performance analysis of cloud computing environments, where the number of VMs is generally large and the service and arrival distributions are not known.
3 Clustered Queuing Model
As observed, the existing queuing models do not directly fit in cloud environment [10]. Hence, a suitable queuing model that best depicts the cloud scenario is developed. In M/G/∞ networks, the analysis of waiting time and response time distributions is already known and well established, but the determination of the joint distribution of the queue lengths at the various servers at the arrival epochs of a submitted task in those nodes presents an important problem. This paper is devoted to this problem. The following subsection discusses the proposed queueing model in detail.
3.1 Statistical Model
The focus is to derive a queuing model with the above characteristics in order to effectively schedule incoming task requests in cloud environment. Jockeying is allowed when load imbalance among VMs becomes high. Here, balking or reneging scenarios are not considered to maintain simplicity.
The notation indicates that arrivals follow a Markovian arrival distribution (M) and service follows a two-stage general service distribution (G2), with m VMs in an infinite-capacity system (∞). The model follows the proposed clustered queue discipline (CQD) as its queue discipline. Here, a general distribution means an arbitrary distribution with known E(x) and E(x²), and the service times are independent and identically distributed (IID) [11].
3.2 Performance Parameters
Generally, performance parameters involve terms such as server utilization, throughput, waiting time, response time, queue length, and the number of customers in the system at any given time [2]. Here, λ is the arrival rate; μ_{1} and μ_{2} are the clustering and scheduling service rates, respectively.
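As a minimal sketch, assuming illustrative values for λ, μ_{1}, μ_{2}, and the number of VMs m (none of these numbers are from the paper), the utilizations of the two service stages follow directly:

```python
# Illustrative parameter relations for the two service stages.
# All numeric values are assumptions chosen for the example.
lambda_ = 12.0   # task arrival rate (tasks/s)
mu1 = 20.0       # clustering service rate at the single first-stage server
mu2 = 5.0        # service rate of each of the m second-stage VMs
m = 4            # number of VMs

rho1 = lambda_ / mu1        # first-stage utilization (must be < 1 for stability)
rho2 = lambda_ / (m * mu2)  # aggregate second-stage utilization
print(rho1, rho2)           # 0.6 0.6
```

Both stages must remain below unit utilization for the system to be stable.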
3.3 Joint Probability Distribution for Tandem Queues in Parallel
We approach the model as two queuing phases in series. The first phase is considered to follow a single-server single-queue model with infinite service capacity, whereas the second phase involves tandem queues in parallel with a single global scheduler as the server with infinite capacity. As the first part is a well-known single-server single-queue model, it does not require any further investigation. The second phase of the model, with tandem queues in parallel, is of major concern.
The following notation, adapted from [12], is used to model the queuing discipline.
LQ_{1}, LQ_{2}, …, LQ_{k} are k queues in parallel; tasks arrive at the queues in Poisson fashion with arrival rate λ, and the service times at LQ_{1}, LQ_{2}, …, LQ_{k} are independent, identically distributed stochastic variables with distributions B_{1}(·), B_{2}(·), …, B_{k}(·) and first moments β_{1}, β_{2}, …, β_{k}. In the following, it will be assumed that B_{i}(0+) = 0 and β_{i} < ∞, i = 1, …, k.
On deriving the queue characteristics of the second phase of the model, we shall compound the results with the already known parameters of M/M/1 model [13].
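For reference, the first-phase M/M/1 quantities to be compounded are the standard textbook ones; the numeric values below are illustrative assumptions, not measurements from the paper:

```python
# Classical M/M/1 results for the first phase (textbook formulas;
# lambda_ and mu are assumed values for illustration).
lambda_, mu = 12.0, 20.0
rho = lambda_ / mu          # utilization
L = rho / (1 - rho)         # mean number in system
W = 1 / (mu - lambda_)      # mean response time
Wq = rho / (mu - lambda_)   # mean waiting time in queue
print(rho, L, W, Wq)        # 0.6 1.5 0.125 0.075
```

Note that L = λW, as required by Little's law, which serves as a quick consistency check on the compounded results.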
Now, the derivation steps for tandem queues in parallel are discussed below. Our approach initially considers two queues in parallel and then extends the results to k queues in parallel.
1. Determine a product-form expression for the joint stationary queue-length distribution of a submitted task at its arrival epochs at two queues of a general network of M/G/∞ queues;

2. Apply the PASTA property, which states that ‘Poisson Arrivals See Time Averages’ [14];

3. Decompose the queue-length term X_{m}(t) into independent terms corresponding to the position of a task at time instant 0.
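The product-form step rests on the classical M/G/∞ fact that the stationary number in system is Poisson with mean λβ for any service-time distribution with mean β. A quick Monte Carlo sanity check (the rates and the uniform service distribution are chosen purely for illustration):

```python
import random

# Monte Carlo check of the classical M/G/infinity fact: the stationary
# number in system is Poisson with mean lambda*beta, regardless of the
# shape of the service distribution. Parameters are illustrative.
def mg_inf_count(lam, beta, T, rng):
    """Number of tasks still in service at time T in an M/G/inf queue."""
    t, busy = 0.0, 0
    while True:
        t += rng.expovariate(lam)        # Poisson arrivals with rate lam
        if t > T:
            return busy
        s = rng.uniform(0, 2 * beta)     # a 'general' service time, mean beta
        if t + s > T:                    # task still in service at time T
            busy += 1

rng = random.Random(1)
samples = [mg_inf_count(lam=4.0, beta=1.5, T=50.0, rng=rng) for _ in range(4000)]
mean_len = sum(samples) / len(samples)
print(mean_len)   # close to lambda*beta = 6.0
```

The empirical mean queue length approaches λβ = 6.0, independent of the service distribution's shape, which is what makes the product form across parallel queues tractable.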
In Fig. 2, let x_{1}(t) and x_{2}(t) denote the queue lengths of LQ_{1} and LQ_{2} at time t, such that x_{1}(t) = l_{1} and x_{2}(t) = l_{2}. Let σ_{1}^{(1)}, σ_{2}^{(1)}, …, σ_{l1}^{(1)}, σ_{1}^{(2)}, σ_{2}^{(2)}, …, σ_{l2}^{(2)} denote the residual service times of the tasks in service, i.e., the remaining service time required by each task to complete. Hence, (x_{1}(t), x_{2}(t), σ_{1}^{(1)}, σ_{2}^{(1)}, …, σ_{l1}^{(1)}, σ_{1}^{(2)}, σ_{2}^{(2)}, …, σ_{l2}^{(2)}) is evidently a Markov process.
Theorem
Proof
Assume that n tasks arrive in the interval (0, t), where \(n \ge l_{1} + l_{2}\). It is a classical fact that, given \(n\) arrivals of a Poisson process in (0, t), the joint probability distribution of the arrival epochs agrees with the joint distribution of \(n\) independent random points distributed uniformly in (0, t).
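This uniformity property can be checked numerically (all parameters are illustrative): conditioned on exactly n arrivals in (0, t), the arrival epochs should behave like n i.i.d. uniform points on (0, t), so their sample mean should be near t/2.

```python
import random

# Numerical check: conditioned on n arrivals in (0, t), Poisson arrival
# epochs are distributed as n i.i.d. uniform points on (0, t).
# lam, t, n are assumed values for illustration.
def arrival_epochs(lam, t, rng):
    """Arrival times of a Poisson(lam) process restricted to (0, t)."""
    epochs, s = [], 0.0
    while True:
        s += rng.expovariate(lam)
        if s > t:
            return epochs
        epochs.append(s)

rng = random.Random(7)
lam, t, n = 3.0, 10.0, 30
picked = []
while len(picked) < 500:              # keep only runs with exactly n arrivals
    e = arrival_epochs(lam, t, rng)
    if len(e) == n:
        picked.extend(e)
mean_epoch = sum(picked) / len(picked)
print(mean_epoch)   # close to t/2 = 5.0 under the uniformity property
</imports>```

The conditioning-by-rejection loop is wasteful but makes the conditional distribution explicit; the sample mean lands near t/2 = 5.0, as the uniformity property predicts.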
Letting \(t \to \infty\) and rearranging the integrals, we obtain Eq. 6. The argument can be extended to show that the limiting distribution is independent of the initial distribution.
Remarks
The above equation gives the joint stationary distribution of the two local queues considered. On extending the above argument to n arbitrary local queues, we can arrive at the final joint distribution. The above scenario of n different classes of parallel queues is simulated, and analysis is given in the following section.
4 Experimental Analysis
The CQD policy is analyzed against other well-known policies such as the priority and FIFO mechanisms. Here, M/G/m parallel queues are considered for experimentation. The existing literature [16, 17] deals with performance analysis of queuing systems based mainly on mean response time, which is highly critical in the cloud environment for providing the necessary quality of service (QoS) [18].
4.1 Performance Metrics
4.2 Analysis and Discussion
Amortized analysis. Not every one of the n tasks takes the same amount of time. The basic idea in CQM is to do substantial ‘prework’ by clustering a priori. This pays off because, as a result of the prework, the scheduling operation can be carried out so fast that a total time of O(g(n)) is not exceeded, where g(n) is a sublinear function of the n tasks. The investment in the prework thus amortizes itself.
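A toy sketch of this amortization idea follows; the code is illustrative and not the paper's implementation, and the one-dimensional bucketing below is a deliberately simplified stand-in for k-means:

```python
from collections import deque

# Illustrative sketch: cluster tasks by resource demand once up front
# (the 'prework'), so each scheduling decision is a cheap O(1) pop
# from the appropriate cluster queue.
def cluster_tasks(tasks, k):
    """Simplified 1-D bucketing of task demands into k cluster queues
    (a stand-in for k-means, for illustration only)."""
    lo, hi = min(tasks), max(tasks)
    width = (hi - lo) / k or 1.0         # guard against all-equal demands
    queues = [deque() for _ in range(k)]
    for d in tasks:
        idx = min(int((d - lo) / width), k - 1)
        queues[idx].append(d)
    return queues

queues = cluster_tasks([1, 9, 2, 8, 3, 7], k=2)
# Scheduling is now an O(1) pop from the chosen cluster queue:
light = queues[0].popleft()
print(light)   # 1, a task from the low-demand cluster
```

The O(n) clustering pass is paid once, after which each of the n scheduling decisions is constant-time, which is the amortization argument made above.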
5 Conclusion and Future Work
This paper outlines the need for an efficient queuing model best suited to cloud computing. A novel method involving a clustering technique is proposed. The derivation steps of the queuing model are outlined and validated against the existing queues-in-series derivation. Analytical discussion demonstrates the efficiency of the method. The proposed work is found to perform better than existing disciplines such as FIFO and priority in situations of high resource requirement and when a large number of VMs are present. Major future work shall be devoted to applying the model in real time and to mathematically deriving and analyzing its efficiency in terms of energy complexity.
Acknowledgements
We acknowledge Visvesvaraya PhD scheme for Electronics and IT, DeitY, Ministry of Communications and IT, Government of India’s fellowship grant through Anna University, Chennai for their support throughout the working of this paper.
References
1. Boxma, O.J.: M/G/∞ tandem queues. Stoch. Process. Appl. 18, 153–164 (1984)
2. Sztrik, J.: Basic Queueing Theory. University of Debrecen (2012)
3. Buyya, R., Sukumar, K.: Platforms for Building and Deploying Applications for Cloud Computing, pp. 6–11. CSI Communication (2011)
4. Xiong, K., Perros, H.: Service performance and analysis in cloud computing. In: Proceedings of the 2009 Congress on Services—I, Los Alamitos, CA, USA, pp. 693–700 (2009)
5. Ma, B.N.W., Mark, J.W.: Approximation of the mean queue length of an M/G/c queueing system. Oper. Res. 43, 158–165 (1998)
6. Miyazawa, M.: Approximation of the queue-length distribution of an M/GI/s queue by the basic equations. J. Appl. Probab. 23, 443–458 (1986)
7. Yao, D.D.: Refining the diffusion approximation for the M/G/m queue. Oper. Res. 33, 1266–1277 (1985)
8. Tijms, H.C., Hoorn, M.H.V., Federgruen, A.: Approximations for the steady-state probabilities in the M/G/c queue. Adv. Appl. Probab. 13, 186–206 (1981)
9. Kimura, T.: Diffusion approximation for an M/G/m queue. Oper. Res. 31, 304–321 (1983)
10. Vilaplana, J., Solsona, F., Teixidó, I., Mateo, J., Abella, F., Rius, J.: A queuing theory model for cloud computing. J. Supercomput. 69(1), 492–507 (2014)
11. Boxma, O.J., Cohen, J.W., Huffels, N.: Approximations of the mean waiting time in an M/G/s queueing system. Oper. Res. 27, 1115–1127 (1979)
12. Kleinrock, L.: Queueing Systems: Theory, vol. 1. Wiley-Interscience, New York (1975)
13. Adan, I.J.B.F., Boxma, O.J., Resing, J.A.C.: Queueing models with multiple waiting lines. Queueing Syst. Theory Appl. 37(1), 65–98 (2011)
14. Wolff, R.W.: Poisson arrivals see time averages. Oper. Res. 30, 223–231 (1982)
15. Cohen, J.W.: The multiple phase service network with generalized processor sharing. Acta Informatica 12, 245–284 (1979)
16. Khazaei, H., Misic, J., Misic, V.: Performance analysis of cloud computing centers using M/G/m/m+r queuing systems. IEEE Trans. Parallel Distrib. Syst. 23(5) (2012)
17. Slothouber, L.: A model of web server performance. In: Proceedings of the Fifth International World Wide Web Conference (1996)
18. Yang, B., Tan, F., Dai, Y., Guo, S.: Performance evaluation of cloud service considering fault recovery. In: Proceedings of the First International Conference on Cloud Computing (CloudCom’09), pp. 571–576 (2009)
19. Borst, S., Boxma, O.J., Hegde, N.: Sojourn times in finite-capacity processor-sharing queues. In: Next Generation Internet Networks, IEEE, pp. 55–60 (2005)