A Two-Stage Queue Model for Context-Aware Task Scheduling in Mobile Multimedia Cloud Environments

  • Durga S
  • Mohan S
  • J. Dinesh Peter
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 645)


Multimedia cloud is an emerging computing paradigm that can process media services effectively and provide adequate quality of service (QoS) for multimedia applications from anywhere, on any device, at low cost. However, mobile clients still cannot fully exploit these services owing to their intrinsic limitations, such as limited battery life, disconnection, and mobility. In this paper, we propose a context-aware task scheduling algorithm that efficiently allocates suitable resources to clients. A queuing-based system model is presented together with a heuristic resource allocation scheme. Simulation results show that the proposed solution outperforms state-of-the-art approaches.


Keywords: Mobile cloud · Multimedia cloud · Queuing model · Cuckoo search · Resource provisioning · Context awareness

1 Introduction

During the past decade, hosting multimedia applications in the cloud has become increasingly attractive to both individuals and organizations. Multimedia cloud computing [1] has recently given rise to novel services such as cloud-based photo and video editing, photo and video sharing, online gaming, and video surveillance. A key problem in the multimedia cloud is coping with the diverse and continuously changing mobile context while fulfilling the QoS demands of users. In this paper, we tackle this problem with an adaptive resource provisioning algorithm comprising a context-aware request scheduler and a resource allocator. Queuing theory is used to model the proposed system, and a cuckoo search [2] based optimization algorithm is presented to allocate suitable physical resources for each request.

The rest of the paper is organized as follows: Related work is reviewed in Sect. 2. Section 3 describes the model description, mathematical formulation, and an optimized resource allocation algorithm. Section 4 presents the insight into implementation, performance evaluation, and discussion. Finally, Sect. 5 concludes the paper with future directions.

2 Related Work

Recent years have seen various proposals for frameworks and techniques for dynamic resource provisioning (DRP) in multimedia cloud environments. We identify two common DRP techniques that merit special attention: SLA-based RP (SBRP) and deadline-based RP (DBRP). The main aim of SBRP [3, 4] is to satisfy the SLAs that the cloud provider has agreed with cloud users regarding the quantitative functional and non-functional terms of the service being offered, whereas in DBRP [5, 6], the deadline for application completion, the time left until the deadline, and the average execution time of the tasks that compose an application determine the number of resources the application requires. In 2013, Park et al. [7] presented a two-phase mobile device group creation mechanism for resource provisioning in the mobile cloud for big data applications, where groups are created for n cut-off points of entropy values. In 2014, Sood et al. [8] discussed an adaptive model in which an independent authority predicts and stores resources using an ANN; however, the processing time and the communication cost of the independent authority are not considered. Song et al. [9] presented a two-stage resource allocation strategy for handling multimedia tasks in the cloud, but focused only on minimizing response time and cost. In our previous work [10], cuckoo-based optimized resource allocation was tested.

In the above-mentioned RP techniques, the priority of the workloads generated by end users is decided based on the memory, time, or other resource requirements of the workload [11]; i.e., the workloads are handled by considering the workload size, the deadline of each task composing the workload, the available resources, and other management objectives. However, the local context information that affects service provision does not influence the resource management decisions of the cloud provider.

3 System Model

3.1 Proposed System Architecture

The proposed context-aware task scheduling and allocation model, shown in Fig. 1, originates from the traditional client–server model, with the mobile device acting as service consumer. The model consists of two queue stages: the request handler (RH) queue and the media processing server (MS) queues. The RH queue schedules requests to the corresponding pool of media servers. Each pool of media servers is assigned an MS queue responsible for optimal allocation of resources. In a multimedia cloud, resource requests for media tasks are translated into virtual machine (VM) resource requests, which in turn are mapped onto suitable physical servers hosting the VMs. Most clouds are built in the form of data centers. In this model, the data center is organized into large, medium, and small media server pools according to compute, storage, memory, and bandwidth capacities, all managed by the resource provisioner unit. When a mobile client uses the cloud, multimedia requests are sent to the data center and stored in the RH queue. The requests are embedded with user context information, as shown in Fig. 1.
Fig. 1

Proposed resource allocation model
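The two-stage structure described above can be sketched as follows; the class, method, and pool names are illustrative assumptions, not taken from the paper:

```python
from collections import deque

class TwoStageScheduler:
    """Minimal sketch: a single request-handler (RH) queue feeding
    three media-server (MS) pool queues (large/medium/small)."""

    def __init__(self):
        self.rh_queue = deque()  # single entry point for all requests
        self.ms_queues = {"large": deque(), "medium": deque(), "small": deque()}

    def submit(self, request):
        """Requests arrive at the RH queue first."""
        self.rh_queue.append(request)

    def dispatch(self, classify):
        """Drain the RH queue, routing each request to an MS pool
        chosen by the `classify` callback."""
        while self.rh_queue:
            req = self.rh_queue.popleft()
            self.ms_queues[classify(req)].append(req)

# illustrative usage: route by task size
sched = TwoStageScheduler()
sched.submit({"task": "transcode", "size_mb": 900})
sched.submit({"task": "thumbnail", "size_mb": 5})
sched.dispatch(lambda r: "large" if r["size_mb"] > 500 else "small")
```

In the paper's model the `classify` step is context-aware rather than size-based; the callback is only a placeholder for that decision.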

In our proposed scheme, the client context includes the cell ID (Cell_id), connection quality (C_T), and mobile battery level (B_s), which are used to make request-routing decisions. We assume that the connection quality falls into one of two contexts: context C1 represents a Wi-Fi connection, and context C2 represents a 3G/4G connection. We assume that B_s takes one of three values, as shown in Fig. 1.
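Algorithm 1 is not reproduced in this version of the text, so the sketch below shows only one plausible prioritization rule over the stated contexts; the mapping of battery level and connection quality to high/medium/low priority is our assumption:

```python
def priority(connection, battery_level):
    """Label a request from client context (illustrative rule).

    connection    : "C1" (Wi-Fi) or "C2" (3G/4G), per the two contexts
    battery_level : "low", "medium", or "high" (the three B_s values)
    """
    # Assumption: clients in a critical context (low battery or a
    # cellular link) are served urgently before they disconnect.
    if battery_level == "low" or connection == "C2":
        return "high"
    if battery_level == "medium":
        return "medium"
    return "low"
```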

3.2 Problem Formulation

The VM requests are analyzed by the RH based on the clients' context information and labeled as high-, medium-, or low-priority requests. These requests are then scheduled to the corresponding MS queue, which maps the tasks to resources while minimizing the response time experienced by the client and the cost of running the resources in the cloud. The model has a single entry point, the RH, represented as an M/M/1 queue with arrival rate λ and service rate µ modeled as exponential random variables, where λ < µ. The main purpose of the RH is to schedule requests to the MS based on the criticality of the client context information, which is periodically sent by the client's context API. The RH uses Algorithm 1 to prioritize requests based on this information. The service time of the RH queue is assumed to be exponentially distributed with mean service time \( \mu^{-1} \), where µ is the scheduling capacity of the RH.

The response time of the RH queue is given by \( T_{RH} = \frac{1/\mu}{1 - \lambda/\mu} \). Since there is no request loss in the previous stage, the arrival of priority-tagged requests at the three MS queues also follows a Poisson process, and the service time is assumed to be exponentially distributed with mean service time \( MS_{i}^{-1} \). In this paper, we assume that the probability p_i of a request being sent to each of the three MS_i is randomly generated. Thus, each of the three MS queues is modeled as an M/M/1 queuing system. By the decomposition property of the Poisson process, the probability p_i of directing request tasks with priority i to the queue of MS_i, where i = 1, 2, 3, determines the arrival rate at each MS queue.


Hence, the mean arrival rate is p_i λ. The response time over the computing servers is \( T_{CS} = \sum\nolimits_{i=1}^{3} \frac{1/MS_{i}}{1 - p_{i}\lambda/MS_{i}} \). After processing at the MS, the service results are sent back to the customers. The average time a customer spends in the system (W) is defined as
$$ W = T_{RH} + T_{CS} = \frac{1/\mu}{1 - \lambda/\mu} + \sum\nolimits_{i=1}^{3} \frac{1/MS_{i}}{1 - p_{i}\lambda/MS_{i}} $$

The mean server utilization is derived as ρ = λe/µ, where λe is the mean arrival rate. We also derive P0, the probability that there are no customers in the system, as P0 = 1 − ρ, and Pn, the probability that there are n customers in the system, as Pn = ρ^n P0.
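The queueing quantities above can be checked numerically. The sketch below evaluates T_RH, T_CS, W, ρ, P0, and Pn for illustrative arrival and service rates; all numeric values are assumptions, not the paper's settings:

```python
def rh_response_time(lam, mu):
    """T_RH = (1/mu) / (1 - lam/mu) for the M/M/1 request handler."""
    assert lam < mu, "stability requires lambda < mu"
    return (1 / mu) / (1 - lam / mu)

def ms_response_time(lam, p, ms_rates):
    """T_CS = sum_i (1/MS_i) / (1 - p_i*lam/MS_i) over the MS queues."""
    return sum((1 / m) / (1 - pi * lam / m) for pi, m in zip(p, ms_rates))

lam, mu = 400.0, 600.0       # arrival / RH service rate (assumed)
p = [0.2, 0.3, 0.5]          # routing probabilities p_i (assumed)
ms = [300.0, 250.0, 400.0]   # MS_i service rates (assumed)

W = rh_response_time(lam, mu) + ms_response_time(lam, p, ms)
rho = lam / mu               # mean server utilization
P0 = 1 - rho                 # probability of an empty system
Pn = lambda n: rho**n * P0   # probability of n customers in the system
```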

We assume that load-balanced resource allocation is achieved when P0 < ρ < Pn. The total cost is calculated according to the resources utilized over time, which include the resources at the request handler queue and the media compute servers. The total cost is derived as
$$ C = \left( \phi_{1} RH + \phi_{2} \sum\nolimits_{i = 1}^{n} MS_{i} \right)t $$
where RH and MS_i are the service rates of the request handler and the allocator at time t, and \( \phi_{1} \) and \( \phi_{2} \) are the costs of the RH and MS per request, respectively.
The transmission time is calculated as
$$ T = \text{Task size}/\text{DR} $$
where DR is the data rate of the mobile device at the time the job is submitted to the cloud.
The turnaround time (TAT) is calculated as
$$ TAT = W + T_{Mobile}^{Cloud} + T_{Cloud}^{Mobile} $$
where \( T_{Mobile}^{Cloud} \) and \( T_{Cloud}^{Mobile} \) are the transmission times from the mobile device to the cloud and from the cloud output back to the mobile device, respectively.
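As a numeric illustration of the turnaround time, the sketch below combines W with the two transmission legs; units are assumed consistent (e.g., task sizes in Mb and data rates in Mb/s), and all values are illustrative:

```python
def transmission_time(task_size, data_rate):
    """T = task size / DR (units assumed consistent)."""
    return task_size / data_rate

def turnaround(W, size_up, size_down, uplink_rate, downlink_rate):
    """TAT = W + T(mobile->cloud) + T(cloud->mobile)."""
    t_up = transmission_time(size_up, uplink_rate)        # mobile -> cloud
    t_down = transmission_time(size_down, downlink_rate)  # cloud -> mobile
    return W + t_up + t_down

tat = turnaround(W=1.0, size_up=10, size_down=5,
                 uplink_rate=2, downlink_rate=5)
```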


The objective function of the optimization problem is as follows:
$$ F(x) = \max \sum\nolimits_{i = 1}^{n} \left( x\rho - yW - zC \right) $$
Subject to
$$ TAT \le Deadline $$
$$ P_{0} < \rho < P_{n} $$
$$ \Psi \left( C_{i}(t) \right), \Psi \left( M_{i}(t) \right), \Psi \left( B_{i}(t) \right) \ge 0, \quad \forall \ \text{media server } i \text{ at time } t $$
$$ \Psi \left( C_{i}(t) \right) \ge rC_{j}, \quad \Psi \left( M_{i}(t) \right) \ge rM_{j}, \quad \Psi \left( B_{i}(t) \right) \ge rB_{j}, \quad \forall \ \text{media server } i \text{ at time } t $$
where \( \Psi \left( C_{i}(t) \right) \), \( \Psi \left( M_{i}(t) \right) \), \( \Psi \left( B_{i}(t) \right) \) are the percentages of free capacity, memory, and bandwidth resources on media server i, and rC_j, rM_j, rB_j are the corresponding resource requirements of VM request j at time t.
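A minimal sketch of the admission test implied by these constraints, together with one term of the weighted objective; the dictionary keys, weights, and numeric values are illustrative assumptions:

```python
def feasible(free, req, tat, deadline):
    """A media server can host a VM request only if the deadline holds
    and its free capacity ('cpu'), memory ('mem'), and bandwidth ('bw')
    percentages all cover the request's requirements."""
    if tat > deadline:
        return False
    return all(free[k] >= req[k] >= 0 for k in ("cpu", "mem", "bw"))

def objective_term(rho, W, C, x=1.0, y=0.5, z=0.1):
    """One summand of F(x) = max sum(x*rho - y*W - z*C)."""
    return x * rho - y * W - z * C

ok = feasible({"cpu": 40, "mem": 30, "bw": 50},
              {"cpu": 10, "mem": 20, "bw": 25},
              tat=7.5, deadline=10.0)
```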

3.3 Cuckoo-Based Resource Allocation Algorithm

Once the requests are scheduled to the corresponding MS, the optimal allocation of resources is performed with the cuckoo search-based allocation algorithm (CSA) described in Algorithm 2. CSA is a meta-heuristic algorithm inspired by the obligate interspecific brood parasitism of some cuckoo species, which lay their eggs in the nests of other host birds [2]. In this paper, we assume that a task reaching an MS pool is processed locally and is not migrated to another MS. At each MS server pool, the CSA maps the VM task to a suitable physical server. Here, the cuckoo nests correspond to the physical resources, the cuckoo to the MS, and the cuckoo's egg to the newly arrived task. The algorithm follows three rules:
  1. Each cuckoo lays one egg at a time and dumps it in a randomly chosen nest.

  2. The best nests with high-quality solutions are carried over to the next generations.

  3. The number of available nests is fixed, and a host can discover an alien egg with a probability p_a ∈ [0, 1].


The generation of new solutions is performed using Lévy flights [2].
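Algorithm 2 is not reproduced here, so the following is only a simplified stand-in: a generic cuckoo search over server indices, using Mantegna's method for Lévy-distributed step lengths and a toy fitness that prefers the server with the most free capacity. The parameters (nest count, p_a, iteration budget) are assumptions:

```python
import math
import random

def levy_step(beta=1.5):
    """Levy-distributed step length via Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2) /
             (math.gamma((1 + beta) / 2) * beta *
              2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = random.gauss(0, sigma)
    v = random.gauss(0, 1)
    return u / abs(v) ** (1 / beta)

def cuckoo_search(fitness, n_servers, n_nests=15, pa=0.25, iters=100):
    """Pick the server index that maximizes `fitness`."""
    nests = [random.randrange(n_servers) for _ in range(n_nests)]
    best = max(nests, key=fitness)
    for _ in range(iters):
        # new solution (egg) via a Levy flight around the current best
        cand = int(best + levy_step()) % n_servers
        j = random.randrange(n_nests)
        if fitness(cand) > fitness(nests[j]):
            nests[j] = cand                      # replace a random nest
        # abandon a fraction pa of the worst nests (alien eggs found)
        nests.sort(key=fitness)
        for k in range(int(pa * n_nests)):
            nests[k] = random.randrange(n_servers)
        best = max(nests + [best], key=fitness)
    return best

random.seed(1)
free = [0.2, 0.9, 0.5, 0.7]   # toy free-capacity fractions per server
best = cuckoo_search(lambda s: free[s], n_servers=len(free))
```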

4 Performance Evaluation

This section presents a simulation-based performance study of the proposed algorithm. Our simulation setup has two major components: a workload generator and a cloud simulator. The workload generator produces the workload with arbitrarily included mobile context information, such as mobile energy level, connection quality, cell ID, and the time at which the context information was gathered. Google cluster traces from a cell of 12,000 machines over roughly a month in May 2011 were used as the VM requests [12].

The simulation was performed using the discrete event simulator CloudSim 3.0.3 executed on NetBeans. Both proposed algorithms were implemented in the simulated environment. Table 1 shows the characteristics of the resources and the workload used in all experiments. The scheduling probabilities for the three MS servers are set to P = {0.2, 0.3, 0.4}. The RH server is charged at $0.12 per request. The resource cost constraint is set to $50.
Table 1

Simulation parameters



Parameter                          MS large          MS medium         MS small
Mean request arrival rate          500–600 requests/h (all pools)
Bandwidth (B/S)                    —                 —                 —
Size of workloads                  15000 MB          10000 MB          7000 MB
No. of PEs per machine             4 (40000 MIPS)    3 (30000 MIPS)    2 (1000–20000 MIPS)
Cost per workload                  —                 —                 —
Memory size (MB)                   —                 —                 —
Cloud workload output size (MB)    300 + (10–50%)    300 + (10–50%)    300 + (10–50%)

We first compare the proposed adaptive resource provisioning scheme (ARPM), in which the resources for the RH queue and MS queues are optimally allocated by solving the optimization problem, against a FIFO-based equal resource allocation scheme, in which requests are scheduled under a FIFO policy, and a deadline-based allocation scheme. Figure 2 compares the mean service response time of the proposed algorithm with the two state-of-the-art algorithms; the proposed scheme achieves a much lower response time than the other two. We next evaluate the percentage of deadlines met, the resource cost, and the server utilization under different request arrival rates; the results are shown in Figs. 3, 4, and 5, respectively. The resource cost and the percentage of deadline violations for various workload tasks under ARPM are 30–40% lower than under the other two approaches, in which critical mobile clients may face severe deadline violations. This time deviation is relatively substantial, and the variation in average resource utilization is also noticeable.
Fig. 2

Average service response time versus arrival rate

Fig. 3

Comparison of the percentage of requests that met the deadline

Fig. 4

Comparison of the customer satisfaction level

Fig. 5

Resource utilization versus arrival rate

5 Conclusion

This paper proposes an adaptive resource provisioning method for enhancing the quality of user experience. In addition to the regular workload parameters, the proposed model uses client context information to account for the diverse mobile cloud parameters that affect cloud resource allocation. We model the system with a queuing model to capture the relationship between service response time, cost, and server utilization, and solve the resource allocation problem with a cuckoo-based algorithm. The experiments evaluate and analyze ARPM in comparison with two other algorithms. In future work, the model can be improved to adapt to context changes.


  1. Dinh, H.T., Lee, C., et al.: A survey of mobile cloud computing: architecture, applications and approaches. Wirel. Commun. Mob. Comput. 13(18), 1587–1611 (2013)
  2. Yang, X.S., Deb, S.: Engineering optimization by cuckoo search. Int. J. Math. Model. Numer. Optim. 1, 330–343 (2010)
  3. Garg, S.K., Gopalaiyengar, S.K., Buyya, R.: SLA-based resource provisioning for heterogeneous workloads in a virtualized cloud datacenter. In: IEEE ICA3PP 2011, Melbourne, Australia (2011)
  4. Buyya, R., Garg, S.K., et al.: SLA-oriented resource provisioning for cloud computing: challenges, architecture, and solutions. In: Proceedings of the IEEE International Conference on Cloud and Service Computing (2011)
  5. Vecchiola, C., et al.: Deadline-driven provisioning of resources for scientific applications in hybrid clouds with Aneka. Future Gener. Comput. Syst. 58–65 (2012)
  6. Calheiros, R.N., Vecchiola, C., et al.: The Aneka platform and QoS-driven resource provisioning for elastic applications on hybrid clouds. Future Gener. Comput. Syst. (2012)
  7. Park, J., Kim, Y.S., Jeong, E.: Two-phase grouping-based resource management for big data processing in mobile cloud. Int. J. Commun. Syst. (2013)
  8. Sood, S.K., et al.: Matrix based proactive resource provisioning in mobile cloud environment. Simul. Model. Pract. Theory (2014)
  9. Song, B., et al.: A two stage approach for task and resource management in multimedia cloud environment. Computing 98, 119–145 (2016)
  10. Durga, S., Mohan, S., et al.: Cuckoo based resource allocation for mobile cloud environments. Comput. Intell. Cyber Secur. Comput. Models 412, 543–550 (2016)
  11. Jennings, B., Stadler, R.: Resource management in clouds: survey and research challenges. J. Netw. Syst. Manag. 1–53 (2014)
  12. Wilkes, J., Reiss, C.: Details of the ClusterData-2011-1 trace. Online (2011)

Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. Karunya University, Coimbatore, India
  2. CCIS, Al Yamamah University, Riyadh, Saudi Arabia
