Keywords

1 Introduction

Low resource utilization is a common issue in cloud platforms. Reiss et al. [5] shows that a Google cluster achieves CPU utilization of 25–35% and memory utilization of 40%. Quasar [3] indicates that the CPU utilization is consistently below 20%, and the memory utilization is (40–50%) on a production cluster at Twitter. Fluctuation of resource consumption and complex heterogeneous environment bring much more challenges to resource allocation in cloud cluster. It is difficult to match resource allocation precisely with resource consumption, and resources are usually over allocated to guarantee task execution.

In this paper, we propose PDRM, a resource management scheme based on task resource consumption probability distribution. The main idea of PDRM is to adopt the probability distribution of resource consumption to quantify the fluctuation of resource consumption for accurate resource allocation, so as to improve resource utilization and reduce task running time. We evaluate PDRM, and results show that it improves job execution efficiency and resource utilization.

2 Related Work

Several recent researches have tried to address the issue of improving efficiency in allocating resources to applications with varying degree of success. (1) Dynamic Resource Provisioning. Mohan et al. [4] have proposed dynamic resource management solutions for applications. These research works are used for resource management of long term services, but not suitable for batch workloads. However the batch workload, which consists of a large number of short tasks usually completed in minutes or seconds, is too important to be ignored. (2) Resource Provisioning with an Appropriate Configuration. CherryPick [1] builds a performance model with Bayesian Optimization to distinguish the optimal or a near-optimal configuration from the rest. MrMoulder [2] adopts optimization technique to tuning Hadoop configuration parameter settings. They mostly focus on improving of application performance and pay less attention to the resource utilization of the cluster. (3) Harvesting Idle Resource from Colocated Jobs. Zhang et al. [6] schedules related batch tasks on servers to colocate with latency-critical jobs. The main idea is harvesting idle resource from other jobs, but it can’t harvest idle resources from the job itself. (4) Characterizing and Classifying Workloads. Quasar [3] classifies any new incoming application, assign the application proper resources in a datacenter. Classification techniques cannot fully reflect the differences in resource consumption of various tasks.

3 Motivation

Existing research assumes that the resource consumption of the task is the same as the historical consumption. However, the resource consumption of repeated tasks will fluctuate instead of being absolutely the same. The main reason for fluctuations in resource consumption is that the complexity of the algorithm on different input data content is different. It may cause deviations in resource consumption prediction. Therefore, we have conducted in-depth research on the similarity of resource consumption. We repeat running a variety of different batch workloads with different data sets. We extract resource consumption for all types of tasks when the application is running. It shows that the resource consumption of the same tasks on the same cluster node are similar, and the resource consumption fluctuates within a certain range. By counting the number of tasks in different resource consumption intervals, we can obtain the task resource consumption probability distribution. The probability distribution of resource consumption is in accordance with the Gaussian distribution, and the Gaussian distribution is fitted well.

4 PDRM Design

By extracting resource consumption of big data applications, we can get the task consumption probability distribution. Based on the distribution function of resource consumption, we propose an accurate resource allocation scheme, called PDRM.

We use \(Task_{0}\) to denote a type of task, the resource allocation vector of \(Task_{0}\) is expressed as \([ra_{10}, ra_{20}, ..., ra_{m0}]^{T}\) and the resource consumption vector of \(Task_{0}\) is expressed as \([rc_{10}, rc_{20}, ..., rc_{m0}]^{T}\), where m is the total number of the types of resource, \(ra_{i0}(i \in N,1 \le i \le m)\) is the amount of the class i resource allocated to \(Task_{0}\) and \(rc_{i0}(i \in N,1 \le i \le m)\) is the class i resource consumed by \(Task_{0}\). We perform a Gaussian fitting on the class i resource consumption of \(Task_{0}\), and the Gaussian distribution satisfied by \(rc_{i0}\) is expressed as \(N_{io}(\mu _{io}, \sigma _{io}^{2})\), where \(\mu _{io}\) is the mean of \(rc_{i0}\) and \(\sigma _{io}^{2}\) is the variance of the probability distribution of \(rc_{i0}\). The cumulative distribution function of the probability distribution of \(rc_{i0}\) is

$$\begin{aligned} F_{i0}(x) = \int _{-\infty }^x\dfrac{1}{\sqrt{2\pi \sigma _{i0}^{2}}}e^{-\dfrac{(x-\mu _{i0})^{2}}{2\sigma _{i0}^{2}}}dx. \end{aligned}$$
(1)

The probability that \( rc_{i0} \) is less than \( ra_{i0} \) can be denoted as \( P_{i0} = F_{i0}(ra_{i0})\). The success ratio of \( Task_{0} \) can be expressed as \( min\{P_{10}, P_{20}, ..., P_{m0}\}\). Conversely, only when the resource allocated for the class i resource \( ra_{i0} \) is not less than \( F_{i0}^{-1}(P_{\text {success}})\), the success rate of \( Task_{0} \) could reach \( P_{\text {success}} \).

The resource allocation of class i resource for \( Task_j \) is \( ra_{ij}\). \( P_{\text {success}} \) is the probability that \( Task_j \) can be successfully completed. The failed task will be restarted with the default resource allocation which is much larger than the actual consumption of the task to ensure successful execution. The Expectation resource allocation of class i resource for \( Task_j \) expressed as \(E(ra_{ij}) = ra_{ij} + (1-P_{\text {success}})ra_{ij\_default}\), where \( ra_{ij\_default} \) is the default resource allocation. The average resource utilization of class i resource on a node is \( \overline{ut_i} = \dfrac{\sum _{j=1}^n\mu _{ij}}{\sum _{j=1}^nE(ra_{ij})}. \)

When the derivative of \( \overline{ut_i} \) is 0, \( \overline{ut_i} \) takes the maximum value. By solving the formula \( (\overline{ut_i})' = 0\), we can get the solution of optimal resource utilization and set the values of \( P_{\text {success}} \). Then, we get the resource allocation vector for each type of task.

5 Evaluation

We implement PDRM as a component on Hadoop Yarn. In this section, we demonstrate the effectiveness of our approach on a heterogeneous cluster.

We choose four representative applications on Hadoop to show different resources requirement: Terasort, WordCount, TextSearch, and TriangleOfOriented. We select 6 physical nodes to build a heterogeneous environment, named NODE0-NODE5. NODE0 is the master node, NODE1-NODE5 are the slave nodes. NODE0-NODE3 have the same physical configuration (two Intel Xeon E2620 6x cores 2.1 GHz CPUs, 16 GB memory). The number of virtual cores available for container allocation on each node is 8. As a comparative instance of CPU heterogeneity, NODE4 has two Intel Xeon E5620 4x cores 2.4 GHz CPUs. As a comparative instance of Memory heterogeneity, the memory available for container allocation on NODE5 is 3 GB, while the memory available on other nodes are 8 GB.

Fig. 1.
figure 1

Job completion time.

5.1 Job Completion Time of Heterogeneous Applications

In this experiment, we evaluate the effectiveness of PDRM for reducing job completion time. Figure 1 shows the job completion times with different resource allocation schemes. It can be observed that PDRM reduces the job completion time by 30.4%, 24.3%, 25.1%, 24.7% compared to the default for Terasort, WordCount, TextSearch, TriangleOfOrineted, respectively. PDRM resource allocation schemes can effectively reduce job completion time.

5.2 Resource Allocation Ratio and Resource Consumption Ratio of Heterogeneous Cluster

In this experiment, we compare the resource allocation ratio and the resource consumption ratio of different nodes in heterogeneous clusters under the default and PDRM. We normalize the node available resources to 1. We run Terasort on the heterogeneous cluster, the resource allocation ratio and resource consumption ratio of NODE1, NODE4 and NODE5 are shown in Figs. 2 and 3, respectively. It can be seen that the resource allocation ratio of PDRM is less than the that of the default, but the resource consumptions ratio of PDRM are greater that of the default.

Fig. 2.
figure 2

CPU resource ratio.

Fig. 3.
figure 3

Memory resource ratio.

The performance of NODE1 and NODE4 are limited by the CPU resources. With the PDRM resource allocation scheme, the CPU resources of NODE1 and NODE4 can achieve higher utilization. The CPU processing power of NODE4 is lower than that of NODE1. The NODE4 CPU can maintain high utilization, but the NODE4 memory resource utilization is less than NODE1. The performance of NODE5 is limited by the memory resources. NODE5 has less memory resources than NODE1. With PDRM resource allocation scheme, the memory resources of NODE5 can achieve higher utilization, far greater than that of NODE1. PDRM can improve resource utilization, and the scarce resources of nodes can be efficiently utilized in heterogeneous clusters.

6 Conclusion

The resource consumption probability distribution of the task can well describe the fluctuation of resource consumption. We propose PDRM, a resource allocation scheme based on the probability distribution of task resource consumption. Through experimental verification, PDRM can reduce job completion time by over 25%. What’s more, PDRM can minimize the gap between resource allocation and resource consumption, and make efficient use of scarce resources in heterogeneous clusters.