
1 Introduction

Over the past several years, cloud computing and networking technologies have grown rapidly. Cloud computing can be described as a set of scalable services hosted online that users access via the Internet. Specifically, cloud services are services made available to users on demand over a network from data centers operated by cloud computing providers [1]. Cloud service providers have developed a wide range of cloud services to cater for the information systems of diverse business organizations. Many providers, such as Amazon, Google, IBM and Microsoft, currently offer business organizations a variety of cloud-based services for which users pay based on subscription and usage.

Commonly, there are three cloud service models: Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS). These models offer different capabilities to an organization. Cloud computing services can be deployed in three different ways: as a public, private or hybrid cloud. With the growth of cloud computing, many organizations have started to offer cloud services to various consumers based on their functional and non-functional requirements [2].

The RightScale 2017 State of the Cloud Report [3], a survey by the cloud computing service provider RightScale, uncovered trends in cloud adoption. The survey of 1002 technical professionals from various organizations showed that many respondents were adopting, or planning to adopt, cloud computing in their organizations. The company compared the number of users adopting cloud across its surveys from 2015 to 2017, and the results showed that the number had been increasing. This trend suggests that the number of people adopting cloud services will continue to increase. The report mentions four categories of cloud adoption, namely Cloud Watchers, Beginners, Explorers and Focused.

One of the reasons cloud computing attracts many new potential users is the quality of the services offered over the Internet. An organization that is considering migrating its business information system to the cloud needs to decide which cloud service provider offers the best service performance. However, choosing the best provider is not a simple task because of the huge number of available providers. The selection process is time-consuming because it involves evaluating a large number of cloud service providers one by one. This has become one of the motivations for the development of cloud service recommender systems. Previous works related to such systems include a clustering-based recommendation system [4], a review of cloud recommendation [5], and a real-time QoWS-based recommendation system [6]. A recommendation system should propose the best cloud service provider not only for each specific quality of service (QoS) parameter, but also for overall performance. This means that it should recommend the provider with the best overall performance based on a combination of several QoS parameters. In this work, we propose a cloud computing provider evaluation and recommendation model that can propose the best provider based on several QoS parameters. We limit our work so that the model evaluates only three QoS parameters.

Moreover, we argue that the QoS performance of cloud computing services is unpredictable and highly uncertain. The main reasons for this uncertainty are cloud computing characteristics such as dynamic elasticity, loose coupling and unstable virtualization performance [7,8,9]. Hence, obtaining exact knowledge of the QoS performance of any cloud computing provider is a challenging task, if not an impossible one. We therefore adopt a fuzzy logic approach in our model because of the technique’s ability to handle uncertainty. In this work, we collected real QoS performance data from three cloud computing providers. The three QoS parameters evaluated are downlink, uplink and latency. The data were clustered so that the performance of the providers could be categorized into several clusters. Then, a fuzzy inference system (FIS) model was developed based on the clustering outputs. The performance of the model was evaluated in terms of accuracy and compared with a non-fuzzy (crisp) model. The performance measurement showed that our proposed model performed better than the crisp model.

The objectives of this paper are to describe the formulation of the model using the fuzzy logic technique and to present the performance measurements conducted. The paper is structured as follows. Sect. 2 presents the related works, Sect. 3 describes the formulation of the model, Sect. 4 discusses and analyzes the results, and Sect. 5 concludes our work and outlines some future work.

2 Related Works

Garg et al. propose a framework for ranking cloud service providers using the Analytical Hierarchical Process (AHP) [10]. The framework, named SMICloud, uses a standardized method for measuring and comparing business services known as the Service Measurement Index (SMI), developed by the Cloud Service Measurement Index Consortium (CSMIC). They design metrics that can reliably measure cloud service performance. This approach extracts all the qualitative values and proceeds in four phases. Firstly, QoS attributes are arranged hierarchically according to the standard. Secondly, users assign a weight to each attribute according to their preferences for the cloud selection process. Thirdly, the authors handle attributes that do not have numerical values by using a relative ranking matrix. Lastly, they combine the relative ranking scheme with the weights assigned earlier.

Another related work is CloudRank, which introduces a framework for ranking prediction that evaluates the service providers chosen in that research [11]. The ranking prediction is based on the QoS of cloud providers as observed at the client side. When a cloud application user requests a ranking prediction from CloudRank, the system first computes the similarity between the current user and the past users of the ranking prediction. It then identifies the similar users and uses them to make the ranking prediction. Two ranking prediction algorithms are proposed in this work, namely CloudRank1 and CloudRank2. In this approach, the QoS values of the service providers need to be measured before the services are compared. To address the problem of incorrect rankings based on predicted QoS values, both ranking-oriented approaches were designed to predict the QoS ranking precisely without predicting the corresponding QoS values.

Mamoun and Ibrahim propose a framework for an IaaS provider selection system that combines many clouds with a service processing unit [2]. The service processing unit, which consists of a provision unit, a ranking unit and a reservation unit, aims to satisfy cloud users based on their QoS requirements. They improve on past research by providing the ranking and reservation units in their system. The system produces a result showing the best IaaS provider based on the preferences and numeric weights set by the users for QoS attributes such as accountability, agility, assurance, cost, performance and security.

Most of the past related works evaluate cloud computing providers based on cost, accountability and security. In contrast, our work focuses mainly on QoS performance, namely evaluating the best provider in terms of uplink, downlink and latency. Our work also differs in its use of a fuzzy logic approach, which aims to increase the accuracy of the system when uncertainty occurs.

Fuzzy logic was introduced by Zadeh in 1965 [12]. It is one of the soft computing techniques used in artificial intelligence, and it is based on four constructs, namely fuzzy sets, membership functions, logical operations and if-then rules [13]. Fuzzy logic is a rule-based model that uses mathematics to assign truth values. Specifically, fuzzy logic uses a continuous range of truth values in [0, 1] instead of the two values of Boolean logic (true or false). If-then rules have been widely used in many applications, for example simple temperature control for air conditioning, washing machine timing control and ship steering control. One of the main strengths of fuzzy logic is that it can deal with uncertainty and vagueness [14]. It does so through the fuzziness of its membership functions: during evaluation, an input is given a membership degree in the range [0, 1], instead of exactly 0 or 1 as in a crisp technique.

In this work, it was difficult to set the boundaries of the clusters that determine whether the performance of a cloud computing provider is good, moderate or poor. This is because network performance is highly uncertain, so it is unrealistic to determine the clusters’ boundary values crisply. Fuzzy logic handles this issue by providing fuzzy boundaries, where the boundaries of neighboring clusters may overlap [15, 16].
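The difference between a crisp boundary and a fuzzy boundary can be illustrated with a small sketch. The threshold, center and width values below are hypothetical and chosen only for illustration; they are not taken from the data sets used in this work.

```python
import math

def crisp_good(latency_ms, threshold=100.0):
    """Crisp set: a value is either fully inside (1.0) or fully outside (0.0)."""
    return 1.0 if latency_ms <= threshold else 0.0

def fuzzy_good(latency_ms, center=80.0, width=30.0):
    """Fuzzy set: a Gaussian membership degree in the range [0, 1]."""
    return math.exp(-((latency_ms - center) ** 2) / (2.0 * width ** 2))

# Two nearly identical measurements on either side of the crisp threshold
for value in (99.0, 101.0):
    print(value, crisp_good(value), round(fuzzy_good(value), 3))
```

With the crisp set, the two measurements fall into different categories although they differ by only 2 ms, whereas their fuzzy membership degrees remain close to each other.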

3 Formulation of the Model

The tasks involved in formulating the proposed model are shown in Fig. 1. The process began with the preparation of the data sets. The data sets were then used in clustering validation to identify the optimum number of clusters for each of them. This was followed by clustering, which was carried out to generate the membership functions for our proposed model. The membership functions were then used in the construction of the fuzzy inference system (FIS). The final stage involved performance measurement activities, in which we compared our proposed model with a crisp model. The following sub-sections describe each of these tasks.

Fig. 1. Research methodology

3.1 Data Sets Preparation

In this research, we used real QoS data sets from three top cloud computing providers. The data sets were collected from the Cloud Harmony website [17]. The literature shows that the three main network performance (QoS) measurement parameters are network download speed, network upload speed and network delay (latency) [18]. Hence, we selected the downlink, uplink and latency data sets from the data source. The data sets used in this work contained 4500 data points for each of the three QoS parameters. For example, the downlink data set contained 1500 data points for each of providers A, B and C. The uplink and latency data sets were organized in the same way. These data sets were used as training data sets to construct the model. We also used three testing data sets, one each for downlink, uplink and latency, each consisting of 1500 data points. The testing data sets were used in the performance measurements.

3.2 Clustering Validation

Clustering validation is an important and necessary step in cluster analysis [19]. Because the data sets contain a large amount of data, the optimum number of clusters for each training data set needs to be identified. To identify and validate the number of clusters, each training data set was tested with several candidate numbers of clusters using a clustering validity index. This process showed that the optimum number of clusters was four for the downlink and uplink training data sets and five for the latency training data set.

3.3 Data Clustering

The membership functions of an FIS are constructed either from expert knowledge or automatically from historical data. The literature shows that the former method has disadvantages: it may lose accuracy [20], and expert knowledge may not always be available [21]. Hence, in this work we used the latter method by clustering the training data sets. We chose the Fuzzy C-Means (FCM) algorithm for clustering because its results can be used to construct either a Mamdani or a Sugeno FIS model, and because it gives the best output for overlapping data [22].

FCM creates several clusters and assigns each data point a membership degree for every cluster [23], so that each data point can belong to more than one cluster. The membership degrees of all the data points are stored in a matrix, U, which is one output of FCM clustering. The other output of FCM is the center of each cluster, c.
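The two outputs of FCM can be illustrated with a minimal sketch of the standard algorithm for one-dimensional data. This is not the implementation used in this work: the fuzzifier m = 2, the random initialization, the stopping tolerance and the sample values are all assumptions made for illustration.

```python
import random

def fcm_1d(data, n_clusters, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Fuzzy C-Means for 1-D data.

    Returns (centers, U), where U[i][k] is the membership degree of
    data[k] in cluster i and each column of U sums to 1.
    """
    rng = random.Random(seed)
    n = len(data)
    # Random initial memberships, normalised per data point
    U = [[rng.random() for _ in range(n)] for _ in range(n_clusters)]
    for k in range(n):
        total = sum(U[i][k] for i in range(n_clusters))
        for i in range(n_clusters):
            U[i][k] /= total
    centers = [0.0] * n_clusters
    for _ in range(n_iter):
        # Center update: mean of the data weighted by u^m
        for i in range(n_clusters):
            weights = [u ** m for u in U[i]]
            centers[i] = sum(w * x for w, x in zip(weights, data)) / sum(weights)
        # Membership update from the distances to all centers
        max_change = 0.0
        for k, x in enumerate(data):
            dists = [max(abs(x - c), 1e-12) for c in centers]
            for i in range(n_clusters):
                new_u = 1.0 / sum((dists[i] / d) ** (2.0 / (m - 1.0)) for d in dists)
                max_change = max(max_change, abs(new_u - U[i][k]))
                U[i][k] = new_u
        if max_change < tol:
            break
    return centers, U

# Hypothetical throughput samples forming two loose groups
samples = [1.0, 1.1, 0.9, 10.0, 10.2, 9.8]
centers, U = fcm_1d(samples, n_clusters=2)
print(sorted(centers))
```

Each column of U holds the membership degrees of one data point across all clusters, which is what allows a point to belong partially to more than one cluster.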

3.4 Fuzzy Inference System Construction

The cluster centers, c, and the matrix of membership degrees, U, produced by FCM were used to construct the membership functions of the model. In this work, we chose Gaussian membership functions because their parameters match the two outputs produced by FCM mentioned above. Gaussian fuzzy sets are formed based on the function shown in (1) [24, 25]:

$$ f(x; w, c) = e^{-\frac{(x - c)^{2}}{2w^{2}}} $$
(1)

The width of the Gaussian membership function, w, in (1) is then determined by solving (1) for w at each data point and averaging the results, as follows:

$$ w_{i} = \frac{1}{n} \sum_{k = 1}^{n} \sqrt{\frac{-(x_{k} - c_{i})^{2}}{2 \ln u_{ik}}} $$
(2)

where n is the number of data points, x_k is the k-th data point, c_i is the center of cluster i, and u_ik is the membership degree of x_k in cluster i, taken from U.
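As a rough sketch, Eqs. (1) and (2) can be written as follows for a single cluster. The function names are hypothetical. One assumption is added here that is not stated in the text: data points whose membership degree is exactly 0 or 1 are skipped, because the logarithm in (2) is undefined or zero for them, and the average is taken over the remaining points.

```python
import math

def gaussian_mf(x, w, c):
    """Eq. (1): Gaussian membership degree of x for width w and center c."""
    return math.exp(-((x - c) ** 2) / (2.0 * w ** 2))

def estimate_width(data, memberships, center, eps=1e-12):
    """Eq. (2): average of the per-point widths implied by the FCM output.

    `memberships` holds the membership degree of each data point in the
    cluster whose center is `center` (one row of U).
    """
    widths = []
    for x, u in zip(data, memberships):
        # Assumption: skip degrees of 0 or 1, where Eq. (2) is undefined
        if u <= eps or u >= 1.0 - eps:
            continue
        widths.append(math.sqrt(-((x - center) ** 2) / (2.0 * math.log(u))))
    if not widths:
        raise ValueError("no usable membership degrees for this cluster")
    return sum(widths) / len(widths)

# Round trip: memberships generated with w = 2 recover the same width
points = [1.0, 3.0, 4.0, 5.0, 6.0, 8.0]
degrees = [gaussian_mf(x, 2.0, 5.0) for x in points]
print(estimate_width(points, degrees, center=5.0))
```

The round trip at the end is only a consistency check: when the membership degrees are exactly Gaussian, every per-point width equals the true width, so the average does as well. Membership degrees produced by FCM are not exactly Gaussian, so in practice the average acts as a fitted value.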

For downlink and uplink, the four clusters were labeled “Bad”, “Poor”, “Good” and “Excellent”, while the five latency clusters were labeled “Bad”, “Poor”, “Fair”, “Good” and “Excellent”. Mamdani-type inference was used in this work because it can find the centroid of a two-dimensional function [26]. Using a Mamdani-type FIS also increases the efficiency of the defuzzification process, as it simplifies the computation required. The constructed membership functions are shown in Fig. 2.

Fig. 2. The constructed membership functions for (a) uplink, (b) downlink and (c) latency

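In this work, a total of eighty rules were implemented in the FIS model, which matches the number of possible combinations of the four downlink, four uplink and five latency clusters (4 × 4 × 5 = 80). A sample of the rules is given below: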

  1. If (Downlink is Excellent) and (Uplink is Excellent) and (Latency is Excellent) then (Output is Excellent).

  2. If (Downlink is Excellent) and (Uplink is Bad) and (Latency is Excellent) then (Output is Good).

  3. If (Downlink is Good) and (Uplink is Bad) and (Latency is Fair) then (Output is Poor).

  4. If (Downlink is Poor) and (Uplink is Excellent) and (Latency is Fair) then (Output is Poor).

  5. If (Downlink is Excellent) and (Uplink is Bad) and (Latency is Excellent) then (Output is Good).

  ...

  80. If (Downlink is Bad) and (Uplink is Bad) and (Latency is Bad) then (Output is Bad).
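The way such rules are evaluated in a Mamdani-type FIS can be sketched as follows. This is an illustrative sketch only: every membership function parameter below is hypothetical (the actual parameters come from the FCM output), only the distinct sample rules listed above are included rather than the full rule base, and the common choices of minimum for "and", minimum for implication, maximum for aggregation and centroid defuzzification on a 0 to 10 output scale are assumptions.

```python
import math

def gauss(x, w, c):
    """Gaussian membership function with width w and center c."""
    return math.exp(-((x - c) ** 2) / (2.0 * w ** 2))

# Hypothetical (width, center) pairs for each linguistic label
DOWNLINK = {"Bad": (5.0, 5.0), "Poor": (5.0, 20.0), "Good": (5.0, 35.0), "Excellent": (5.0, 50.0)}
UPLINK = dict(DOWNLINK)
LATENCY = {"Bad": (20.0, 200.0), "Poor": (20.0, 150.0), "Fair": (20.0, 100.0),
           "Good": (20.0, 60.0), "Excellent": (20.0, 20.0)}
OUTPUT = {"Bad": (1.0, 1.25), "Poor": (1.0, 3.75), "Good": (1.0, 6.25), "Excellent": (1.0, 8.75)}

# ((downlink, uplink, latency), output) for the sample rules only
RULES = [
    (("Excellent", "Excellent", "Excellent"), "Excellent"),
    (("Excellent", "Bad", "Excellent"), "Good"),
    (("Good", "Bad", "Fair"), "Poor"),
    (("Poor", "Excellent", "Fair"), "Poor"),
    (("Bad", "Bad", "Bad"), "Bad"),
]

def infer(downlink, uplink, latency, steps=1000):
    """Return a crisp performance rating between 0 and 10."""
    # Firing strength of each rule: minimum over the three antecedents
    strengths = {}
    for (d, u, l), out in RULES:
        s = min(gauss(downlink, *DOWNLINK[d]),
                gauss(uplink, *UPLINK[u]),
                gauss(latency, *LATENCY[l]))
        strengths[out] = max(strengths.get(out, 0.0), s)
    # Clip each output set at its strength, aggregate with max, take centroid
    num = den = 0.0
    for j in range(steps + 1):
        y = 10.0 * j / steps
        mu = max(min(s, gauss(y, *OUTPUT[out])) for out, s in strengths.items())
        num += y * mu
        den += mu
    return num / den if den > 0.0 else float("nan")

print(infer(50.0, 50.0, 20.0))   # inputs at the "Excellent" centers
print(infer(5.0, 5.0, 200.0))    # inputs at the "Bad" centers
```

With a complete rule base, every combination of input labels contributes to the aggregated output set, so inputs that lie between cluster centers produce intermediate ratings.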

4 Results and Discussion

The FIS model that we constructed was executed with the three testing data sets, namely the uplink, downlink and latency data sets. As mentioned in the previous section, each of these data sets comprised 1500 data points; these consisted of 500 data points each for providers A, B and C.

Figure 3 shows the results produced by the model. According to the performance ratings set in the model, an average of 5.0–7.5 was rated as “Good” and an average of 7.5–10.0 was rated as “Excellent”. These ratings were based on the training performed on the training data sets, as discussed in the previous section. Figure 3 shows that all providers achieved “Good” performance, but provider B was the best provider with the highest average performance rating, 7.122. This means that the model would recommend provider B based on the input data sets used in this experiment.

Fig. 3. The performance rating produced by the model
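The recommendation step described above amounts to averaging the ratings of each provider and selecting the highest average. A minimal sketch is given below. The rating values are made up for illustration, the function names are hypothetical, and assigning the shared boundary value 7.5 to “Excellent” is an assumption; only the “Good” and “Excellent” bands are taken from the text.

```python
def average(values):
    """Arithmetic mean of a non-empty list of ratings."""
    return sum(values) / len(values)

def label(rating):
    """Map an average rating to the bands stated in the text."""
    if rating >= 7.5:   # assumption: the boundary 7.5 counts as Excellent
        return "Excellent"
    if rating >= 5.0:
        return "Good"
    return "Below Good"  # bands below 5.0 are not specified in the text

def recommend(ratings_by_provider):
    """Return the provider with the highest average rating and all averages."""
    averages = {p: average(r) for p, r in ratings_by_provider.items()}
    best = max(averages, key=averages.get)
    return best, averages

# Made-up ratings, not the experimental results
ratings = {"A": [6.0, 6.4, 6.2], "B": [7.0, 7.3, 7.1], "C": [5.8, 6.1, 6.0]}
best, averages = recommend(ratings)
print(best, label(averages[best]))
```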

4.1 Performance Measurement

In this work, the proposed model was compared with a non-fuzzy (crisp) model in terms of accuracy under uncertainty. We simulated the uncertainty by imposing random errors on the training data sets; the resulting data sets are referred to as synthetic data sets [27]. Five synthetic data sets were used for performance measurement. They were used to construct five fuzzy synthetic models and five crisp synthetic models. These synthetic models were developed using the same methodology as shown in Fig. 1, except that the data for the crisp synthetic models were clustered using the K-Means algorithm instead of FCM. K-Means was chosen because its behavior is similar to that of FCM, except that it produces hard clusters instead of fuzzy clusters [18]. Likewise, the original training data sets were used to construct the crisp original model using the K-Means algorithm.
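The two ingredients of this setup can be sketched as follows. The noise model is an assumption, since the text does not specify how the random errors were generated; multiplicative Gaussian noise is used here purely for illustration. The K-Means sketch is the standard algorithm for one-dimensional data, with a hypothetical quantile-based initialization.

```python
import random

def add_random_errors(data, rel_std=0.05, seed=0):
    """Return a synthetic copy of `data` with random relative errors.

    Assumption: multiplicative Gaussian noise with standard deviation rel_std.
    """
    rng = random.Random(seed)
    return [x * (1.0 + rng.gauss(0.0, rel_std)) for x in data]

def kmeans_1d(data, n_clusters, n_iter=100):
    """Hard clustering of 1-D data; returns (centers, labels)."""
    ordered = sorted(data)
    # Initial centers at evenly spaced positions of the sorted data
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * n_clusters)]
               for i in range(n_clusters)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point belongs to exactly one nearest center
        new_labels = [min(range(n_clusters), key=lambda i: abs(x - centers[i]))
                      for x in data]
        # Update step: each center becomes the mean of its members
        for i in range(n_clusters):
            members = [x for x, lab in zip(data, new_labels) if lab == i]
            if members:
                centers[i] = sum(members) / len(members)
        if new_labels == labels:
            break
        labels = new_labels
    return centers, labels

samples = [1.0, 1.1, 0.9, 10.0, 10.2, 9.8]
print(kmeans_1d(samples, 2))
print(kmeans_1d(add_random_errors(samples), 2))
```

Unlike FCM, each data point receives a single label, so a small perturbation of a point near a cluster boundary can move it entirely into another cluster.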

The original and synthetic models were executed with the testing data sets, and the results of each synthetic model were compared with the results of the respective original model. That is, each fuzzy synthetic model’s results were compared with the fuzzy original model’s results, and each crisp synthetic model’s results were compared with the crisp original model’s results. Each difference was counted as an error. Table 1 shows that the fuzzy model outperformed the crisp model in terms of the number of errors in each of the comparisons between the original model and the synthetic models. This shows that the fuzzy model produced better accuracy than the crisp model.

Table 1. Number of errors produced by the fuzzy and crisp models.
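The error count used in this comparison can be sketched as below. It is assumed here that the results being compared are the categorical outputs of the models for each testing data point; the text does not state the exact form of the compared results, and the sample labels are made up.

```python
def count_errors(original_results, synthetic_results):
    """Count the positions where a synthetic model disagrees with its original model."""
    if len(original_results) != len(synthetic_results):
        raise ValueError("both models must be run on the same testing data")
    return sum(1 for o, s in zip(original_results, synthetic_results) if o != s)

# Made-up outputs for four testing data points
original = ["Good", "Good", "Excellent", "Poor"]
synthetic = ["Good", "Poor", "Excellent", "Poor"]
print(count_errors(original, synthetic))
```

A lower count means that the model built from the perturbed training data reproduces more of the original model's outputs, which is the sense in which accuracy under uncertainty is measured here.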

5 Conclusion

This paper proposes a model for evaluating the performance of cloud service providers so that it can recommend the best provider for users to subscribe to. This kind of recommendation system is important because of the huge number of cloud computing providers now available. It helps users to select the best provider and reduces the time needed for selection. In this paper, the focus is on evaluating three QoS parameters of cloud computing, namely downlink, uplink and latency. Fuzzy logic is proposed for the implementation of the model because of the uncertain nature of these QoS parameters’ values. The main objective of this paper has been achieved through the description of the model’s formulation. The other objective has been achieved by showing that the proposed fuzzy model outperformed the non-fuzzy model under uncertainty. For future work, we propose extending the model to use type-2 fuzzy logic, which is known to handle uncertainty better than the type-1 fuzzy logic used in this paper.