1 Introduction

The problem of QoS-Aware Web Service Composition [1,2,3,4] aims at obtaining a combination result with an optimal, single-criterion, end-to-end QoS when fulfilling a user’s request. In large-scale scenarios, for a given request, the composition of a substantial number of services may generate numerous possible solutions with the same optimal QoS but different numbers of services [5].

Minimizing the number of services of the composition while satisfying the optimal QoS is a significant challenge because it has important benefits for brokers, customers, and service providers [6, 7]. From the brokers’ viewpoint, a composition result with fewer services can facilitate maintenance and management work; from the customers’ point of view, a smaller composition ordinarily means that a lower payment is demanded for the services. Thus, a decrease in the number of services included in the composition may greatly increase the success rate of achieving the desired responses to the requests of customers. From the service providers’ viewpoint, solutions with fewer services can save resources and costs for the same task. However, a survey on QoS-aware service composition shows that the majority of studies aimed at optimizing the global QoS but rarely at minimizing the total number of services when the optimal QoS is guaranteed. To date, only a few studies have taken the optimization of both QoS and the number of services into consideration simultaneously. These existing methods are mainly divided into exact algorithms and approximate algorithms. Exact algorithms can generate compositions with a minimum number of services subject to the optimal QoS at the expense of a long running time, while approximate algorithms can only achieve near-optimal results.

In fact, minimizing the number of services while maintaining the optimal QoS is an NP-hard problem [8], which results in a huge search space for optimal solutions in large-scale environments. In this paper, to achieve a good trade-off between quality and efficiency in large-scale scenarios, we propose a complete web service composition mechanism that effectively and efficiently minimizes the number of services in the composition result while achieving the optimal global QoS. The main contributions of this paper are as follows:

  • An equivalent transformation approach is proposed, which transforms the problem of QoS-aware web service composition into a tractable one with lower computational complexity.

  • An optimal dynamic programming algorithm called Chain-DP is proposed, which is guaranteed to obtain the minimum number of services while preserving the optimal global QoS, based on the tractable problem obtained after the transformation.

  • A global-local strategy of pruning is proposed, which greatly improves the efficiency of the above-mentioned Chain-DP algorithm by removing the redundant services and ignoring useless search space.

Furthermore, a full validation on the WSC 2010 datasets is carried out. The experimental results indicate that the proposed mechanism is effective and efficient and that it outperforms the state-of-the-art methods.

The rest of the paper is organized as follows. Section 2 reviews the related work. Section 3 illustrates the motivation of this research. Section 4 formally defines the composition problem. Section 5 presents the proposed mechanism in detail. Section 6 shows the experimental results. Section 7 concludes the paper.

2 Related Work

QoS-aware service composition has been studied by researchers from different perspectives. Most of the researchers merely concentrated on the optimization of the global QoS [9, 10]. There are only a few studies that attempted to optimize the total number of services while meeting the optimal QoS.

In [11], an approximate mechanism was presented to obtain close-to-optimal solutions in a short time. The authors proposed an on-the-fly algorithm that constructs only a path of the auxiliary graph instead of the complete graph. Additionally, a deterministic approach and a probabilistic approach were discussed to find the path with the near-optimal QoS and number of services, which was chosen as the final composition. Although the algorithm had a superior composition time, the greedy strategy it adopted was prone to getting stuck in local optima.

Yan et al. [12] proposed an algorithm that combined a systematic search algorithm with a planning algorithm called GraphPlan. This method could find and remove redundant services while achieving both functional goals and QoS optimization. Inspired by this, a redundant service removal mechanism was presented by Chen and Yan [13]. This method first modeled a composition problem as an integer programming problem (IP), and then obtained a composition whose global QoS was optimal by solving the IP. The next step was to remove redundant services in the composition while keeping the optimal QoS. This method is not generally applicable to the real-time situation. Thus, the authors in [14] proposed a modified approach to overcome the above drawback. In this approach, the process of redundancy removal was performed in parallel with the process of service composition, which had gone some way toward improving the efficiency. The approximate algorithms could reduce redundant services in compositions more or less, but still failed to work out optimal results.

A recent approach using exact algorithms was proposed by Rodriguez-Mier et al. [6]. A hybrid local-global strategy was presented to optimize the QoS-aware service composition problem. Although the local search strategy could only find a solution with a near-optimal number of services, it was a fast algorithm that saved plenty of time. The global search strategy could improve the solution generated by the local search, but it took a longer time to minimize the total number of services for the optimal QoS. Compared with approximate algorithms, the hybrid strategy can generate solutions with fewer services while guaranteeing the optimal QoS. However, the exhaustive combinatorial search makes it difficult to obtain the solution in a short time, especially in large-scale scenarios.

In summary, each class of existing algorithms has a clear drawback. The approximate algorithms are fast but cannot generate results with a minimum number of services, while the exact algorithm is optimal but not adequately efficient. Therefore, there is a lack of approaches that can both optimize the global QoS and minimize the number of services of the composition, effectively and efficiently.

3 Motivating Scenario and Analysis

As shown in Fig. 1, an example is described as a directed graph. The example addresses a practical classification problem whose goal is to predict whether a client will subscribe to a term deposit in a bank. The classification problem is treated as a QoS-aware service composition problem with a request R whose input is {ont1:UserID} and whose output is {xsd:boolean}. Each rectangle in the graph represents a web service (associated with a response time and a throughput), while each circle is an input or output of a service. In addition, the edges connecting circles represent the matching relations between services.

Fig. 1. Example of a Service Dependency Graph. (Color figure online)

Returning to the original problem of classification in Fig. 1, there are two different compositions fulfilling the request R, which are highlighted in two different colors. The composition highlighted in orange has the optimal global response time (450 ms) and contains eight services in total (including the Source and the Sink). The composition highlighted in purple has the same response time but contains only six services. In addition to the compositions highlighted in the graph, there are others with a response time of 450 ms, but all of them contain more than six services. To sum up, the composition highlighted in purple is the optimal solution with a minimum number of services while guaranteeing the lowest response time.

In large-scale scenarios, a directed graph such as the one in Fig. 1 may be exceedingly complex, which leads to a huge search space. As a result, it is difficult to extract the optimal composition from the graph. An exhaustive combinatorial search can guarantee optimality, but it takes an unacceptable length of time to generate the compositions. To improve the efficiency of service composition, many measures can be taken to reduce the useless search space. For instance, in Fig. 1, the optimal composition highlighted in purple has a response time of 450 ms, while the response time of the service Credit Info Query Service alone is 460 ms. Therefore, Credit Info Query Service can be removed from the graph because it cannot contribute to the optimal composition. In short, we pay attention not only to the quality of the resulting composition but also to the efficiency of the composition algorithm.

4 Preliminaries

The formal definition of a web service is given as follows.

Definition 1

A Web Service (“service” for short) is defined as a tuple \(s = \{In_s, Out_s, Q_s\}\), where \(In_s = \{in^1_s, \ldots , in^n_s\}\) is the set of inputs required to invoke the service s, and \(Out_s = \{out^1_s, \ldots , out^n_s\}\) is the set of outputs generated by executing s. Each input and output is related to a semantic concept from the set Con defined in an ontology, namely, \(In_s \subseteq Con\) and \(Out_s \subseteq Con\). \(Q_s = \{q^1_s, \ldots , q^n_s\}\) is the set of nonfunctional attributes that are the measures for how well the service s serves the user.

Relevant services can be combined by connecting matched inputs and outputs.

Lemma 1

Given an output \(out_s\) of a service s, and an input \(in_{s^\prime }\) of another service \(s^\prime \), if \(out_s\) and \(in_{s^\prime }\) are equivalent concepts or \(out_s\) is a subconcept of \(in_{s^\prime }\), \(out_s\) matches \(in_{s^\prime }\) (i.e., \(in_{s^\prime }\) is matched by \(out_s\)).
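The matching rule of Lemma 1 can be sketched in Python as follows. This is only an illustration: the ontology is modeled as a hypothetical child-to-parent dictionary `PARENT`, and the concepts `ont1:PremiumUserID` and `ont1:ID` are invented for the example (only `ont1:UserID` appears in the motivating scenario).

```python
# Hypothetical ontology fragment: each concept maps to its direct superconcept.
PARENT = {
    "ont1:PremiumUserID": "ont1:UserID",
    "ont1:UserID": "ont1:ID",
    "ont1:ID": None,  # root concept
}


def is_subconcept(concept, ancestor):
    """Return True if `concept` is a strict descendant of `ancestor`."""
    current = PARENT.get(concept)
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)
    return False


def matches(out_concept, in_concept):
    """An output matches an input if they are equivalent or the output is a subconcept."""
    return out_concept == in_concept or is_subconcept(out_concept, in_concept)
```

Note that the relation is directional: under these assumptions `matches("ont1:PremiumUserID", "ont1:ID")` holds, whereas `matches("ont1:ID", "ont1:UserID")` does not.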

There are two main kinds of structures of the composition, namely, the sequential structure and the parallel structure. The services organized as sequential structures are invoked in order, while those in parallel structures are invoked synchronously.

Definition 2

A Composition containing the set of services \(S = \{s_1, \ldots , s_n\}\) is represented as \(\varOmega \). If the services are chained in sequence, the composition is expressed as \(\varOmega ^\rightarrow = s_1\rightarrow \ldots \rightarrow s_n\); if in parallel, \(\varOmega ^\parallel = s_1\parallel \ldots \parallel s_n\). The set of services involved in \(\varOmega \) is defined as \(Servs(\varOmega ) = S\). Moreover, the length of a composition \(\varOmega \) is defined as \(Len(\varOmega ) = \vert S \vert \), namely, the number of services in \(\varOmega \). Taking the response time as an example, the global QoS of \(\varOmega \) is computed as

$$\begin{aligned} \left. \begin{aligned} RT(\varOmega ^\rightarrow ) = \sum \limits _{i=1}^{n} RT(s_i), \; s_i \in S \\ RT(\varOmega ^{\parallel }) = \mathop {\mathbf {max}} \limits _{1\le i \le n} \, RT(s_i), \; s_i \in S \end{aligned} \right\} . \end{aligned}$$
(1)

where \(RT(\varOmega )\) represents the global response time of the composition \(\varOmega \), and RT(s) represents the response time of the service s. Similarly, the global throughput \(TP(\varOmega )\) of the composition depends on the throughput TP(s) of each service \(s \in S\).

$$\begin{aligned} \left. \begin{aligned} TP(\varOmega ^\rightarrow ) = \mathop {\mathbf {min}} \limits _{1\le i \le n} \, TP(s_i), \; s_i \in S \\ TP(\varOmega ^{\parallel }) = \mathop {\mathbf {min}} \limits _{1\le i \le n} \, TP(s_i), \; s_i \in S \end{aligned} \right\} . \end{aligned}$$
(2)
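As an illustration of Eqs. (1) and (2), the following Python sketch evaluates the response time and throughput of a composition. The tuple encoding (`"svc"`, `"seq"`, `"par"`), the recursive application of the two equations to nested structures, and all numeric values are assumptions made for this example.

```python
def response_time(node):
    """Eq. (1): sum over a sequence, maximum over a parallel block."""
    kind = node[0]
    if kind == "svc":                      # ("svc", rt_ms, tp_inv_per_s)
        return node[1]
    child_rts = [response_time(child) for child in node[1]]
    return sum(child_rts) if kind == "seq" else max(child_rts)


def throughput(node):
    """Eq. (2): the minimum throughput, for both sequence and parallel."""
    if node[0] == "svc":
        return node[2]
    return min(throughput(child) for child in node[1])


# Hypothetical composition: s1 -> [s2 || s3] -> s4
omega = ("seq", [
    ("svc", 10, 5000),
    ("par", [("svc", 20, 3000), ("svc", 35, 4000)]),
    ("svc", 5, 6000),
])
```

For this hypothetical composition, the response time is 10 + max(20, 35) + 5 = 50 ms and the throughput is 3000 inv/s.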

Based on the above concepts, the precise definition of the QoS-aware web service composition in this paper is provided.

Definition 3

QoS-Aware Web Service Composition is defined as follows: for a given composition request \(R = \{In_R, Out_R\}\), to seek a composition \(\varOmega \) with optimization objectives of (1) \(\mathop {\mathbf {min}} RT(\varOmega )\) or \(\mathop {\mathbf {max}} TP(\varOmega )\) and (2) \(\mathop {\mathbf {min}} Len(\varOmega )\).

5 Framework

5.1 Generation of the Service Dependency Graph

For a given user request \(R = \{In_R, Out_R\}\), a service dependency graph [15, 16] similar to that in Fig. 1 is constructed to show the input-output dependencies between services. There is only a dummy service \(s_o = \{\varnothing , In_R, \{0 \ ms, +\infty \ inv/s\}\}\) in the first layer, and another dummy service \(s_k = \{Out_R, \varnothing , \{0 \ ms, +\infty \ inv/s\}\}\) is the only one contained in the last layer. The specific services in the other layers are selected from an external repository \(S_{all}\), and each layer contains the services whose inputs are all matched by the outputs generated by previous layers.
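The layer-by-layer construction described above can be sketched as follows. This is a simplified illustration that assumes exact concept-name matching (no subconcept reasoning) and uses a hypothetical four-service repository; it is not the construction procedure of [15, 16].

```python
def build_layers(services, request_inputs, request_outputs):
    """services: dict name -> (inputs: set, outputs: set).

    Returns the list of layers, or None if the request cannot be fulfilled.
    """
    available = set(request_inputs)        # outputs of the dummy service s_o
    layers = [["s_o"]]
    remaining = dict(services)
    while remaining:
        # A service joins the next layer once all of its inputs are available.
        layer = [n for n, (ins, _) in remaining.items() if ins <= available]
        if not layer:
            break
        for name in layer:
            available |= remaining.pop(name)[1]
        layers.append(sorted(layer))
    if not set(request_outputs) <= available:
        return None
    layers.append(["s_k"])                 # dummy sink consuming Out_R
    return layers


# Hypothetical repository: D can never be invoked and is left out of the graph.
repo = {
    "A": ({"a"}, {"b"}),
    "B": ({"a"}, {"c"}),
    "C": ({"b", "c"}, {"d"}),
    "D": ({"x"}, {"y"}),
}
```

With the request `In_R = {"a"}` and `Out_R = {"d"}`, the sketch yields the layers `s_o`, `{A, B}`, `{C}`, `s_k`.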

5.2 Generation of Subproblems

Description of Optimal Substructure. Once the service dependency graph G is completed for a request R, the composition problem is treated as finding a path with the optimal QoS and the minimum number of services starting from \(s_o\) and ending with \(s_k\) in the graph. Obviously, a path \(\varLambda \) in G is actually a composition \(\varOmega \) in Definition 2. An abstract path with the optimal quality from the service \(s_o\) to a service s in the graph is expressed as \(\varLambda _{s}\). As shown in Fig. 2, to obtain the path \(\varLambda _{s_k}\), the set of paths containing \(\varLambda _{X}\) and \(\varLambda _{XI}\) needs to be determined in advance. Similarly, the path \(\varLambda _{XI}\) depends on \(\varLambda _{IX}\), while the rest can be done in the same manner.

Fig. 2. Example graph.

Fig. 3. Subproblem V.

Definition 4

The set of precursors of a service \(s \in L_i\) (the i-th layer) is defined as \(Pre(s) = \{s^\prime \, \vert \, s^\prime \in L_j(\forall j<i) \wedge In_s \cap Out_{s^\prime } \not = \varnothing \}\). Specifically, \(Pre(s_o) = \varnothing \).

On the basis of the above definition, for a service s, if the paths \(\varLambda _{Pre(s)} = \{\varLambda _{s^\prime } \, \vert \, s^\prime \in Pre(s)\}\) have already been determined, the decision-making process of the optimal path \(\varLambda _s\) is regarded as a subproblem named s.

Definition 5

The set of feasible precursor-decisions of a subproblem s is defined as \(P_s = \{p_s \, \vert \, p_s \subseteq Pre(s) \wedge In_s \subseteq \mathop {\cup }\limits _{s^\prime \in p_s}Out_{s^\prime }\}\).

As a consequence, the composition problem is divided into many subproblems. The solution of a subproblem depends on the optimization results of subproblems in previous layers.

Definition and Generation of Subproblems. The abstract notion of the quality of a path was introduced solely to explain the idea of the optimal substructure. If only the optimal QoS is maintained for each path, the optimal solution may be lost. For instance, there are two compositions in Fig. 2: the composition \(\varOmega = s_o\rightarrow [II\parallel III]\rightarrow V\rightarrow IX\), whose global response time \(RT(\varOmega )\) is 55 ms, and another composition \(\varOmega ^\prime = s_o\rightarrow IV\rightarrow VII\rightarrow IX\), which has a global response time of \(RT(\varOmega ^\prime ) = 70\) ms. If the quality of a path is measured merely by the global response time, the optimal path \(\varLambda _{IX}\) is \(\varOmega \), which causes the optimal path highlighted in the graph to be lost.

Accordingly, to minimize the number of services simultaneously, we describe a path in a different way. A concrete path \(\varLambda _s^l\) starts from the service \(s_o\) and ends with a service s. In addition, \(\varLambda _s^l\) has the optimal QoS among those paths whose length (the number of services) is l. Let us reconsider the above example in this way. The path \(\varLambda _{IX}^4 = s_o\rightarrow IV\rightarrow VII\rightarrow IX\) has a response time of 70 ms. There are two different paths with the same length of 5: \(\varLambda = s_o\rightarrow [I\parallel III]\rightarrow V\rightarrow IX\) with \(RT(\varLambda ) = 60\) ms, and \(\varLambda ^\prime = s_o\rightarrow [II\parallel III]\rightarrow V\rightarrow IX\) with \(RT(\varLambda ^\prime ) = 55\) ms. Thus, \(\varLambda _{IX}^5 = \varLambda ^\prime = s_o\rightarrow [II\parallel III]\rightarrow V\rightarrow IX\) owing to \(RT(\varLambda ^\prime ) < RT(\varLambda )\). In this way, the path with the minimum number of services for the optimal QoS can be kept.
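The idea of keeping one best path per length can be sketched in a few lines of Python. The helper names `record` and `pick` are hypothetical; the response times for service IX are the ones quoted in the example above.

```python
import math


def record(table, length, rt):
    """Keep, for each path length, only the lowest response time seen so far."""
    if rt < table.get(length, math.inf):
        table[length] = rt


def pick(table):
    """Select the optimal response time first, then the fewest services among ties."""
    best_rt = min(table.values())
    best_len = min(l for l, rt in table.items() if rt == best_rt)
    return best_len, best_rt


paths_ix = {}
record(paths_ix, 4, 70)   # s_o -> IV -> VII -> IX
record(paths_ix, 5, 60)   # s_o -> [I || III] -> V -> IX
record(paths_ix, 5, 55)   # s_o -> [II || III] -> V -> IX
```

After these calls the table holds `{4: 70, 5: 55}`. Retaining the slower length-4 entry matters because a later parallel step can mask the difference: if, hypothetically, IX were later combined in parallel with a branch taking 80 ms, both entries would yield max(70, 80) = max(55, 80) = 80 ms, and the shorter path would then be preferable.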

Then, a subproblem s is defined as determining a cluster of concrete paths \(\varLambda _s^L = \{\varLambda _s^l \, \vert \, l \in L\}\) (\(L = \{1,\ldots ,\vert G\vert \}\)). Each path \(\varLambda _s^l\) in \(\varLambda _s^L\) is determined by several paths selected from \(\varLambda _{s^\prime }^L\), where \(s^\prime \in Pre(s)\). For a feasible precursor decision \(p_s = \{s_1,\ldots ,s_n\}\), let \(Cart(p_s) = \varLambda _{s_1}^L \times \ldots \times \varLambda _{s_n}^L\), where \(\times \) represents the operation of a Cartesian product. Taking the optimization of the response time as an example, the detailed optimization model of \(\varLambda _s^l\) is

$$\begin{aligned} RT(\varLambda _s^l) = \mathop {\mathbf {min}} \limits _{p_s \in P_s} \{\mathop {\mathbf {min}} \limits _{n_{p_s} \in N_{p_s}} \{\mathop {\mathbf {max}} \limits _{\varLambda \in n_{p_s}} RT(\varLambda )\}\} \; + \; RT(s). \end{aligned}$$
(3)

where \(N_{p_s} = \{N \, \vert \, N \in Cart(p_s) \wedge \vert \mathop {\cup }\limits _{\varLambda \in N}Servs(\varLambda )\vert =l-1\}\) is the set of feasible length-decisions of \(p_s\), and \(Servs(\varLambda )\) memorizes all the services in the path \(\varLambda \).

Table 1. Optimization process for \(\varLambda _{V}^4\).

The subproblem V is shown in Fig. 3. Table 1 shows the optimization process of \(\varLambda _{V}^4\). According to the table, \(\varLambda _{V}^4\) is generated by combining \(\varLambda _{II}^2\) and \(\varLambda _{III}^2\).

[Algorithm 1: generation of subproblems]

The generation process for subproblems is described in Algorithm 1, where \(out\_serv\_map\) is a precomputed table that maps each output to those services that own this output.

5.3 Transformation of Subproblems

It is inconvenient for set operations to identify whether a precursor decision is feasible for a given subproblem. Considering the subproblem s, a precursor decision \(p_s(p_s \subseteq Pre(s))\) is feasible if \(In_s \subseteq \mathop {\cup }\limits _{s^\prime \in p_s}Out_{s^\prime }\), which is too complicated to be applied to the following dynamic programming algorithm. Therefore, an equivalent transformation approach is proposed to overcome the obstacle.

Fig. 4. Schematic diagram.

Fig. 5. V after transformation.

Definition 6

As shown in Fig. 4, given a set of concepts C and another set \(C^\prime \), where \(C^\prime \cap C \not = \varnothing \), the contribution made by \(C^\prime \) to C is defined as

$$\begin{aligned} \varDelta _C(C^\prime ) = \sum _{index = 0}^{n-1} Array_{C \cap C^\prime }[index] \times 2^{index}. \end{aligned}$$
(4)

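where \(n = \vert C \vert \), \(Array_{C \cap C^\prime }[index]\) is 1 if the concept \(C[index]\) belongs to \(C \cap C^\prime \) and 0 otherwise, and consequently \(\varDelta _C(C) = 2^n-1\). Conversely, if the contribution made by an unknown set X to the set C is \(\varDelta _C(X)\), the intersection of C and X is calculated as follows: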

$$\begin{aligned} \left. \begin{aligned} \varDelta _C(X) = \sum _{index=0}^{n-1} a_{index} \times 2^{index}\\ \varPhi _C(\varDelta _C(X)) = \{C[index] \ \vert \ a_{index} = 1\} \end{aligned} \right\} . \end{aligned}$$
(5)

An example further illustrates the above approach. Given two sets of concepts \(C = \{c_1,c_2,c_3,c_4\}\) and \(C^\prime = \{c_2,c_4,c_5\}\), \(\varDelta _C(C^\prime ) = 2^1 + 2^3 = 10\). Moreover, if an unknown set X makes a contribution \(\varDelta _C(X) = 5\) to the set C, we first convert \(\varDelta _C(X)\) from decimal to binary, namely, \(5 = 1 \times 2^0 + 1 \times 2^2\). Then, \(C \cap X = \varPhi _C(5) = \{c_1,c_3\}\). Note that \(\varDelta _C(C^\prime ) + \varDelta _C(X) = 10 + 5 = 15 = \varDelta _C(C)\) and, at the same time, \(C \subseteq (C^\prime \cup X)\).
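The two mappings of Definition 6 amount to encoding and decoding a bitmask. A minimal Python sketch, assuming that C is kept as an ordered list so that the concept at position `index` owns bit `index` (the function names `delta` and `phi` are chosen here for illustration):

```python
def delta(C, C_prime):
    """Eq. (4): contribution of C_prime to the ordered concept list C, as an integer."""
    members = set(C_prime)
    return sum(1 << index for index, concept in enumerate(C) if concept in members)


def phi(C, value):
    """Eq. (5): the concepts of C whose bit is set in `value`."""
    return {concept for index, concept in enumerate(C) if (value >> index) & 1}


C = ["c1", "c2", "c3", "c4"]
assert delta(C, {"c2", "c4", "c5"}) == 10      # 2^1 + 2^3; c5 is not in C
assert phi(C, 5) == {"c1", "c3"}               # 5 = 2^0 + 2^2
assert delta(C, C) == 2 ** len(C) - 1          # full coverage
```

The assertions reproduce the worked example from the text, including the observation that 10 + 5 equals the full contribution 15.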

Lemma 2

Given a set of concepts C, as well as two other sets \(C^\prime \) and \(C^{\prime \prime }\), if \(\varDelta _C(C^\prime ) + \varDelta _C(C^{\prime \prime }) = \varDelta _C(C)\), then \(C \subseteq (C^\prime \cup C^{\prime \prime })\).

Next, a concept called the flow of a path is introduced to transform each subproblem into another one, which avoids making use of set operations.

Definition 7

For a given subproblem s, the flow of a path \(\varLambda _{s^\prime }^l\) is defined as \(flow_{s^\prime } = \varDelta _{In_s}(Out_{s^\prime })\), where \(s^\prime \in Pre(s)\). Moreover, the flow of the objective path \(\varLambda _{s}^l\) is defined as \(flow_{total} = \varDelta _{In_s}(In_s)\).

The subproblem V after transformation is shown in Fig. 5. Operator \(\oplus \) in the figure is used to calculate the flow of a temporary path after composition. If a temporary path \(\varLambda _{tmp}^3\) (\(s_o\rightarrow [I\parallel II]\)) is generated by combining \(\varLambda _{I}^2\) and \(\varLambda _{II}^2\), the flow of \(\varLambda _{tmp}^3\) is \(flow_{tmp} = flow_{I} \oplus flow_{II} = \varDelta _{In_{V}}(\varPhi _{In_{V}}(flow_{I}) \cup \varPhi _{In_{V}}(flow_{II}))\), since the combined path provides every input of V that is provided by either of its components. In this subproblem, we expect to obtain a set of paths \(\varLambda _{V}^L\) where each \(\varLambda _{V}^l\) has a flow of 7 after composition.

Hence, \(P_s\) changes into \(P_s = \{p_s \, \vert \, p_s \subseteq Pre(s) \wedge \sum _\oplus p_s = flow_{total}\}\), where \(\sum _\oplus \) acting on the set \(p_s = \{s_1^\prime , \ldots , s_n^\prime \}\) is short for \(flow_{s_1^\prime } \oplus \ldots \oplus flow_{s_n^\prime }\).
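To make the transformed feasibility test concrete, the following Python sketch enumerates the feasible precursor-decisions of a subproblem using flows only. It assumes that \(\oplus \) accumulates the covered inputs, so that it behaves as a bitwise OR on the integer flows. The value `flow_II = 3` and the target flow 7 are taken from the discussion of subproblem V; the values for I and III are assumptions chosen to be consistent with it. The brute-force enumeration is for illustration only.

```python
from functools import reduce
from itertools import combinations
from operator import or_


def feasible_decisions(flows, flow_total):
    """Return all subsets of precursors whose combined flow equals flow_total."""
    names = sorted(flows)
    result = []
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            # Assumed semantics of the operator: bitwise OR of the flows.
            if reduce(or_, (flows[name] for name in subset)) == flow_total:
                result.append(subset)
    return result


# Flows with respect to In_V; I and III are hypothetical values.
flows_v = {"I": 3, "II": 3, "III": 4}
```

Under these assumptions, `{I, II}` is infeasible (its combined flow is 3), while `{I, III}`, `{II, III}` and `{I, II, III}` all reach the target flow 7.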

5.4 Chained Dynamic Programming Algorithm

For a subproblem s, it is impractical to explore all possible combinations of paths to obtain the set of feasible precursor-decisions \(P_s\), as well as the set of feasible length-decisions \(N_{p_s}\) for each \(p_s \in P_s\), especially in large-scale scenarios. However, according to Lemma 2, each subproblem can be further divided into a series of subproblems. Let \(Pre(s) = \{s_1,s_2,\ldots ,s_n\}\). Considering that the set of paths \(\varLambda _{s_n}^L\) is known, if the set of paths \(\varLambda _{s}^L\) whose flow equals \(flow_{total}-flow_{s_n}\) has already been determined by combining the paths selected from \(\varLambda _{s_i}^L(1\le i \le n-1)\), paths \(\varLambda _{s}^L\) with a flow of \(flow_{total}\) can be easily determined by combining the above two sets of paths.

Note that for each subproblem s, to differentiate the objective paths with different flows, the set of paths \(\varLambda _{s}^L\) with a flow of f (\(f \le flow_{total}\)) is expressed as \({\widetilde{\varLambda }}_{f}^L\) hereafter. Moreover, if a path \({\widetilde{\varLambda }}_f^l \in {\widetilde{\varLambda }}_{f}^L\) is expected to be found in a subproblem of s, then the flow of each \(\varLambda _{s_i}^l\) (\(1\le i \le n)\) should be updated as \(flow_{s_i}^{f} = \varDelta _{In_s}(Out_{s_i} \cap \varPhi _{In_s}(f))\). For example, as we can see from the subproblem V, \(flow_{II}^7 = 3\) while \(flow_{II}^1 = 1\).

On the basis of the above description, a chained dynamic programming algorithm is proposed. Taking the optimization of the response time as an example, let F[i][f][l] represent the response time of the path \({\widetilde{\varLambda }}_f^l\) generated by combining the first i clusters of known paths (\(\varLambda _{s_1}^L,\varLambda _{s_2}^L,\ldots ,\varLambda _{s_i}^L\)). It can be shown that

$$\begin{aligned} F[i][f][l] = \mathop {\mathbf {min}} \limits _{\begin{array}{c} l^\prime ,l^{\prime \prime } \in L \\ U(l^\prime ,l^{\prime \prime })=l \end{array}}\{\mathop {\mathbf {max}}\{F[i-1][f-flow_{s_i}^f][l^\prime ],RT(\varLambda _{s_i}^{l^{\prime \prime }})\}\}. \end{aligned}$$
(6)

where \(U(l^\prime ,l^{\prime \prime }) = \vert Servs({\widetilde{\varLambda }}_{f-flow_{s_i}^f}^{l^\prime })\cup Servs(\varLambda _{s_i}^{l^{\prime \prime }})\vert \). Several candidate paths whose flow is f are obtained by combining \({\widetilde{\varLambda }}_{f-flow_{s_i}^f}^{l^\prime }\) (for each \(l^\prime \in L\)) and \(\varLambda _{s_i}^{l^{\prime \prime }}\) (for each \(l^{\prime \prime } \in L\)). \({\widetilde{\varLambda }}_f^l\) is the one with the optimal response time while owning the length of l. Thus, for fixed i and f, paths \({\widetilde{\varLambda }}_f^L\) can be determined with a time complexity of \(O({\vert L \vert }^2)\). By systematically increasing the values of i (from 1 to n) and f (from 1 to \(flow_{total}\)), the desired path \(\varLambda _s^L\) will finally be obtained when \(i=n\) and \(f=flow_{total}\). For each \(\varLambda _s^l \in \varLambda _s^L\),

$$\begin{aligned} RT(\varLambda _s^l) = F[n][flow_{total}][l] + RT(s). \end{aligned}$$
(7)

Therefore, the subproblem s is solved. The solved subproblems serve as known conditions for the unsolved ones; therefore, a chain of decisions is made from \(s_o\) to \(s_k\), which is why the algorithm is named Chain-DP.
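The decision process for a single subproblem can be sketched in Python as follows. This is a simplified forward variant written for illustration, not the exact table F[i][f][l] of Eq. (6): it keeps one best temporary path per (flow, length) pair, combines flows with a bitwise OR (an assumption), and skips a precursor that covers no new input. All response times are hypothetical, as are the flows of I and III.

```python
import math


def solve_subproblem(precursors, flows, flow_total, rt_s, name):
    """precursors: dict pre -> {length: (rt, frozenset of services)}.
    flows: dict pre -> integer flow of that precursor with respect to In_s.
    Returns {length: (rt, services)} for the service `name`.
    """
    # state[(flow, length)] = (rt, services) of the best temporary path so far
    state = {(0, 0): (0, frozenset())}
    for pre, cluster in precursors.items():
        nxt = dict(state)                      # not using this precursor is allowed
        for (flow, _), (rt, servs) in state.items():
            new_flow = flow | flows[pre]
            if new_flow == flow:
                continue                       # precursor adds no new input
            for rt_p, servs_p in cluster.values():
                merged = servs | servs_p       # shared services are counted once
                key = (new_flow, len(merged))
                cand_rt = max(rt, rt_p)        # precursor paths run in parallel
                if cand_rt < nxt.get(key, (math.inf,))[0]:
                    nxt[key] = (cand_rt, merged)
        state = nxt
    # Eq. (7): append s itself to every temporary path with the full flow.
    return {
        length + 1: (rt + rt_s, servs | {name})
        for (flow, length), (rt, servs) in state.items()
        if flow == flow_total
    }


# Hypothetical data for subproblem V (response times in ms).
pre_v = {
    "I": {2: (25, frozenset({"s_o", "I"}))},
    "II": {2: (20, frozenset({"s_o", "II"}))},
    "III": {2: (15, frozenset({"s_o", "III"}))},
}
paths_v = solve_subproblem(pre_v, {"I": 3, "II": 3, "III": 4}, 7, 10, "V")
```

With these numbers the only surviving entry has length 4: it combines the paths ending in II and III, in line with the statement that \(\varLambda _{V}^4\) is generated from \(\varLambda _{II}^2\) and \(\varLambda _{III}^2\), and its hypothetical response time is max(20, 15) + 10 = 30 ms.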

5.5 Global-Local Pruning Strategy

When applying a dynamic programming algorithm, many useless subproblems may be solved and much idle search space may be explored. A global-local pruning strategy is adopted to further improve the efficiency of Chain-DP.

As can be seen from Definition 3, the optimal global QoS is the essential prerequisite for seeking the expected composition. We first propose a fast preprocessing approach to compute the optimal global QoS of each path without consideration of the length. In the dependency graph, a path with the optimal QoS from the service \(s_o\) to a service s is expressed as \({\widehat{\varLambda }}_s\). Then, the approach is shown in Algorithm 2, taking the preprocessing of the optimal response time as an example. The optimal response time of each path is stored in a hash table \(Opt\_RT\) where \(Opt\_RT[s] = RT({\widehat{\varLambda }}_s)\). Moreover, for each input \(in_s \in In_s\), \(RT\_in[in_s]\) stores the shortest response time to generate \(in_s\), that is to say, \(RT\_in[in_s] = \mathop {\mathbf {min}} \limits _{\begin{array}{c} s^\prime \in Pre(s) \ \wedge \ in_s \in Out_{s^\prime } \end{array}} RT({\widehat{\varLambda }}_{s^\prime })\). Therefore, the decision-making process for the path \({\widehat{\varLambda }}_s\) can be described as \(RT({\widehat{\varLambda }}_s) = \mathop {\mathbf {max}} \limits _{in_s \in In_s} \{RT\_in[in_s]\} + RT(s)\).

[Algorithm 2: preprocessing of the optimal response time]
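The preprocessing step described above can be sketched in Python as follows. The sketch is a reading of the textual description, not a transcription of Algorithm 2: it assumes exact concept-name matching, a graph already organized in layers, and a hypothetical three-service example.

```python
import math


def preprocess_opt_rt(layers, services, request_inputs):
    """layers: lists of service names in layer order, excluding s_o.
    services: dict name -> (inputs: set, outputs: set, rt: float).
    Returns (opt_rt, rt_in), mirroring the tables Opt_RT and RT_in.
    """
    rt_in = {concept: 0 for concept in request_inputs}   # provided by s_o at 0 ms
    opt_rt = {}
    for layer in layers:
        produced = {}
        for name in layer:
            ins, outs, rt = services[name]
            # A service can start once its slowest input is available.
            opt_rt[name] = max((rt_in[c] for c in ins), default=0) + rt
            for concept in outs:
                produced[concept] = min(produced.get(concept, math.inf), opt_rt[name])
        # Publish the outputs only after the whole layer has been processed.
        for concept, t in produced.items():
            rt_in[concept] = min(rt_in.get(concept, math.inf), t)
    return opt_rt, rt_in


# Hypothetical graph: A and B both produce b, C needs b and c, s_k needs d.
svcs = {
    "A": ({"a"}, {"b"}, 30),
    "B": ({"a"}, {"b", "c"}, 50),
    "C": ({"b", "c"}, {"d"}, 10),
    "s_k": ({"d"}, set(), 0),
}
opt_rt, rt_in = preprocess_opt_rt([["A", "B"], ["C"], ["s_k"]], svcs, {"a"})
```

In this example the concept b is available after 30 ms (via A) and c after 50 ms (via B), so C finishes at max(30, 50) + 10 = 60 ms, which is also the optimal response time at the sink.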

Global pruning is applied to lessen the number of redundant subproblems, which further reduces the number of useless services in the graph G.

Lemma 3

For each \(s \in G\), if \(RT({\widehat{\varLambda }}_s) > RT({\widehat{\varLambda }}_{s_k})\) or \(TP({\widehat{\varLambda }}_s) < TP({\widehat{\varLambda }}_{s_k})\), the service s will not be involved in the final composition.

According to Lemma 3, a service s can be removed from G if \(Opt\_RT[s] > Opt\_RT[s_k]\). For example, in Fig. 1, the final path highlighted in purple has a global response time of 450 ms, while the path ending with Credit Info Query Service has an optimal response time of 570 ms. Obviously, the Credit Info Query Service is not involved in the final path. In addition, eliminating it from the graph will not make a difference in the optimal solution of the composition.
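Global pruning then reduces to a single filter over the preprocessed table. In the sketch below, the two values 450 ms and 570 ms are the ones quoted for Fig. 1, while the service `X` and its value are hypothetical.

```python
def global_prune(opt_rt, sink="s_k"):
    """Keep the services whose optimal path response time does not exceed the sink's."""
    bound = opt_rt[sink]
    return {service for service, rt in opt_rt.items() if rt <= bound}


opt_rt_fig1 = {
    "s_k": 450,
    "Credit Info Query Service": 570,
    "X": 300,   # hypothetical service on a fast path
}
kept = global_prune(opt_rt_fig1)
```

Here `kept` contains `s_k` and `X` only, so Credit Info Query Service is dropped before any subproblem is solved.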

Local pruning aims at reducing the search space of each subproblem.

Lemma 4

In the decision-making process for each subproblem s, for each \(s^\prime \in Pre(s)\) and each \(l \in L\), the path \(\varLambda _{s^\prime }^l\) will make no contribution to the cluster of paths \(\varLambda _s^L\) on the condition that \(RT(\varLambda _{s^\prime }^l) + RT(s) > RT({\widehat{\varLambda }}_s)\) and \(l + 1 > \vert Servs({\widehat{\varLambda }}_s) \vert \), or \(TP(\varLambda _{s^\prime }^l) < TP({\widehat{\varLambda }}_s)\) and \(l + 1 > \vert Servs({\widehat{\varLambda }}_s) \vert \).

On the basis of Lemma 4, local pruning is applied as follows. For each subproblem s, when determining the cluster of paths \(\varLambda _s^L\), a precursor path \(\varLambda _{s^\prime }^l\) can be disregarded, subject to the following constraints: (1) \(Rets[s^\prime ][l] + RT(s) > Opt\_RT[s]\) and (2) \(l + 1 > \vert Opt\_Paths[s] \vert \).
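The two constraints can be written as a small predicate. The sketch below covers the response-time case only and assumes that \(\vert Opt\_Paths[s] \vert \) denotes the number of services on the optimal-QoS path to s; the parameter names and the numbers in the usage note are hypothetical.

```python
def can_skip(rt_pre, length_pre, rt_s, opt_rt_s, opt_len_s):
    """Local pruning test for a precursor path (response-time case).

    rt_pre, length_pre: response time and length of the precursor path.
    rt_s: response time of the service s itself.
    opt_rt_s, opt_len_s: optimal response time to s and the length of that path.
    """
    worse_qos = rt_pre + rt_s > opt_rt_s          # constraint (1)
    longer = length_pre + 1 > opt_len_s           # constraint (2)
    return worse_qos and longer
```

For instance, with an optimal path to s of 50 ms and 4 services and RT(s) = 10 ms, a precursor path of 60 ms and length 5 is skipped, whereas a path of 60 ms and length 2 is kept because it is shorter, and a path of 40 ms and length 5 is kept because it still reaches the optimal response time.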

6 Experimental Evaluation

We conducted three groups of experiments on the datasets of the Web Service Challenge (WSC) 2010 to evaluate the performance of the proposed mechanism. The groups of experiments, conducted in sequence, are as follows: (1) validation of the global-local pruning strategy, (2) validation of Chain-DP from the aspects of results and efficiency, and (3) validation of the applicable scenarios.

6.1 Datasets

WSC 2010’s datasets range from 572 to 15211 services. Each dataset contains a WSDL file defining the inputs and outputs of the services, a WSLA file storing the QoS values (response time and throughput) of the services, and an OWL file describing the matching relations between all of the inputs and outputs.

6.2 Validation of the Hybrid Pruning Strategy

We first conducted experiments to validate the global-local pruning strategy.

Table 2. Comparisons of number of services before and after pruning.

On the one hand, we focused on the reduction of services in the dependency graph after pruning. Moreover, when performing the experiment, we measured the execution time of the preprocessing approach in passing. Table 2 lists the results obtained for each dataset and for each QoS property. Row #Graph Services shows the initial number of services in the graph, and #Graph Services (opt) the number of services after pruning. As can be seen, the number of services is reduced, on average, by 39% via pruning. Row Preprocessing Time shows the execution time of the preprocessing approach. It is evident from the table that the extra time spent by the preprocessing of pruning is no more than 5 ms.

Table 3. Comparison of efficiency of Chain-DP before and after pruning.

On the other hand, for each dataset and for each QoS, we compared the efficiency of Chain-DP before and after pruning. As shown in Table 3, row Chain-DP shows the execution time of the Chain-DP algorithm without pruning, and Chain-DP-Pruning shows the execution time of Chain-DP with global-local pruning. The results indicate that Chain-DP with global-local pruning is, on average, over 30 times faster than Chain-DP without pruning.

In conclusion, the above experiments indicate that the proposed pruning strategy can effectively remove the redundant services in the dependency graph, and can also significantly reduce the search space of the composition problem. As a result, the strategy is powerful in improving the efficiency of Chain-DP.

6.3 Validation of the Chain-DP Algorithm

To validate our chained dynamic programming algorithm, we compared our approach with five different approaches in the same experimental environment.

Table 4. Comparisons with other approaches.

Table 4 shows all of the comparisons. For each dataset and for each QoS property, we mainly paid close attention to the global QoS of the obtained composition (RT for the response time and TR for throughput), the number of services included in the obtained composition (#Services), and the execution time to extract a composition from a graph (Time). A composition is better if (1) its global QoS is better, or (2) it owns the same global QoS but fewer services. It can be seen that our approach can always generate the same or better compositions. As an exact algorithm, the global search proposed in [6] could not find a solution for D-04 owing to a combinatorial explosion, while our approach succeeded in finding one with better throughput (4000 inv/s) than other methods, except for the local search [6].

Fig. 6. Further comparison with other approaches. (Color figure online)

Considering that almost all of the methods in Table 4 could find compositions with the same global QoS, we further compared our method with those approximate algorithms in terms of the number of services, and also compared our method with the exact algorithm in terms of the execution time. Figure 6 shows that compared with the approximate algorithms, our algorithm always found compositions with the same or fewer services while guaranteeing the optimal global QoS. In addition, compared with the global search, it took far less time to generate solutions. The orange line represents the average execution time of the global search, while the green line shows the average time of Chain-DP with hybrid pruning. Our algorithm is, on average, over 35 times faster than the global search without regard to D-04, which is a significant improvement. In summary, our algorithm achieves an ideal trade-off between quality and efficiency.

6.4 Validation of the Applicable Scenarios

Lastly, we tested and validated the applicable scenarios that the proposed mechanism can be generalized to use.

To evaluate the effects of hybrid pruning in different scenarios, we define an indicator as

$$\begin{aligned} Ratio\_PR = \frac{Time\_WO}{Time\_WI}. \end{aligned}$$
(8)

where \(Time\_WO\) represents the execution time of Chain-DP without pruning, and \(Time\_WI\) the execution time of Chain-DP with hybrid pruning. As can be seen from Fig. 7(a), the larger the size of a dataset, the better the hybrid pruning strategy performs.

Furthermore, to compare the efficiency of Chain-DP with that of the global search in different scenarios, we similarly define another indicator:

$$\begin{aligned} Ratio\_CP = \frac{Time\_GS}{Time\_DP}. \end{aligned}$$
(9)

where \(Time\_GS\) represents the execution time of the global search, and \(Time\_DP\) the execution time of Chain-DP with hybrid pruning. Seeing that the global search could not find a solution for D-04, we used the execution time of the local search instead. As shown in Fig. 7(b), with an increase in the size of the datasets, the advantage of our Chain-DP becomes progressively obvious. For D-05 (Validation with Throughput), Chain-DP is even two orders of magnitude faster than the global search while guaranteeing the same optimal solution.

Fig. 7. Validation of applied scenarios.

The above experiments indicate that both the hybrid pruning strategy and the Chain-DP algorithm can be easily generalized and applied to a variety of scenarios, especially large-scale scenarios.

7 Conclusion

In this paper, we proposed an effective and efficient mechanism to automatically generate compositions by minimizing the number of services while simultaneously satisfying the optimal global QoS. The mechanism combines a global-local pruning strategy and a chained dynamic programming algorithm to extract the optimal composition from the service dependency graph with high efficiency. A large number of experiments on two different groups of datasets show that our mechanism performs better than the state-of-the-art methods, as it not only obtains compositions with fewer services for the optimal QoS than the approximate algorithms, but also executes much faster than an exact algorithm while obtaining nearly the same results. The experimental results indicate that it achieves a very good trade-off between quality and efficiency in various scenarios, especially in large-scale scenarios.