1 Introduction

Due to intense market competition, organisations can survive only if they offer services that are either innovative or of better quality than those of their competitors. However, owning a limited infrastructure while continuously needing to improve existing business processes (BPs) eventually leads to limits that cannot be overcome. Moreover, infrastructure maintenance, operation and management costs can be prohibitive, especially for small or medium enterprises.

Fortunately, cloud computing can become the medium via which organisations can acquire cheap, commodity resources on-demand while also being able to achieve certain benefits, including: outsourcing infrastructure management with reduced cost, flexible resource management, and elasticity. Such benefits can certainly enable improving and optimally controlling BP performance.

However, as cloud computing handles only the infrastructure level, an organisation now faces the hard and as yet unsolved problem of aligning the business level with the IT level. Moreover, many organisations lack the expertise and know-how to use and combine the cloud services offered.

The above problems can be solved by combining BP management with cloud computing to realise the BP as a service (BPaaS) paradigm, which enables migrating BPs to the Cloud and managing them more effectively there [6, 31, 32]. However, such a combination is not trivial as it leads to the following challenges, which especially concern the BP design lifecycle activity: (a) how to map a BP to a technical workflow with a suitable automation level; (b) how to align business terms and requirements with technical ones to drive the selection of the most suitable services to be then integrated into the workflow; (c) how to deal with the service incompatibility problem effectively to guarantee the correct execution of the designed workflow. The latter problem relates to checking the syntactic compatibility of messages exchanged between two or more selected workflow services.

To realise the vision of BPaaS, the CloudSocket project (www.cloudsocket.eu) delivers a platform that unifies environments supporting different BP lifecycle activities. This paper presents our contribution in the form of a BPaaS Design Environment able to deal successfully with all aforementioned challenges. This translates to introducing an innovative architecture with suitable components that support: smart and semantic service discovery at both business and technical levels, optimal cross-level service selection, mapping between business and technical requirements and mediation between the execution of two or more services to achieve message-level compatibility. As a result, the developed environment enables a BPaaS provider to transform the initial business functional and non-functional requirements that match the needs of potential BPaaS customers into an executable workflow. That workflow can then be deployed in the cloud by using other CloudSocket environments.

The BPaaS Design Environment was built by combining state-of-the-art components with two novel ones. The first novel component, the business matchmaker, finds services that satisfy the user functional and non-functional requirements at the business level by following a novel questionnaire-based approach. Such services are then filtered and selected by employing state-of-the-art technical service matchmaking and selection components. Service selection relies on the second novel component, the syntactic matchmaker, which is able to infer the message-based compatibility between two or more selected services and produce a mapping specification. This specification can then be exploited by a mediation service to support the compatible message transformation between services and thus guarantee the smooth operation of the BPaaS workflow in which this mediation service is integrated.

This paper is structured as follows. Section 2 briefly reviews existing research results, some of which are exploited in the production of the BPaaS Design Environment. Section 3 analyses the environment’s main architecture and explains the main functionality and role of its components. Sections 4 and 5 detail the architecture’s two main novel components, the business and syntactic matchmakers. Section 6 introduces a use case to demonstrate the main benefits of the proposed environment and to validate it. Finally, the last section concludes the paper and draws directions for further research.

2 Background

2.1 Business-to-IT Alignment

Business-to-IT alignment typically refers to the gap between business requirements and technical solutions [12]. Cloud offerings are technically described making it hard for business people to properly assess the best fitting cloud solution [33]. Thus, identifying suitable cloud solutions requires specifying requirements for and capabilities of a service in both a business and IT language. To ensure knowledge understandability and transparency, it is a common practice to represent knowledge in models [11, 29]. Models abstract away from complex realities and achieve precise modelling of the intended domain. In [13] we already adopted a model-driven approach where an extension of BPMN 2.0 allows modeling both BP requirements for business and workflows/cloud services in a technical language. That approach includes translating the business to the technical language to enable matching process requirements and workflow/cloud service capabilities. Translation and matching are performed by semantically lifting models with ontologies to make them machine-interpretable.

[2] defines Semantic Lifting as “the process of associating content items with suitable semantic objects as metadata to turn unstructured content items into semantic knowledge resources”. Semantic Lifting shifts the purpose of modelling beyond transparency and communication [14]. The interpretable knowledge base (ontology) allows reaching higher system automation levels based on models [13]. For example, an ontology-based early warning system assessing supply chain risks was proposed in [8], while in [7] ontologies are combined with a case-based reasoning approach to support workplace learning. Closer to our current problem, [9] introduced the AML ontology for automatic identification of correspondences between BP model activities. Similar BP matching approaches are described in [1]. Such approaches are not sufficient for BP-to-workflow matching, as a BP is far less detailed than a workflow, so that a BP activity is most likely to refer to a whole workflow fragment. Owing to this difference in the degree of detail between the two levels, such approaches suffer from inaccurate matching, a limitation that our approach addresses and substantially improves upon.

Approach. We follow a model-driven approach which performs domain-specific conceptualization (mapping to well-known benefits [10, 17, 25]) on two levels, where one targets BP users while the other targets IT service experts. This allows designing domain-specific models capturing suitable domain knowledge on both levels. This approach builds upon the findings in [13] but adopts a different perspective on business-to-IT alignment in the Cloud. Namely, there is a shift from language translation to the mapping of values between requirements and specifications on both the business and IT levels, separately. Hence, the Business-IT alignment paradigm is applied sequentially by further refining results from the business to the IT level. As such, 3 matchmaking components are proposed: (a) the business and (b) technical matchmakers enhanced with formal semantics for machine-interpretation plus (c) the syntactic matchmaker. The combination of these 3 components allows identifying the most suitable cloud services in both business and technical terms that will eventually form a workflow.

2.2 Technical Service Matchmaking

Technical service matchmaking involves functional and QoS matching. Functional matching usually focuses on I/O-based matching [18, 26] while QoS matching takes the view of QoS as conformance [20] and employs different kinds of techniques [21] to infer if the service’s solution space is included in that of the request. While most work focuses on one aspect individually, some approaches consider both aspects simultaneously [3, 16]. However, they usually sequentially combine the matching in both aspects and do not employ semantic techniques, thus not exhibiting the right performance and accuracy level.

As such, our previous work [23] explored different ways the 2 matching types can be jointly performed: (a) sequential combination; (b) parallel combination; (c) subsumes-based combination. The experimental evaluation of these combinations showed that the parallel one leads to the best possible results with respect to performance, as matchmaking accuracy is perfect in all combinations.

Our approach exploits two aspect-specific matchmakers, a functional and a non-functional one. The functional one is a state-of-the-art matchmaker developed in the Alive project [4] which relies on the combination of I/O-based and IR-based matching. It exploits a smart graph-based structure to dynamically tolerate changes in domain ontologies (i.e., the ontologies via which service I/O is annotated) as well as to supply almost constant-time query operations over the graph.

The non-functional one, the unary matchmaker [21], follows a hybrid QoS service matching approach. First, it aligns ontology-based service specifications based on their QoS terms. Then, it performs service filtering in a step-wise manner by considering each QoS term individually in each step. As unary constraints are assumed to be involved in service offers and demands, the matchmaker employs smart structures to support term-based filtering, which results in a very low matching time.

2.3 Service Selection

Service selection work usually considers only one abstraction level by also neglecting semantics, thus producing results of imperfect accuracy. Accuracy is further reduced as some algorithms employ smart but non-optimal solving techniques, like Genetic Algorithms to accelerate the service selection time.

As service selection for a BPaaS includes different abstraction levels, we have developed a cross-level constraint-based algorithm [22] which exhibits the following features: (a) handling of multiple optimisation objectives by employing the Analytic Hierarchy Process (AHP) [27] and Simple Additive Weighting (SAW) [15] techniques; (b) the capability to bridge the gap between the two levels (SaaS and IaaS) via inserting functions that derive the QoS at the SaaS level based on the capabilities selected at the IaaS level; (c) the addressing of overconstrained requirements by employing smart utility functions that allow slightly violating these requirements so as to produce at least one solution; (d) consideration of dependencies between QoS parameters at the same level enabling a more accurate evaluation of respective solutions; (e) the capability [19] to accelerate solving time by fixing parts of the problem to certain partial solutions by relying on the BPaaS execution history.

State-of-the-Art Advancement. The proposed BPaaS Design Environment advances the state-of-the-art by innovatively combining existing features, such as holistic technical service matchmaking, with new ones. The innovative business matchmaker follows a dynamic questionnaire-based approach enabling business users to answer a minimum set of questions before the mapping of the designed BP to a set of services, able to realise its functionality, can be produced. Such an approach is more natural and user-intuitive as it employs questions mapped to a natural language with terms drawn from the business domain. It also supports producing a minimal set of services to be further filtered and selected based on technical requirements such that the solution space is significantly reduced and service discovery time accelerated.

The novel syntactic matchmaker enables producing a correct executable workflow via the suitable integration of services at the technical level based on their message compatibility. Such compatibility is guaranteed by generating mapping specifications that are exploited by mediation tasks incorporated in the generated workflow. Finally, our framework addresses all layers involved in a BPaaS system along with their dependencies thus being able to produce a more complete and optimal BPaaS design product.

3 Architecture

The creation of the BPaaS Design Environment was underpinned by the design science research (DSR) methodology in [30]. First, the literature on Business-IT alignment in the Cloud was screened. Then, CloudSocket created the settings to contribute to the problem awareness: application scenarios were created in workshops involving both industrial and scientific experts. The results and insights were used to suggest the BPaaS Design Environment’s first draft, which was then finalised in a web-based solution through continuous development. Finally, as shown in Sect. 6, the validation took place with respect to the application scenario most widely agreed upon among the members of the CloudSocket consortium.

The BPaaS Design Environment follows a model-driven and semantics-aware approach for business-to-IT alignment in the cloud which comprises 3 main transformation steps: (a) BP-to-business-services; (b) business-services-to-technical-services; (c) BP & technical-services-to-executable-workflow. The approach guarantees the produced solution’s technical feasibility by employing a two-step service matchmaking process at both business and technical levels and a service selection algorithm that is syntactic-compatibility-aware at the technical level.

To achieve its main goal, the environment exhibits an architecture, depicted in Fig. 1, comprising 8 main components that are now analysed in detail. Some components correspond directly to some of the aforementioned steps while others play a supporting or orchestration role.

Fig. 1. The architecture of the BPaaS Design Environment

BPaaS Designer (BD). It represents the main point of interaction with the user during BPaaS design. It enables specifying both domain-specific BPs and executable workflows. It also guides users in providing suitable input to support the BP-to-workflow alignment.

Orchestrator (Orch). Orchestrates its underlying components to handle requests issued by the BPaaS Designer.

Business Matchmaker (BM). Matches the cloud services registered in the Knowledge Base based on business requirements derived from a questionnaire-based approach explained in Sect. 4.

Technical Matchmaker (TM). It exploits technical state-of-the-art aspect-specific matchmakers in a parallelised fashion according to the approach in [23].

Service Selector (SS). It [22] produces a concrete optimal solution for the service-based workflow at hand by considering the user technical non-functional requirements while also attempting to maximise the message compatibility between services by exploiting the next component.

Syntactic Matchmaker (SM). Called dynamically by the SS while solving the service selection problem to find the message compatibility [24] between the next and all previously selected services in each BPaaS workflow’s execution path where such a service participates. When an incompatible solution is constructed, SS can backtrack and check another one. To smartly deal with cases where the same call is issued, e.g., due to deep backtracking, SM stores the call results to immediately answer it. The mapping of the output parameters to the input ones of the next service is also recorded to enable updating the BPaaS workflow via a mediation service, as performed by the next component.

Workflow Updater (WU). Updates the BPaaS workflow by performing the following actions for each workflow’s execution path: (a) replays the solution construction in each path to obtain the mapping of the current service in the path from the SM; (b) introduces a mediation service within the workflow, immediately before the current service, which takes as input the current output parameter set and the mapping specification and produces as output the input parameters of the current service.
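The mediation step introduced by the WU can be illustrated with a minimal sketch. It assumes, purely for illustration, that a mapping specification is a dictionary from each input parameter name of the current service to the name of an output parameter produced earlier in the execution path; the function name `mediate` and all parameter names are hypothetical and are not taken from the CloudSocket implementation.

```python
def mediate(output_params, mapping_spec):
    """Build the input parameters of the current service.

    output_params: values produced so far in the execution path (name -> value).
    mapping_spec:  assumed format, input parameter name -> output parameter name.
    """
    missing = [out for out in mapping_spec.values() if out not in output_params]
    if missing:
        raise KeyError(f"mapping refers to unknown output parameters: {missing}")
    # Copy each mapped output value under the name the current service expects.
    return {inp: output_params[out] for inp, out in mapping_spec.items()}


# Hypothetical output of a CRM service feeding a hypothetical invoicing service.
crm_output = {"customerName": "ACME", "customerMail": "billing@acme.example", "crmId": 17}
spec = {"recipient": "customerName", "email": "customerMail"}
invoice_input = mediate(crm_output, spec)
```

In this sketch, output parameters that no input parameter needs (here `crmId`) are simply dropped, which reflects the requirement that the output must carry at least the information required by the input.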

Knowledge Base (KB). Includes all necessary and sufficient information to support all reasoning/matching/selection tasks executed in the system.

4 Business Matchmaking

The Business Matchmaker allows specifying requirements in a more user-centric approach than that in [13]. It relies on a context-adaptive questionnaire that guides the user via a set of questions reflecting BP functional and non-functional requirements. Follow-up questions are displayed based on the result of a prioritisation algorithm that considers: (a) user preferences in terms of categories (e.g., Performance rather than Data Security); (b) information value (or entropy) of semantic attributes reflecting cloud service specifications at the business level, e.g., how distinguishing an attribute, such as monthly downtime, is for service filtering. Namely, the higher the entropy value of an attribute, the higher its service distinguishability degree, and thus the higher the assigned priority of the related question. This approach leads to the least possible number of questions being answered, thus reducing the business service matching time. The idea is that the questionnaire can be applied on the whole BP first. If no service is found, we then move down to groups of activities, until the level of single activities.

4.1 The Context-Adaptive Questionnaire

The Context-Adaptive Questionnaire relies on our BPaaS ontology [11]. Questions focus first on functional requirements and then on non-functional ones. The questionnaire enables the user to specify functional requirements in two ways by:

  • inserting an action and object from a predefined taxonomy in the BPaaS ontology. This corresponds to the convention of BPMN to name activities by a verb (i.e., action) and a noun (object) [28] whose combination provides the “what-is-about” knowledge.

  • inserting the most suitable category from the APQC Process Classification Framework.

Next, the user can choose one of the 5 non-functional (NF) categories: Data Security, Payment, Performance, Service support, and Target Market.

The NF categories were derived from the Cloud Service Agreement Standardisation Guidelines [5], published by the EC to standardize and streamline the terminology and understanding of cloud services. The NF categories were subsequently discussed and validated within the CloudSocket consortium. As a result, a set of questions and sub-questions was derived from them. For instance, the Performance category includes questions like the following:

  • What is your preferred monthly downtime in minutes?

    Possible answer: 30 min

  • Should the process be executed on a daily, weekly, monthly or yearly basis?

    Possible answer: On a weekly basis

  • What is your favorite response time level?

    Possible answer: High, Medium or Low

  • How many simultaneous users should the cloud service support?

    Possible answer: at most 10

For each question, we distinguish among 4 types of answers: (1) single-answer selection; (2) multi-answer selection; (3) search-insert; (4) value-insert. Value- and search-insert require user input. While the former enables inserting attribute values (e.g., the aforementioned downtime), the latter enables crawling predefined values from the ontology and selecting the suitable one. For instance, answers related to the first 3 functional requirement questions (Action, Object and APQC category) are of search-insert type. Namely, users can insert keywords for the BP they are looking for, and the ontology returns the concepts matching these keywords. Figure 2 shows the implementation of this functionality.

Fig. 2. The object selection for the functional requirements posing

Each time a question is answered, semantic rules are applied to convert implicit knowledge reflecting the business requirements into an explicit one. This prepares the ground to identify matching cloud services by applying a semantic query. For example, assume we have the following:

Specifications from the KB as follows:

  • A cloud service with the execution constraint of 20 times per day.

Requirements from the questionnaire as follows:

  • Should the process be executed on a daily, weekly, monthly or yearly basis?

    Answer: At least on a weekly basis.

  • How many times should the process be executed?

    Answer: At least 10 times

Running a process at least on a weekly basis implies that it can also run on a daily basis. The semantic rule, therefore, would infer the answer “On a daily basis” and insert it in the KB. The semantic query then compares the derived fact with the cloud service fact related to the execution constraint. As a result, the cloud service specification matches the requirement.
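The inference in this example can be mimicked with a small sketch. It is a plain Python stand-in for the ontology-based rule and query, not the actual semantic rules used in the environment; the ordering of periods, the dictionary keys and the reading of "20 times per day" as an upper bound on executions are all assumptions made for illustration.

```python
# Periods ordered from most to least frequent (assumed ordering).
PERIODS = ["daily", "weekly", "monthly", "yearly"]


def infer_acceptable_periods(at_least):
    """Rule: 'at least on a <period> basis' also admits every more frequent period."""
    return PERIODS[: PERIODS.index(at_least) + 1]


def matches(service, requirement):
    """Query: compare the derived facts with the service's execution constraint."""
    acceptable = infer_acceptable_periods(requirement["at_least_period"])
    return (service["period"] in acceptable
            and service["max_executions"] >= requirement["min_executions"])


# Specification from the KB: execution constraint of 20 times per day.
service = {"period": "daily", "max_executions": 20}
# Requirements from the questionnaire: at least weekly, at least 10 times.
requirement = {"at_least_period": "weekly", "min_executions": 10}

derived = infer_acceptable_periods("weekly")
is_match = matches(service, requirement)
```

Here the rule makes the implicit fact "daily is acceptable" explicit before the comparison takes place, which is why the service is reported as a match.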

4.2 Question Prioritisation Algorithm

The NFR questions follow a question prioritisation algorithm. This enables identifying the matching cloud services by asking as few questions as possible. Answers to the questions, along with previous ones, are used to display the follow-up question. The algorithm considers the following:

  • Grouping among non-functional attributes. For instance, if the user selects to answer one from availability and response time attributes of the Performance category, the follow-up question will be on the other attribute in this category.

  • Entropy expressing the variation degree in the values of each non-functional attribute. Entropy of an attribute is “0” when every cloud service stored in the KB contains the same attribute value, while “1” in the opposite case.

The entropy formula is expressed as follows:

$$Entropy\left( attr_i\right) =-\sum _{j=1}^J\left( p_{ij}\cdot log_2\left( p_{ij}\right) \right) $$

where J is the total number of attribute values and \(p_{ij}\) is the probability that a certain attribute value \(val_{ij}\) of attribute \(attr_i\) appears in a certain cloud service. As this probability can be regarded as independent and uniform across all attribute values, \(p_{ij}\) can be expressed as: \(p_{ij}=\frac{[CS ]_{csval_{ik}=val_{ij}}}{[CS ]}\) where the numerator denotes the number of cloud services that exhibit the respective attribute value (\(csval_{ik}\) denotes the value of \(attr_i\) for cloud service k) and the denominator the number of all services.
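A minimal sketch of this computation is given below. The function name and the sample attribute values are hypothetical; only the formula itself comes from the text.

```python
from collections import Counter
from math import log2


def attribute_entropy(values):
    """Entropy of one attribute, given its value csval_ik for every cloud service k."""
    total = len(values)
    counts = Counter(values)
    # p_ij = (services exhibiting val_ij) / (all services); sum of -p_ij * log2(p_ij).
    return sum(-(n / total) * log2(n / total) for n in counts.values())


# Hypothetical values of two attributes over four cloud services.
same_everywhere = attribute_entropy([30, 30, 30, 30])
evenly_split = attribute_entropy(["High", "High", "Low", "Low"])
```

An attribute whose value is identical across all services yields zero entropy, so asking about it cannot filter anything, whereas an attribute whose values split the services evenly is the most useful one to ask about first.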

The prioritisation algorithm’s signature and main logic are as follows.

Input.

  • Already stated variables: \(attr\), \(CS\), \(val\), \(csval\).

  • The set of non-functional categories \(C=\){Data Security, Payment, Performance, Service support, Target Market}.

  • Set of tuples \(<attr_i,Q_l>\) where Q is the set of questions and \(Q_l\) is a certain question where \(1 \le l \le [Q]\). So, each tuple maps 1 attribute to 1 question.

Output. The filtered set of cloud services CS that match with the content of the questionnaire, i.e., questions and answers.

Business Logic.

  1. IF the number of categories left is positive (\(\left| C\right| >0\)), select a category \(c_n\), ELSE exit.

  2. IF \(c_n\) has a positive number of semantic attributes left, i.e., \(\left| attr_i \text { s.t } attr_i.cat=c_n \right| > 0\), THEN calculate the entropy of all the selected category’s attributes, ELSE remove the current category \(c_n\) from C and go to (1).

  3. Select attribute \(attr_i\) with highest entropy.

  4. Display question \(Q_l\) that is mapped with the \(attr_i\).

  5. Get user answer mapping to a value \(val_{ij}\) of attribute \(attr_i\).

  6. Filter out of CS the services which do not satisfy the condition: \(csval_{ik}=val_{ij}\).

  7. Remove the semantic attribute \(attr_i\) from the category \(c_n\) and go to (2).

  8. Exit.
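The steps above can be sketched as follows. This is an illustrative reading of the listing rather than the implemented algorithm: services are assumed to be dictionaries from attribute names to values, the callbacks `pick_category` and `ask` stand in for the user interaction, and the filter uses the equality condition stated in step 6. All names and sample data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(services, attr):
    """Entropy of `attr` over the services that are still candidates."""
    total = len(services)
    counts = Counter(s[attr] for s in services)
    return sum(-(n / total) * log2(n / total) for n in counts.values())


def run_questionnaire(services, categories, questions, pick_category, ask):
    """Filter `services` by asking the questions of each category in entropy order."""
    remaining = {c: list(attrs) for c, attrs in categories.items()}
    while remaining:                                     # step 1
        c = pick_category(list(remaining))
        while remaining[c]:                              # step 2
            # Step 3: attribute with the highest entropy among those left.
            attr = max(remaining[c], key=lambda a: entropy(services, a))
            answer = ask(questions[attr])                # steps 4 and 5
            services = [s for s in services if s[attr] == answer]  # step 6
            remaining[c].remove(attr)                    # step 7
        del remaining[c]                                 # category exhausted
    return services                                      # step 8


# Hypothetical service profiles and scripted answers.
profiles = [
    {"name": "S1", "users": 200, "downtime": 30},
    {"name": "S2", "users": 500, "downtime": 30},
    {"name": "S3", "users": 500, "downtime": 30},
]
categories = {"Performance": ["downtime", "users"]}
questions = {"users": "How many simultaneous users?", "downtime": "Preferred monthly downtime?"}
scripted = {"How many simultaneous users?": 500, "Preferred monthly downtime?": 30}
asked = []


def ask(question):
    asked.append(question)
    return scripted[question]


result = run_questionnaire(profiles, categories, questions, lambda cs: cs[0], ask)
```

In this run the question on simultaneous users is asked first, because all three hypothetical services share the same downtime value and that attribute therefore has zero entropy.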

5 Syntactic Matchmaking

Business/technical matching cannot guarantee the message compatibility between selected services in a BPaaS workflow. Such compatibility is thus a hard constraint in service selection for producing optimal, message-compatible solutions that can be safely executed. As such, the SM was developed to derive such compatibility and offer it as a function to SS.

The main idea is that the SM should first find which output messages of previously selected services match which input messages of the currently selected service (based on SS’s solution generation process) for each execution path in the BPaaS workflow. Then, it should check for each message-to-message match whether the first message conveys less information than that required by the second message. If this check succeeds, the execution path’s considered services are not compatible. When all message pair matches are compatible, the considered services are message-compatible.

Message Matching. The first message compatibility step can rely on existing semantic service annotations to easily and rapidly discover matching message pairs, as the messages involved in these pairs should map to semantically compatible concepts. However, even in the presence of such knowledge, message matching is not trivial and follows a two-step process involving semantic & syntactic message matching. This process is exemplified via the example of a certain service pair involving service \(S_2\) with 2 input parameters mapped to ontology concepts A & B and service \(S_1\) with 2 output parameters mapped to ontology concepts C & D.

At the semantic level, a bipartite matching approach is followed checking whether every parameter of the current service has a mapping to one parameter of the previously selected services (or the initial user input) in a certain execution path and attempting to discover a solution with the lowest overall distance. As such, we first define a local matching degree between two parameters to be the distance between the parameters’ annotation concepts in the ontology subsumption hierarchy, provided that the second parameter’s concept subsumes the first parameter’s one. If the latter does not hold, the distance is infinite. This guarantees that no information loss occurs as in the opposite case, the more concrete concept in the \(S_2\) input will require specifying additional pieces of information than those exhibited in the concept in the \(S_1\) output. A mapping solution’s overall distance is then the sum of the distances of the matches found. As such, the matching problem can be defined as follows:

$$\begin{aligned} min \left. {\left\{ \begin{array}{ll} \frac{1}{[J ]} \cdot \left( \sum \nolimits _{i \in I}^{j \in J} \left( \frac{dist\left( M_i,N_j\right) }{maxPSize}\cdot x_{ij}\right) + \sum \nolimits _{j \in J}\left( 1-\sum \nolimits _{i \in I}x_{ij}\right) \right) \\ \sum \nolimits _{j\in J}x_{ij}\le 1\\ \sum \nolimits _{i\in I}x_{ij}\le 1\\ i=[1,\ldots ,[I]],j=[1,\ldots ,[J]] \end{array}\right. } \right\} \end{aligned}$$

where I and J are the sets of output and input parameters, respectively, \(x_{ij}\) is a decision variable indicating whether the output parameter i matches the input one j, \(dist\left( M_i,N_j\right) \) is the distance between annotation concepts \(M_i\) and \(N_j\) of the parameter pair, while maxPSize represents the maximum subsumption path length in the respective domain ontology used.

Suppose that the following relations hold in the running example: A subsumes B and C, while both B and C subsume D. In this respect, the best possible matching is \(\{A\rightarrow C, B\rightarrow D\}\) with an overall distance of 2. The other matching solution \(\{A\rightarrow D, B\rightarrow C\}\) is not selected as the local distance between B & C is infinite, so the overall distance is also infinite.
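The running example can be reproduced with a short sketch. It enumerates all one-to-one assignments by brute force instead of solving the optimisation problem with a solver, and it sums the raw local distances as in the example rather than the normalised objective; the data structure and function names are assumptions made for illustration.

```python
from itertools import permutations
from math import inf

# Subsumption hierarchy of the running example: concept -> its direct subsumers.
PARENTS = {"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}


def dist(general, specific):
    """Shortest subsumption path from `specific` up to `general`; inf if none exists."""
    if general == specific:
        return 0
    return min((1 + dist(general, p) for p in PARENTS.get(specific, ())), default=inf)


def best_matching(inputs, outputs):
    """Return (overall distance, {input concept: output concept}) of the cheapest matching."""
    best_cost, best_map = inf, None
    for perm in permutations(outputs, len(inputs)):
        cost = sum(dist(i, o) for i, o in zip(inputs, perm))
        if cost < best_cost:
            best_cost, best_map = cost, dict(zip(inputs, perm))
    return best_cost, best_map


# S_2 has inputs annotated with A and B; S_1 has outputs annotated with C and D.
cost, mapping = best_matching(["A", "B"], ["C", "D"])
```

Brute-force enumeration is only reasonable for the handful of parameters a service message typically has; for larger instances an assignment solver would be the natural replacement.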

The algorithm then proceeds at the syntactic level by considering only those message pairs with a finite local degree of match. For each message pair filtered, we note the information items for the output parameter and those of the input parameter and then we check whether the former include the latter. As the information items have been already matched to ontology concepts, we perform this checking by replacing the information items with the attributes of the ontology concept. Even if the concepts matched are not identical, as they are related with a subsumption relation, they will have common attributes. So, the problem then is mapped to checking whether the concept attributes of the output parameter form a superset of those of the input parameter.

Message types might also convey information not included in an ontology, which requires a different kind of matching. This matching’s logic is similar to that for the semantic level. In particular, bipartite matching is performed, with the exception of how the distance is calculated at the local level. At that level, we consider both how similar the field names are and how close their types are. Name similarity can rely on well-known string distance measures (e.g., Levenshtein) while type similarity relies on the approach in [24] mapping to the compatibility level between types. The local overall distance would then equal the weighted sum of the two different distances.
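A possible form of this local distance is sketched below. The Levenshtein routine is the standard dynamic-programming one; the normalisation of the name distance, the equal weights and the type-distance table are assumptions made for illustration, as the actual compatibility levels of [24] are not reproduced here.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def name_distance(a, b):
    """Edit distance scaled to [0, 1] (the scaling is an assumption)."""
    longest = max(len(a), len(b))
    return levenshtein(a.lower(), b.lower()) / longest if longest else 0.0


# Hypothetical (output type, input type) -> distance table.
TYPE_DISTANCE = {("int", "float"): 0.25, ("int", "string"): 0.5, ("float", "string"): 0.5}


def type_distance(out_type, in_type):
    """0 for identical types, table value if listed, 1 (incompatible) otherwise."""
    if out_type == in_type:
        return 0.0
    return TYPE_DISTANCE.get((out_type, in_type), 1.0)


def field_distance(out_field, in_field, w_name=0.5, w_type=0.5):
    """Weighted sum of name and type distance for two (name, type) fields."""
    (out_name, out_type), (in_name, in_type) = out_field, in_field
    return w_name * name_distance(out_name, in_name) + w_type * type_distance(out_type, in_type)


d_same = field_distance(("invoiceId", "int"), ("invoiceId", "int"))
d_close = field_distance(("invoiceId", "int"), ("invoice_id", "float"))
```

The resulting values can be fed as local distances into the same kind of bipartite matching that is used at the semantic level.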

If all input parameter parts are matched, the compared messages are semantically compatible. Otherwise, the compared services are semantically incompatible.

Let us continue the running example to explain syntactic matchmaking. Suppose that A & C were found equivalent. C maps to message type \(MT_1\) containing 4 information pieces \(MT_{11}\), \(MT_{12}\), \(MT_{13}\) and \(MT_{14}\). A maps to message type \(MT_2\) containing 3 information pieces \(MT_{21}\), \(MT_{22}\), and \(MT_{23}\). Based on matching message types to ontology concepts, we have that \(MT_{11}\) and \(MT_{21}\) map to A.A1 while \(MT_{12}\) and \(MT_{22}\) map to A.A2. Thus, the information pieces are transformed into \(\{A.A1, A.A2, MT_{13}, MT_{14}\}\) for the first message type and \(\{A.A1, A.A2, MT_{23}\}\) for the second. For those pieces not mapping to ontology attributes, we solve a bipartite matching problem again. Suppose that \(dist\left( MT_{13},MT_{23}\right) =0.8\) and \(dist\left( MT_{14},MT_{23}\right) =0.2\). Then, the sole mapping to be selected will be \(\{MT_{13}\rightarrow MT_{23}\}\). If we replace \(MT_{23}\) with \(MT_{13}\), we then need to check whether \(\{A.A1, A.A2, MT_{13}\}\) is a subset of \(\{A.A1, A.A2, MT_{13}, MT_{14}\}\), which holds.
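The final inclusion check of this example can be written down directly. The sketch takes the mapping selected in the text as given and only performs the replacement and the subset test; the identifiers are plain-string stand-ins for the information pieces, and the function name is hypothetical.

```python
def is_syntactically_compatible(output_items, input_items, mapping):
    """Check that every input item, after renaming via `mapping`, is provided by the output.

    mapping: input item -> matched output item (for items without an ontology attribute).
    """
    renamed = {mapping.get(item, item) for item in input_items}
    return renamed <= output_items


# Information pieces of the running example after lifting to ontology attributes.
mt1_items = {"A.A1", "A.A2", "MT13", "MT14"}   # output message type MT_1
mt2_items = {"A.A1", "A.A2", "MT23"}           # input message type MT_2
selected = {"MT23": "MT13"}                     # mapping selected in the text

compatible = is_syntactically_compatible(mt1_items, mt2_items, selected)
unmapped = is_syntactically_compatible(mt1_items, mt2_items, {})
```

Without the mapping the check fails, since the piece \(MT_{23}\) has no counterpart in the output message.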

6 Validation

Our approach was validated based on a use case developed by CloudSocket’s industrial partners. We focused on a very common BP among SMEs, the Send Invoice one. This BP is modelled in BPMN, see Fig. 3, via our BPaaS Design Environment. It starts with the “Manage Customer Relationship” activity; next an exclusive gateway splits the BP flow between either creating a new invoice or updating an existing one. Then, invoice completeness is checked, and finally the invoice is sent. Subsequently, starting with this BPMN process, we acquaint the reader with a prerequisite plus the main steps involved in our approach.

Fig. 3. The Send Invoice business process in BPMN 2.0

Prerequisite Step: Service Profile Registration. The following services were inserted in the KB as instances of CloudService class:

  • YMENS, Zoho and Sugar CRM were inserted as CRM systems which were annotated with the action Manage, the object Customer and the APQC category 3.5.2.4 Manage Customer Relationship

  • Mathema Document Generator, Open Source Billing, Simple Invoice and InvoiceNinja as invoicing systems annotated with action Generate, object Invoice and APQC category 9.2.2.2 Generate Customer Billing Data

  • Gmail, Ninja_email and Mailjet were inserted as e-mail systems which were annotated with the action Manage, the object Invoice and the APQC category 9.2.2.3 Transmitting Billing Data to Customers

Table 1 shows a part of the non-functional profiles of the considered services.

Table 1. Functional requirements for each group and single activity

First Main Step: Business Matchmaking. BM was used to identify the most suitable cloud services. As a first step, the questionnaire was applied to the whole BP (see the starting notebook in Fig. 4).

Fig. 4. The starting notebook for the whole process

We specified the functional requirements in the first 3 questions (action Send, object Invoice and APQC category 9.2.2 Invoice Customer), and none of the cloud services matched.

Next, the questionnaire was applied to two single activities (i.e., Manage Customer Relationship and Send Invoice) as well as to a group of activities (i.e., Create Invoice, Update Invoice and Check Invoice Completeness).

Table 2 shows the functional requirements for each activity/group. In the first case, after specifying the action, object and APQC category, the questionnaire showed the 3 matching cloud services: YMENS, Zoho and SugarCRM. In the 4th question, we chose the Performance category, which triggered the question prioritisation algorithm. The question regarding the number of simultaneous users was asked first (it is the attribute with the highest entropy) and a value of 500 was entered. This filtered out SugarCRM, as it supports at most 200 simultaneous users.
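The entropy-driven choice of the next question can be illustrated with a small Python sketch. It is only a sketch under assumptions: Shannon entropy over the attribute values of the remaining candidates is assumed as the prioritisation criterion, the attribute names are hypothetical, and all profile values except SugarCRM's limit of 200 simultaneous users are invented for illustration.

```python
from collections import Counter
from math import log2

# Hypothetical profiles; only SugarCRM's 200-user limit comes from the text.
candidates = {
    "YMENS": {"max_simultaneous_users": 1000, "availability": 99.9},
    "Zoho": {"max_simultaneous_users": 500, "availability": 99.9},
    "SugarCRM": {"max_simultaneous_users": 200, "availability": 99.9},
}


def entropy(values):
    """Shannon entropy (in bits) of the value distribution of one attribute."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def next_question(cands):
    """Pick the attribute whose values discriminate most among candidates."""
    attributes = next(iter(cands.values())).keys()
    return max(attributes, key=lambda a: entropy([p[a] for p in cands.values()]))


asked = next_question(candidates)
required_users = 500  # the answer entered in the validation scenario
remaining = [name for name, p in candidates.items() if p[asked] >= required_users]
print(asked, remaining)
```

An attribute on which all candidates agree has zero entropy, so asking about it could never narrow the candidate set; this is why the high-entropy attribute is asked first.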

Table 2. Functional requirements for each group and single activity

Similarly, we applied the questionnaire to the designated group of activities. The matching services were InvoiceNinja and Open Source Billing (see Fig. 5a).

Fig. 5. The selected services for the last two activity groups

Finally, we applied the questionnaire to the last BP activity: Send Invoice. The matching cloud services were Ninja_Email and Mailjet (see Fig. 5b).

Second Main Step: Technical Matchmaking & Selection. As business matchmaking yields two candidate services for each activity (group), we now proceed with technical matching and selection. Suppose that the user provides the following global requirements for the whole process: \(cost < 100\) euros per month, \(cycle time < 1\) min and \(VPM < 16\) (number of vulnerabilities per month). Further, suppose that the user imposes the following constraints on the Manage Customer Relationship activity: \(response time < 30\) s and \(VPM < 10\). Finally, Table 3 depicts the non-functional profiles of the remaining services.

Table 3. The technical non-functional offerings of the 6 services

Technical non-functional matching would then filter out Zoho CRM, as it does not conform to the local constraints posed for the CRM activity. This leaves 4 candidate solutions to select from, as we have one candidate for the first activity and 2 candidates for each of the remaining two activity groups. However, while running service selection, it is detected that Ninja_Email and Open Source Billing are incompatible, which leaves us with 3 solutions. Moreover, the solution mapping to selecting YMENS, Open Source Billing and Ninja_Email has VPM equal to 16, violating the respective global constraint. So, in the end, we need to select between 2 solutions, which are depicted in Table 4.
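The first two counts above (4 candidate solutions, then 3 after the compatibility check) can be reproduced with a short Python sketch. The data structures and the function name are hypothetical, and the check assumes that only services adjacent in the sequential workflow exchange messages.

```python
from itertools import product

# Remaining candidates per activity (group), in workflow order.
candidates = [
    ["YMENS"],                                # Manage Customer Relationship
    ["Open Source Billing", "InvoiceNinja"],  # invoice creation/update/check
    ["Ninja_Email", "Mailjet"],               # Send Invoice
]

# Incompatible pair reported during service selection.
incompatible = {frozenset({"Open Source Billing", "Ninja_Email"})}


def compatible(solution):
    """A solution is kept if no two adjacent services are incompatible."""
    return not any(
        frozenset(pair) in incompatible
        for pair in zip(solution, solution[1:])
    )


all_solutions = list(product(*candidates))
feasible = [s for s in all_solutions if compatible(s)]
print(len(all_solutions), len(feasible))
```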

Table 4. The final ordered solutions produced

As the broker needs to optimise all non-functional terms (cost, cycle time and VPM), it gives them equal weight. Considering also that the activities are sequentially executed in the BPaaS workflow, the final result maps to selecting the services YMENS, InvoiceNinja, and Ninja_Email. While there is perfect syntactic compatibility between InvoiceNinja and Ninja_Email, as they are offered by the same company, in the case of YMENS CRM and InvoiceNinja the message types are compatible but still need to be aligned (e.g., the attributes accountid and id_number map to the same attribute id of concept Client). As such, the MS service was included between these 2 services, resulting in a workflow with 4 sequentially executed services (YMENS CRM \(\rightarrow \) MS \(\rightarrow \) InvoiceNinja \(\rightarrow \) Ninja_Email).
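The equal-weight ranking over a sequential workflow can be sketched as follows. This is an illustration under several assumptions, not the selection algorithm itself: all per-service values are hypothetical (they are not the figures of Tables 3 and 4), the second solution with Mailjet is assumed, each term is aggregated by summation over the sequence, and the score is the mean slack relative to each global limit.

```python
# Hypothetical (cost EUR/month, cycle time s, VPM) per service; NOT from Table 3.
profiles = {
    "YMENS": (40.0, 20.0, 5),
    "InvoiceNinja": (20.0, 15.0, 4),
    "Ninja_Email": (10.0, 5.0, 3),
    "Mailjet": (15.0, 10.0, 4),
}

# Global limits from the validation scenario: cost < 100, cycle time < 60 s, VPM < 16.
limits = (100.0, 60.0, 16)


def aggregate(solution):
    """Sum each term over the services (assumed rule for sequential execution)."""
    return tuple(sum(column) for column in zip(*(profiles[s] for s in solution)))


def satisfies(solution):
    """All global constraints are strict upper bounds."""
    return all(value < limit for value, limit in zip(aggregate(solution), limits))


def score(solution):
    """Equal-weight mean of the normalised slack left under each limit."""
    slack = [1 - value / limit for value, limit in zip(aggregate(solution), limits)]
    return sum(slack) / len(slack)


solutions = [
    ("YMENS", "InvoiceNinja", "Ninja_Email"),
    ("YMENS", "InvoiceNinja", "Mailjet"),
]
ranked = sorted((s for s in solutions if satisfies(s)), key=score, reverse=True)
best = ranked[0]
print(best, aggregate(best))
```

With these invented values the first solution wins on every term, so any weighting would rank it first; equal weights matter only when the solutions trade one term off against another.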

7 Conclusions and Future Work

This paper has introduced a novel architecture for the design of BPaaS products that deals effectively with the business-to-IT alignment problem in order to map an initial domain-specific BP into an executable BPaaS workflow. The architecture has been designed and implemented to include components that each focus on a different part of the business-to-IT alignment problem: business and technical matchmakers, a service selection component, and an automatic workflow update component that addresses the message compatibility problem in service-based workflow execution.

Our future work will focus on more advanced research challenges, which include: (a) the automatic production of a more complete, closer-to-production workflow via the incorporation of different kinds of non-service tasks (see previous section); (b) the automatic population of the KB; (c) the coverage of additional cases in business-to-technical-requirement alignment.