1 Introduction

In today’s increasingly globalised, complex economy that is flooded with data, decision-makers have more need than ever to manage their business efficiently. To help them achieve this, we have developed a new multi-criteria performance management method that provides input for visual management and continuous improvement initiatives.

The most commonly used method of performance estimation is Data Envelopment Analysis (DEA; Charnes et al. 1978; Banker et al. 1984). The basic idea behind DEA is that global performance is given by the ratio of the sum of weighted output levels to the sum of weighted input levels (Chen and Iqbal Ali 2002). Although this useful method allows one to estimate performance without any ex ante assumptions about the form of the production function it has two main drawbacks. The first is that in a multi-input/multi-output context the evaluation is made in a ‘black box’ that does not give the decision-maker a clear visual representation of the frontier. This feature of DEA makes it more difficult to disseminate the information it provides within an organisation. The second drawback of DEA is that evaluation becomes problematic when the output–input ratios do not make much sense, as is the case when scale-independent data (ratios or indices) are mixed with scale-dependent data (Dyson et al. 2001; Cooper et al. 2007; Cook et al. 2014). Consider, for example, evaluation of universities’ performance; if an output such as research quality—evaluated on a 1–10 index and therefore not linked to the size of the university—is compared with an input such as expenditure, which is scale-dependent, then a small university will inevitably be rated efficient and the efficiency index is meaningless. Various artifices have been proposed to circumvent this problem; one of the most frequent is multiplying the scale-independent variable by the level of one of the scale-dependent variables. Obviously, the results are dependent on the variable chosen for the transformation, and this choice is problematic in the multi-input/multi-output case (Dyson et al. 2001; Cooper et al. 2007; Cook et al. 2014).

Another stream of research comprises the multi-criteria decision analysis (MCDA) methods. The analogy with DEA is striking if we replace the DEA term “DMU” (decision-making unit) with “actions”, “outputs” with “criteria to be maximised” and “inputs” with “criteria to be minimised”. MCDA methods have been developed to help decision-makers when they are faced with ambiguous and conflicting data (Ishizaka and Nemery 2013). Moreover, MCDA can also help to analyse the performance of actions (also called alternatives). For example, da Rocha et al. (2016) evaluated the operational performance of Brazilian airport terminals using Borda-AHP, Longaray et al. (2015) used MACBETH to evaluate Brazilian university hospitals, Nemery et al. (2012) evaluated the innovation capacity of small and medium enterprises (SMEs), Ishizaka and Pereira (2016) developed a hybrid ANP-PROMETHEE method for evaluating employee performance and Galariotis et al. (2016) used MAUT to evaluate French local governments.

Performance evaluations are often given in the form of a ranking, which represents a synthesis of the data and hides important information. In this paper we report the development of a new MCDA method, PROMETHEE Productivity analysis (PPA), which, coupled with a productivity graph, permits one to distinguish between four categories of action: efficient, effective, frugal and ineffective.

Performance analysis generally serves as a basis for developing corrective actions as part of a continuous improvement framework. It is important to have an understanding of the current situation, but the required analytical process can be complicated. Possible corrective strategies also need to be identified and discussed, and it has been proposed that visual analytical tools offer a way of presenting, justifying and explaining them effectively and transparently (Nemery et al. 2012; Ishizaka and Pereira 2016). Visual representation permits users to take in a large amount of information simultaneously as it maximises human capacity to perceive, understand and reason about complex data and situations. Visual representations promote high-quality human judgement with limited investment of the analysts’ time (Thomas and Cook 2006). Visual management has largely been used in production management in the forms of value stream mapping, flow charts and area name boards (Tezel et al. 2016). It has also been coupled with MCDA outputs. Two popular graphical methods are the GAIA plane, which represents multi-dimensional information in a two-dimensional space whilst preserving as much of it as possible (Ishizaka et al. 2016), and stacked bar charts, which allow the user to see the pros and cons of each action (Ishizaka and Pereira 2016).

Here we developed a productivity graph that allows actions to be compared against each other. It can be used to identify more efficient peers in order subsequently to follow their example. The new multi-criteria performance management method (Sect. 2) coupled with its visual management tool is illustrated by an evaluation of the performance of British universities (Sect. 3).

2 PROMETHEE productivity analysis

2.1 Problem formulation

A large number of methods of solving multi-criteria problems have been developed, and this trend seems set to continue. Wallenius et al. (2008) showed that the number of academic publications related to MCDA is steadily increasing. This proliferation is not only due to researchers’ impressive productivity but also to the development of MCDA methods specific to certain types of problems. Roy (1981) has described four types of problem:

  1. Choice problem (\({P} \cdot \alpha \)): the goal is to select the single best action or to reduce the group of actions to a subset of equivalent or similar actions.

  2. Sorting problem (\({P} \cdot \beta \)): actions are sorted into predefined, ordered categories. This method is useful for repetitive and/or automatic use. It can also be used for screening in order to reduce the number of actions subjected to more detailed consideration.

  3. Ranking problem (\({P} \cdot \gamma \)): actions are placed in decreasing order of preference. The order can be partial, if we consider incomparable actions, or complete.

  4. Description problem (\({P} \cdot \delta \)): the goal is to help to describe actions and their consequences.

Additional problem formulations have also been proposed:

  5. Elimination problem (Bana e Costa 1996): a particular case of the sorting problem.

  6. Design problem: the goal is to identify or create a new action that will meet the goals and aspirations of the decision-maker (Keeney 1992).

PROMETHEE (Brans and Mareschal 1994, 2005) is a multi-criteria method that belongs to the family of outranking methods. It has been easily adapted to solve all these problem formulations (Mareschal et al. 2010). The aim of this paper is to present an MCDA-based solution to a new type of problem, namely the productivity problem. The majority of MCDA tools assess effectiveness: an alternative is said to be effective when it meets the output target. MCDA methods collapse multi-dimensional outputs into a single index that can be used as a measure of effectiveness. The major shortcoming of effectiveness is that it is based only on the levels of output and does not take into account the resources used to produce that output. For instance, an alternative could be ineffective because it has very limited resources. This is why the most commonly used measure of performance is productivity (Ray and Chen 2015).

  7. Productivity problem: the goal is to assess the efficient utilisation of resources in production.

The usual measure of productivity is a single indicator such as output per worker. However, when there is more than one input (such as labour and capital) and more than one output, the ratio Output/Workers fails to account for all the inputs used and all the outputs produced. To overcome this shortcoming, we propose an MCDA method that builds an aggregate measure of the inputs and an aggregate measure of the outputs, and we assess productivity by means of the relation between these two indices.

In the next subsections, we explain how we adapted PROMETHEE to solve this problem.

2.2 Method description

As with any other multi-criteria method, we consider a set of n possible actions \(A=\{ {a_1 ,a_2 ,\ldots ,a_n }\}\) which are evaluated according to a set of k criteria \(C=\{ {c_1 ,c_2 ,\ldots ,c_k }\}\). The main difference between PPA and the other methods in the PROMETHEE family is that the criteria are split into two groups, input and output criteria, and the two groups are processed separately. For each criterion \(c_i\), and for each pair of actions (a, b), the decision-maker expresses his/her preference by means of a preference function \(P_i\): the preference degree \(P_i( {a,b})\) is a number between 0 and 1 that indicates the extent to which action a is preferred to action b based on criterion \(c_i\). Six typical preference function shapes are proposed (Brans and Vincke 1985): usual function, U-shape, Level (Fig. 2), V-shape, V-shape with indifference (Fig. 1) and Gaussian (Fig. 3). The usual (\({p} = {q} = 0\)), V-shape (\({p} = 0\)) and U-shape (\({p} = {q}\)) are particular cases of the V-shape with indifference (Fig. 1), where p is the indifference threshold and q the preference threshold on the axis d, which represents the difference between the scores of two actions on the given criterion.

Fig. 1 V-shape with indifference preference function

Fig. 2 Level preference function

Fig. 3 Gaussian preference function
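The V-shape with indifference function described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `linear_preference` is hypothetical, the thresholds follow the notation of this section (p for indifference, q for preference), and d is taken to be the difference between the scores of two actions on one criterion.

```python
def linear_preference(d: float, p: float = 0.0, q: float = 0.0) -> float:
    """Preference degree in [0, 1] for a score difference d.

    p is the indifference threshold and q the preference threshold,
    with 0 <= p <= q (notation of this section, not a library API).
    """
    if p < 0 or q < p:
        raise ValueError("thresholds must satisfy 0 <= p <= q")
    if d <= p:
        return 0.0  # difference too small to create any preference
    if d >= q:
        return 1.0  # strict preference; also reached when p == q
    return (d - p) / (q - p)  # linear ramp between the two thresholds


# Special cases named in the text:
usual = linear_preference(0.1)               # p = q = 0
v_shape = linear_preference(0.5, q=1.0)      # p = 0
u_shape = linear_preference(2.1, p=2.0, q=2.0)  # p = q
```

Because the two threshold tests are evaluated before the division, the usual and U-shape cases (where p equals q) never divide by zero.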

A multi-criteria preference index is then calculated as a weighted sum of the single-criterion preference degrees:

$$\begin{aligned} \pi ( {a,b})= \sum \limits _{i=1}^k {P}_{i} ({a,b})\cdot w_i. \end{aligned}$$
(1)

The weights \(w_i\) represent the relative importance of each criterion in the decision.

As each action is compared with the other \(n-1\) actions, the two following preference flows are defined:

  • Positive flow

    $$\begin{aligned} \Phi ^{+}(a)=\frac{1}{n-1}\sum \limits _{x\in A} \pi ({a,x}). \end{aligned}$$
    (2)

where n is the number of actions in set A.

This score represents the global strength of action a relative to all the other actions and the aim is to maximise it.

  • Negative flow

    $$\begin{aligned} \Phi ^{-}\left( a \right) =\frac{1}{n-1} \sum \limits _{x\in A} \pi \left( {x,a} \right) . \end{aligned}$$
    (3)

This score represents the global weakness of a relative to all the other actions and the aim is to minimise it.

The net flow is the balance between the two previous flows and is given by:

$$\begin{aligned} \Phi (a)=\Phi ^{+}(a)-\Phi ^{-}(a). \end{aligned}$$
(4)

The positive and negative flows are used to build the PROMETHEE I partial ranking, whilst the net flow is the basis for the PROMETHEE II complete ranking: all the actions can be ranked in order of net flow value.
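Equations (1) to (4) can be illustrated with a short Python sketch. It is not the Visual PROMETHEE implementation; the function name `promethee_flows` and the toy data are hypothetical, and the weights are rescaled to sum to 1 so that the preference index stays between 0 and 1 (an assumption of this sketch). In PPA the same computation would be run once on the input criteria and once on the output criteria.

```python
from typing import Callable, List, Sequence, Tuple


def promethee_flows(
    scores: Sequence[Sequence[float]],
    weights: Sequence[float],
    prefs: Sequence[Callable[[float], float]],
) -> List[Tuple[float, float, float]]:
    """Return (positive, negative, net) flows for each action.

    scores[a][i] is the score of action a on criterion i; prefs[i] maps a
    score difference to a preference degree in [0, 1].
    """
    n, k = len(scores), len(weights)
    total = sum(weights)
    w = [x / total for x in weights]  # assumption: weights rescaled to sum to 1

    def pi(a: int, b: int) -> float:
        # Eq. (1): weighted sum of single-criterion preference degrees
        return sum(w[i] * prefs[i](scores[a][i] - scores[b][i]) for i in range(k))

    flows = []
    for a in range(n):
        # pi(a, a) is 0 for the usual preference shapes, so skipping x == a
        # does not change the sums in Eqs. (2) and (3)
        others = [x for x in range(n) if x != a]
        plus = sum(pi(a, x) for x in others) / (n - 1)   # Eq. (2)
        minus = sum(pi(x, a) for x in others) / (n - 1)  # Eq. (3)
        flows.append((plus, minus, plus - minus))        # Eq. (4)
    return flows


def usual(d: float) -> float:
    """Usual preference function: any positive difference is a full preference."""
    return 1.0 if d > 0 else 0.0


# Toy data: three actions scored on a single criterion to be maximised.
flows = promethee_flows([[3.0], [2.0], [1.0]], [1.0], [usual])
```

With these toy scores the best action obtains a net flow of 1, the middle one 0 and the worst one −1, and the net flows sum to zero, as expected from the pairwise construction.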

As PPA is based on PROMETHEE, it inherits its advantages. In particular, weights and preference functions can be assigned to criteria. If the criteria weights are known a priori by the decision-maker, this is important information which should be added to the model. Applying a preference function can also result in better representation of reality because changes in inputs (resources) and outputs (production) do not always have a linear effect on productivity. An increase in facility spending of 1000 Euros on an existing 10,000 Euro bill does not necessarily have the same effect as an increase of 1000 Euros on a 1,000,000 Euro bill. Furthermore, in PROMETHEE all criteria can be expressed in their own units and thus there is no scaling effect. No normalisation of the scores is required, which avoids the drawback of the ranking depending on the choice of normalisation method (Tofallis 2008; Ishizaka and Nemery 2011). A detailed, technical discussion of the normalisation effect in the specific case of universities is given in Tofallis (2012).

2.3 Productivity measurement

The interpretation of net flows in PPA depends on the set of output and input criteria used. The higher the net flows of an action’s outputs and the lower the net flows of its inputs, the better it is.

In order to evaluate performances on the basis of the net flows of outputs and inputs, we define the PPA production possibility set \(({{\Psi }_\mathrm{PPA}})\) as:

$$\begin{aligned} {\Psi }_\mathrm{PPA} = \left\{ \left( {\Phi }_\mathrm{I} ,{\Phi }_\mathrm{O} \right) \in R^{2} \mid {\Phi }_\mathrm{O} \le \sum \limits _{i=1}^n \gamma _i {\Phi }_{\mathrm{O}i} ;\; {\Phi }_\mathrm{I} \ge \sum \limits _{i=1}^n \gamma _i {\Phi }_{\mathrm{I}i} ;\; \sum \limits _{i=1}^n \gamma _i =1;\; \gamma _i \ge 0,\; i=1,\ldots ,n \right\} . \end{aligned}$$
(5)

where \({\Phi }_\mathrm{I}\) is the net input flow, and \({\Phi }_\mathrm{O}\) is the net output flow.

In line with the DEA variable-returns-to-scale production possibility set (Banker et al. 1984), we assume that (Cooper et al. 2004, p 42):

  1. All observed activities \(\left( {{\Phi }_{\mathrm{I}i} ,{\Phi }_{\mathrm{O}i} } \right) \, ( {i=1,\ldots ,n})\) belong to \({\Psi }_\mathrm{PPA}\);

  2. For an activity \(\left( {{\Phi }_\mathrm{I} ,{\Phi }_\mathrm{O} } \right) \) in \({\Psi }_\mathrm{PPA} \), any activity \(\left( {\overline{\Phi _\mathrm{I}} , \overline{\Phi _\mathrm{O}}} \right) \) with \( \overline{\Phi _\mathrm{I}} \ge {\Phi }_\mathrm{I}\) and \(\overline{\Phi _\mathrm{O}} \le {\Phi }_\mathrm{O}\) is included in \({\Psi }_\mathrm{PPA}\). That is, any activity with net input flow of not less than \({\Phi }_\mathrm{I}\) and net output flow no greater than \({\Phi }_\mathrm{O}\) is feasible.

  3. Any semi-positive convex linear combination of activities in \({\Psi }_\mathrm{PPA}\) belongs to \({\Psi }_\mathrm{PPA}\).

The production possibility set is a polyhedral convex set whose vertices correspond to all the actions that are not dominated by another action, i.e. no action simultaneously has higher net output flow and lower net input flow. The PPA frontier can easily be represented graphically (Sect. 2.4).

It is worth noting that, unlike the original DEA (Charnes et al. 1978; Banker et al. 1984), inputs and outputs are not explicitly considered in the production possibility set (5). Similar approaches can be found in more recent DEA applications. One of the most popular of these approaches is PCA-DEA (see Ueda and Hoshiai 1997; Adler and Golany 2001, 2002; Adler and Yazhemsky 2010), in which DEA is used on the Principal Components of the original variables.

A second difference from the original DEA is that if one action changes its inputs and outputs, it can modify the inputs and outputs of other actions in the production possibility set (5). This feature is also displayed by other DEA models, for example the model of Muñiz (2002), in which indexes of efficiency obtained by DEA in the first stage are used as inputs in the second stage. Other examples of DEA models sharing this feature are all the DEA applications on data pre-treated with specific normalisations (such as max–min, Nardo et al. 2008), for instance the model used in Mizobuchi (2014).

To measure the distance to the frontier in PPA, we use an algorithm based on the standard additive model introduced by Charnes et al. (1985) and elaborated by Banker et al. (1989). The algorithm is based on two steps. The first step measures the input distance as:

$$\begin{aligned}&\max \; \Delta _{{\Phi }_{\mathrm{I}_k}} \\&{\Phi }_{\mathrm{I}_k} =\sum \limits _{j=1}^{n} {\Phi }_{\mathrm{I}_j} \lambda _j +\Delta _{{\Phi }_{\mathrm{I}_k}} \\&{\Phi }_{\mathrm{O}_k} \le \sum \limits _{j=1}^{n} {\Phi }_{\mathrm{O}_j} \lambda _j \\&\sum \limits _{j=1}^{n} \lambda _j =1 \\&\lambda _j \ge 0;\; j=1,\ldots ,n \\&\Delta _{{\Phi }_{\mathrm{I}_k}} \ge 0. \end{aligned}$$
(6)

where \(\Delta _{{\Phi }_{\mathrm{I}_k}}\) is the input distance to the frontier for the action k under evaluation;

  • \({\Phi }_{\mathrm{I}_k}\) is the net input flow of action k in the evaluation;

  • \({\Phi }_{\mathrm{O}_k}\) is the net output flow of action k in the evaluation;

  • \(\lambda _j\) is one element of the intensity vector.

In the second step we measure the output distance as:

$$\begin{aligned}&\max \; \Delta _{{\Phi }_{\mathrm{O}_k}} \\&{\Phi }_{\mathrm{I}_k} \ge \sum \limits _{j=1}^{n} {\Phi }_{\mathrm{I}_j} \lambda _j \\&{\Phi }_{\mathrm{O}_k} =\sum \limits _{j=1}^{n} {\Phi }_{\mathrm{O}_j} \lambda _j -\Delta _{{\Phi }_{\mathrm{O}_k}} \\&\sum \limits _{j=1}^{n} \lambda _j =1 \\&\lambda _j \ge 0;\; j=1,\ldots ,n \\&\Delta _{{\Phi }_{\mathrm{O}_k}} \ge 0. \end{aligned}$$
(7)

where \(\Delta _{{\Phi }_{\mathrm{O}_k}}\) is the output distance to the frontier for the action k under evaluation.

Finally, the minimum distance to the frontier defines the PPA inefficiency as:

$$\begin{aligned} \min \Delta =\min \left[ \Delta _{{\Phi }_{\mathrm{I}_k}} ,\; \Delta _{{\Phi }_{\mathrm{O}_k}} \right] . \end{aligned}$$
(8)
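Programmes (6) to (8) can be illustrated with a small, dependency-free Python sketch. It does not reproduce the R code mentioned in this paper, and it swaps a general linear programming solver for a brute-force enumeration: because each programme has only two structural constraints (the flow constraint and the convexity constraint), an optimum is attained with at most two non-zero intensities, so checking every single action and every pair of actions is sufficient in this two-dimensional setting. The function names and the toy net flows are hypothetical.

```python
from itertools import combinations
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (net input flow, net output flow)


def input_distance(k: int, pts: Sequence[Point]) -> float:
    """Programme (6): largest input reduction keeping output >= that of k."""
    xk, yk = pts[k]
    # Single actions producing at least the output of k (k itself qualifies).
    best = min(x for x, y in pts if y >= yk)
    # Convex mixes of two actions that reach exactly the output of k.
    for (x1, y1), (x2, y2) in combinations(pts, 2):
        if min(y1, y2) < yk < max(y1, y2):
            t = (yk - y2) / (y1 - y2)  # weight on the first action
            best = min(best, t * x1 + (1 - t) * x2)
    return max(xk - best, 0.0)  # guard against rounding below zero


def output_distance(k: int, pts: Sequence[Point]) -> float:
    """Programme (7): largest output increase using no more input than k."""
    xk, yk = pts[k]
    best = max(y for x, y in pts if x <= xk)
    for (x1, y1), (x2, y2) in combinations(pts, 2):
        if min(x1, x2) < xk < max(x1, x2):
            t = (xk - x2) / (x1 - x2)
            best = max(best, t * y1 + (1 - t) * y2)
    return max(best - yk, 0.0)


def ppa_inefficiency(k: int, pts: Sequence[Point]) -> float:
    """Eq. (8): minimum of the input and output distances."""
    return min(input_distance(k, pts), output_distance(k, pts))


# Toy net flows (invented for illustration only).
pts = [(-0.5, -0.5), (0.5, 0.5), (0.5, -0.5), (0.25, 0.0)]
scores = [ppa_inefficiency(k, pts) for k in range(len(pts))]
```

For larger studies, or if additional constraints were added, a linear programming solver would be the more natural tool; the enumeration is used here only to keep the example self-contained.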

2.4 PPA frontier visualisation

The productivity of the actions can be depicted in a two-dimensional graph (Fig. 4). The range of the two axes is \({-}\,1\) to 1, and they represent the net flows of the inputs and outputs. Four categories of action can be defined:

Efficient:

This type of action produces a high output (i.e. high net output flow) with a low input (i.e. low net input flow). Thus efficient actions appear in the top-left quadrant of the graph (e.g. actions B and C in Fig. 4).

Effective:

This type of action is defined solely in terms of its outputs (high) and thus appears in the top-right quadrant (e.g. actions D and E in Fig. 4).

Frugal:

This type of action minimises spending and is represented by the bottom-left quadrant (e.g. A in Fig. 4).

Inefficient:

This type of action has high inputs and low outputs and sits in the bottom-right quadrant (e.g. G in Fig. 4).

Fig. 4 PPA frontier graph (colour figure online)
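The four categories can be expressed as a simple rule on the signs of the two net flows. The sketch below assumes that the quadrants of the graph are separated at a net flow of zero on both axes and that a flow of exactly zero counts as "low"; neither convention is stated explicitly in the text, and the function name `ppa_category` is hypothetical.

```python
def ppa_category(net_input: float, net_output: float) -> str:
    """Map a pair of net flows to one of the four PPA categories.

    Assumption: quadrants are split at zero, and zero itself counts as low.
    """
    high_input = net_input > 0
    high_output = net_output > 0
    if high_output:
        # Top half of the graph: high output
        return "effective" if high_input else "efficient"
    # Bottom half of the graph: low output
    return "inefficient" if high_input else "frugal"
```

For instance, an action with a net input flow of −0.4 and a net output flow of 0.6 falls in the top-left quadrant and is labelled efficient.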

Any action that is not dominated in terms of Input/Output net flows lies on the PPA frontier (shown in red on Fig. 4). Actions that are not on the frontier can be improved by taking real action(s) that are closest to the frontier as an example (e.g. in Fig. 4 action F can be improved by looking at the example of action C).

An action’s productivity ranking is determined by measuring the distance between it and the productivity frontier. This distance can be measured in various ways, e.g.:

  • Horizontally, on the input axis (x), by programme (6): the input orientation assumes a given output level and searches for improvements in inputs that will bring the action closer to the PPA frontier.

  • Vertically, on the output axis (y), by programme (7): the output orientation assumes a given level of inputs and searches for improvements in outputs that will bring the action closer to the PPA frontier.

  • Along both axes simultaneously: if the decision-maker is not facing any constraints and has control over both inputs and outputs, the orientation will depend on his/her objectives. For instance, taking the minimum of the two previous measures (8) shows the smallest adjustment, in either inputs or outputs, that would make an action efficient.

PPA has been implemented as a two-stage analysis:

  1. The input and output net flows have been estimated by Visual PROMETHEE software (http://www.promethee-gaia.net/software.html), which is free to academics.

  2. In order to estimate the output and input distance to the frontier with programme (6) and programme (7), an optimisation code in R has been developed and can be forwarded on request.

3 Case study

3.1 Introduction

Measuring universities’ performance is a fundamental necessity for society as it shows how well taxpayers’ money is being used, but it is not an easy task. The productive process of a university is a complex, multi-dimensional system with more than one objective (Agasisti and Haelermans 2016). Recently globalisation has increased the pressure on universities to improve their standards (Steiner et al. 2013). Performance measures have a significant impact on universities’ ability to attract the top scholars, bright students and research funding (Kaycheng 2015).

Comparisons of the quality of universities are regularly published in the form of ranking lists. Some UK examples are The Complete University Guide, The Guardian University Guide and the Sunday Times University Guide, all of which include league tables based on publicly available statistical data, e.g. from the Higher Education Statistics Agency (HESA) and the National Student Survey (NSS). These rankings have a sizeable impact on universities as they are one indication of prestige and have a direct influence on the number and quality of applicants (Hazelkorn 2007). As published rankings have proliferated so have criticisms of them (Yorke 1997; Bowden 2000; Oswald 2001; Vaughn 2002; Marginson 2007; Taylor and Braddock 2007; Tofallis 2012; De Witte and Hudrlikova 2013). The method used to rank universities is usually a simple weighted sum of the evaluation criteria, where the aggregation is fully compensatory and does not differentiate between universities with strengths in different areas. Moreover, as each criterion is measured in a different unit, the values need to be transformed into equivalent units to enable them to be summed. There are numerous ways of standardising data (commercial rankings generally use z-transformation), and they often lead to different final rankings (Pomerol and Barba-Romero 2000). To avoid these problems, Giannoulis and Ishizaka (2010) used an outranking method. Another criticism is that input and output data are treated in the same way, which rewards inefficiency (Bougnol and Dulá 2015). For example, consider two universities with exactly the same input and output levels. The following year one of them reduces its input (e.g. facilities spend), but they continue to have the same output. The published lists will assign a lower ranking to the one which has decreased its input, although it has become more efficient. It would therefore be wise in the MCDA methodology to minimise input data and to maximise output data.

Universities are facing funding cuts, so their input resources are becoming more restricted and therefore productivity is crucial. This means that efficiency should be considered alongside rankings based on various measures of perceived quality.

3.2 Description of data

The data used in the analysis were obtained from the Sunday Times University Guide 2014 (http://www.thesundaytimes.co.uk/sto/University_Guide/). It evaluated British universities on the basis of nine criteria (Table 1).

Table 1 Evaluation criteria; abbreviations in parentheses are used hereafter

Two of the criteria in Table 1 can be used as proxies for the two standard factors of production: capital (“Services/facilities spent”) and labour (“Staff”). As shown in several analyses of efficiency in the higher education sector (see, among others, Thanassoulis et al. 2011), the “production” of a university is multi-dimensional and often intangible. This is why qualitative as well as quantitative measures of production should be taken into account. Within this framework, “The total number of students” can be used as a proxy for the quantity of production, and the remaining six criteria can be used as proxies for the quality of production (“Student satisfaction”, “Research quality”, “UCAS entry points”, “Graduate prospects”, “Firsts/2:1s”, and “Completion rate”).

From a technical perspective, several of these criteria are scale-independent. Figure 5 shows the universities’ production process based on Table 1 variables. In our case, the universities use one scale-dependent and one scale-independent input to produce one scale-dependent and six scale-independent outputs.

Fig. 5
figure 5

University production process

Scale independence is not a problem in PPA as the preference function transforms all differences to a \({-}\,1\) to 1 scale. In this case study V-shape preference functions have been selected as all criteria are numerical. The preference thresholds have been chosen proportionally to the standard deviation observed on each criterion. This is a statistical approach which is consistent with the normalisation (z-transformation) used in the Sunday Times analysis. Of course, other choices could be made taking into account the preferences of evaluators. The weights of the criteria are the same as in the Sunday Times university ranking, i.e. 1.5 for student satisfaction and research quality, 1 for all other criteria. In the analysis, the criterion “student–staff ratio” has been replaced by two separate criteria, the number of students (an output) and the number of staff (an input), as suggested by Thanassoulis et al. (1995).

Table 2 summarises the data. The last two columns show that some universities perform well in terms of the scale-independent outputs (Bath, Cambridge, Imperial College and Oxford). Some universities do best in terms of the scale-dependent output (Manchester). Finally, some of the small universities are best at keeping their costs low (Highlands and Islands, London Metropolitan).

Table 2 Descriptive statistics

3.3 Results

In order to measure the universities’ productivity, we used the algorithms presented in Sects. 2.2 and 2.3. In particular, we first estimated the input and output net flows with PROMETHEE, and then measured inefficiencies by means of Eqs. (6), (7) and (8). Four universities are on the PPA frontier (i.e. \(\Delta _{{\Phi }_\mathrm{I}} =\Delta _{{\Phi }_\mathrm{O}} =0\)): two of them are old universities, Cambridge and Bath, and two are among the most recent universities, Bishop Grosseteste and Arts Bournemouth. The distance to the PPA frontier (Sect. 2.4) can be interpreted as the amount of potential improvement. Table 3 in Appendices summarises the results: \(\Delta _{{\Phi }_\mathrm{I}}\) is the input inefficiency, \(\Delta _{{\Phi }_\mathrm{O}}\) is the output inefficiency and \(\min {\Delta }\) is the minimum distance to the frontier defined by the four efficient universities.

Fig. 6 PPA frontier (colour figure online)

As Table 3 in Appendices is difficult to read, the PPA method provides an intuitive, visual representation of the results (Fig. 6), which highlights unexpected relationships that might lead to important insights.

Higher education institutions vary considerably in history, objectives and operating practices (Shattock 2013). For our analysis we separated universities into three groups:

  • The old universities, also called traditional or pre-1992 universities (black rhombuses);

  • The new universities, also called modern universities or post-1992 universities. These are polytechnics that acquired university status in 1992 (grey triangles);

  • The new-new universities, i.e. post-1992 universities that are not former polytechnics (black circles).

The red line in Fig. 6 represents the PPA frontier. We have used the same criterion weights as the Sunday Times university ranking, i.e. 1.5 for student satisfaction and research quality, 1 for all other criteria. Only the best actions are on the frontier (Cambridge, Bath, Bishop Grosseteste and Arts Bournemouth). It is interesting to note that the new-new universities are generally located on the bottom left: they use a frugal production system. None of the post-1992 universities has a positive net output flow; they are not effective. None of the old universities is in the worst quadrant (bottom-right). The former polytechnics are generally located towards the bottom right and thus are generally less efficient than the other universities. A clear exception is London Metropolitan University, whose position suggests that its working practices are much closer to those of the new-new universities.

In policy terms, effectiveness is expensive and requires long-term strategies; however, effectiveness is the only guarantee of the prestige that allows an institution to last more than a single generation (Huisman et al. 2012; Middlehurst 2013; Filippakou and Tapper 2015). In conflicts with other authorities it is only the quality of their achievements that has enabled universities to maintain their position (Lenartowicz 2015). Moreover, in the recursive production of themselves, universities have specialised in two basic activities: research and education of students. Both are ‘axes of construction’ for self-reproduction. What is new today will be known tomorrow and will form the basis for further discoveries. Similarly, today’s students will be tomorrow’s teachers.

3.4 Comparison of the results with the traditional DEA

3.4.1 The role of scale-independent and scale-dependent variables

In order to clarify the strong points of PPA, we compared our results with those of a DEA efficiency analysis. A graphical comparison cannot be made, as DEA cannot provide a graphical representation of a multi-dimensional problem. It is worth noting that a DEA evaluation of performance based on the data presented in Table 1 does not make much sense as the data are a mixture of scale-independent (ratios and indices) and scale-dependent variables (Fig. 5). Because of this we modified the data before proceeding with the DEA analysis. We used Dyson et al.’s (2001) procedure, which is as follows: we multiplied ServSp, StSat, Ucas, GrdProsp, 1–2:1s and Compl by the number of students and we multiplied ResQual by the number of academic staff. These transformations allowed us to avoid the problems with DEA described by Dyson et al. (2001), Cooper et al. (2007) and Cook et al. (2014). There is, however, the drawback that the rankings depend on the variables by which the ratio and index variables were multiplied (Tofallis 2012). This problem does not appear with the PPA.
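The data transformation applied before the DEA run can be sketched as follows. The criterion abbreviations are those used in the text; the dictionary keys "Students" and "Staff", the function name `to_volume` and all the numbers are hypothetical and serve only to illustrate the multiplication step.

```python
from typing import Dict

# Ratio or index criteria that are multiplied by the number of students
BY_STUDENTS = ["ServSp", "StSat", "Ucas", "GrdProsp", "1–2:1s", "Compl"]


def to_volume(record: Dict[str, float]) -> Dict[str, float]:
    """Turn scale-independent criteria into scale-dependent volumes."""
    out = dict(record)  # leave the original record untouched
    for name in BY_STUDENTS:
        out[name] = record[name] * record["Students"]
    # Research quality is scaled by academic staff rather than by students
    out["ResQual"] = record["ResQual"] * record["Staff"]
    return out


# Invented figures for a single fictitious university
example = {
    "Students": 1000, "Staff": 100, "ServSp": 2, "StSat": 80, "Ucas": 400,
    "GrdProsp": 75, "1–2:1s": 70, "Compl": 90, "ResQual": 4,
}
scaled = to_volume(example)
```

Choosing a different multiplier for any of these criteria would change the transformed data and hence the DEA scores, which is the dependence noted in the paragraph above.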

3.4.2 Selectivity power

In this analysis, we use the standard input-oriented DEA model and we assume variable returns to scale (Banker et al. 1984). The efficiency indices estimated by DEA are given in the sixth column in Table 3. In order to compare the PPA inefficiency indices (ranging from 0 to 2) with the DEA efficiency indices (ranging from 0 to 1), we transformed min \(\Delta \) as follows:

$$\begin{aligned} \hbox {EFF}_{\mathrm{PPA}} =\frac{\left( {2-{\min }\Delta } \right) }{2} \end{aligned}$$

In this case \(\hbox {EFF}_{\mathrm{PPA}}\) is in the interval [0,1], like the DEA efficiency index. The last column in Table 3 shows the difference between \(\hbox {EFF}_\mathrm{PPA}\) and \(\hbox {EFF}_\mathrm{DEA}\). As the DEA analysis places more universities on the efficient frontier, it is less selective than PPA. Under DEA, 22 universities are fully efficient: Sheffield Hallam, Manchester Metropolitan, West of England, Northumbria, Plymouth, Coventry, Strathclyde, Manchester, Queen Margaret Edinburgh, Bangor, Nottingham, University College London, Cardiff, Leeds, Birmingham, Warwick, London Metropolitan, Oxford, Highlands and Islands, Bishop Grosseteste, Cambridge and Teesside. Only two of those universities are also efficient under PPA: Bishop Grosseteste and Cambridge. Two universities (Bath and Arts Bournemouth) are efficient under PPA but not under DEA.
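The rescaling of the PPA inefficiency into an efficiency index is a one-line computation; the sketch below only restates the formula above, with a hypothetical function name and a range check added as an assumption.

```python
def eff_ppa(min_delta: float) -> float:
    """Rescale a PPA inefficiency in [0, 2] to an efficiency index in [0, 1]."""
    if not 0.0 <= min_delta <= 2.0:
        raise ValueError("min_delta is expected to lie between 0 and 2")
    return (2.0 - min_delta) / 2.0
```

An action on the frontier (inefficiency 0) therefore receives an index of 1, while an illustrative inefficiency of 0.5 corresponds to an index of 0.75.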

3.4.3 External information and preferences

The difference in efficiency scores stems from the main difference between DEA and PPA, namely the decision-maker’s role in the evaluation. The results of DEA are entirely data-driven. In some cases, however, additional information is available through the decision-maker’s expertise, and weight restrictions can then be introduced into the DEA model (Podinovski 2016). PPA addresses a different scenario, in which the decision-maker has a good idea of the preference function and weight of each criterion. Depending on the problem and the information available, the decision-maker can choose DEA (little or no information available) or PPA (good information leading to clear preference functions and weights).

3.4.4 Relative measures

PPA uses relative measures, as in PROMETHEE. This means that if the performance (in inputs or outputs) of one action is modified, the relative performance of all other actions is also modified, because a change in the inputs and outputs of one alternative alters the input and output net flows of all the other units. This differs from DEA, where the performance of a DMU is modified only if the performance of DMUs on the frontier is modified. PPA therefore takes into account the global situation (not only the DMUs on the frontier) when calculating the performance of actions.
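This dependence on the whole set of actions can be illustrated with a minimal net-flow computation. The sketch below is not the PPA implementation: it assumes a single set of criteria to be maximised and a linear (V-shape) preference function with threshold `p`, one of the standard PROMETHEE preference types, and all names and values are hypothetical. In PPA a computation of this kind would be carried out separately for inputs and outputs.

```python
def linear_preference(diff, p):
    """V-shape preference: 0 for diff <= 0, rising linearly to 1 at diff >= p."""
    if diff <= 0:
        return 0.0
    return min(diff / p, 1.0)


def net_flows(scores, weights, thresholds):
    """PROMETHEE net flow of each action; all criteria are to be maximised."""
    names = list(scores)
    n = len(names)
    flows = {}
    for a in names:
        total = 0.0
        for b in names:
            if a == b:
                continue
            for k, (w, p) in enumerate(zip(weights, thresholds)):
                d = scores[a][k] - scores[b][k]
                # Preference of a over b minus preference of b over a.
                total += w * (linear_preference(d, p) - linear_preference(-d, p))
        flows[a] = total / (n - 1)
    return flows


# Only action C changes its score (from 4 to 6) between the two runs.
before = net_flows({"A": [3], "B": [5], "C": [4]}, weights=[1.0], thresholds=[4.0])
after = net_flows({"A": [3], "B": [5], "C": [6]}, weights=[1.0], thresholds=[4.0])
print(before)  # {'A': -0.375, 'B': 0.375, 'C': 0.0}
print(after)   # {'A': -0.625, 'B': 0.125, 'C': 0.5}
```

Although the data for A and B are unchanged, their net flows move when C improves, which is the behaviour described above. In DEA, by contrast, a change in a unit that stays off the frontier would leave the scores of the other units unaffected.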

4 Conclusions

New technologies have allowed large volumes of data to be stored and transferred rapidly across large distances. The final challenge is working out how to extract and deliver information from this vast amount of data. An overload of mismanaged information can lead to disagreements, stress, waste and poor performance. In this paper, we have described an adaptation of PROMETHEE for productivity analysis problems and communicated the results in a graph that makes it easy to distinguish between efficient, effective, frugal and ineffective actions. This visual tool gives a holistic view of the results, makes it easy to identify unexpected relationships and increases the transparency of the analysis. It supports the justification of suggestions for improvements. Evidence-based visual management creates a sense of openness and objectivity, which is a precondition for developing employees’ trust in management. PPA is an extension of PROMETHEE and therefore inherits its advantages. It does not require any standardisation. By contrast, many MCDA methods require the analyst to begin by standardising raw data expressed in different units to make them comparable, and the several standardisation methods available produce different results. Avoiding the need for standardisation removes this problem. PPA uses a partially compensatory approach: a bad score can be neither fully compensated for (as in full-aggregation MCDA methods) nor ignored (as in traditional DEA). A preference function and a weight can be defined for each criterion.

To illustrate the method, we analysed the performance of British universities. The task of universities is to generate, acquire and transfer knowledge. They are an important component of the economy. University rankings are attracting more attention than ever. Many universities clearly state their ranking objective (e.g. to be among the top 20 universities) in their strategic plan. At the same time, economic sustainability is a major issue, because public funding for universities is decreasing. As universities face the challenge of doing more with less, improving productivity becomes vital. In our case study, PPA highlighted the wide differences in the productivity of British universities. Overall, new and the most recently established universities tend to be more interested in keeping costs down, whereas old universities tend to be more effective. PPA is a tool that can be used to inform decision-makers about best practice, based on easy-to-interpret information. To improve their position, universities can look at peers on the PPA frontier. This kind of benchmarking scheme can be used by university management to identify ways to improve relative performance. The graphical representation of results clearly illustrates an institution’s position relative to its competitors. It is important to note, however, that productivity evaluation is only a first step in the process of reflection on performance. It gives some indication of which variables need to be improved, but determining the operational changes required to do this can be very complex. For example, reducing spending on staff and services will reduce input, but may also have a negative impact on the output variables if working practices are not adjusted.

Finally, as PPA is a generic method implemented in a freely accessible tool, we expect that in future research it will be applied to a wide range of industrial and public-sector problems.