Abstract
Data Envelopment Analysis (DEA) is a non-parametric, optimisation-based benchmarking technique first introduced by Charnes et al. (European Journal of Operational Research, 2(6), pp. 429–444, 1978) and later extended by Banker et al. (Management Science 30(9), pp. 1078–1092, 1984), with many variations of DEA models proposed since. DEA measures the production efficiency of a so-called Decision Making Unit (DMU), which consumes inputs to produce outputs. DEA is particularly useful when DMU (or organisation) performance is to be analysed using multiple measures, as it allows DMUs to be benchmarked and comparable peers to be identified. DEA can incorporate different measures of multi-dimensional activities, thus allowing for DMU complexity, and is particularly useful for more in-depth analyses of the effects of contextual or environmental factors on organisations’ performance. DEA has been applied in numerous areas, including banking, education, health, transport, justice, retail stores, auditing, fighter jet design, and research and development, to name a few.
DEA is based around a production model which assesses the efficiency of DMUs in turning inputs into outputs. This is done by comparing units with each other to identify the most efficient DMUs that define a frontier of best performance, which is used to measure the performance of non-efficient DMUs. This efficient frontier represents “achieved best performance” based on actual outputs produced and inputs consumed and thus provides a useful practical reference set for benchmarking and performance improvement. There are very few assumptions required in DEA and its non-parametric form avoids the need to consider alternative distribution properties.
In this chapter we first describe the case of a Post and Banking Business, and then introduce DEA in the context of our case. Different DEA models and additional features are discussed. We give a brief outline of an open-source software tool for DEA and finally apply three different DEA models to the case study and discuss the results.
The Learning Outcomes of This Chapter Are:
- Develop an intuitive understanding of DEA
- Understand basic linear programming models for DEA
- Be aware of common DEA modelling techniques
- Be able to conduct a DEA analysis supported by pyDEA software
- Be able to interpret the DEA results and explain them to a non-technical audience
Notes
- 1.
- 2. For instance, model (4) for DMU i = H minimises θ while satisfying \( {y}_H\le {\sum}_{j=A}^{H}{y}_j{\lambda}_j \) and \( \theta {x}_H\ge {\sum}_{j=A}^{H}{x}_j{\lambda}_j \) with \( {\sum}_{j=A}^{H}{\lambda}_j=1 \). Since H and A are the DMUs which consume the least input (\( x_H = x_A \le x_j \) for \( j = B,\dots,G \)), only \( \lambda_H \) and \( \lambda_A \) can take non-zero values and θ = 1 (corresponding to 100% efficiency). H is considered efficient as its input cannot be reduced.
- 3. Most DEA software requires strict positivity for inputs and outputs. Thus negative values need to be adjusted. For the VRS models a negative input or output can be adjusted by adding a number to all DMUs’ values for that input or output to convert the minimum value to a positive one. Note that this can only be done for VRS models and not the CRS model. See also (Cooper et al. 2007, Chap. 4).
- 4. Optimization with PuLP https://pythonhosted.org/PuLP/
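The reasoning in the note on model (4) can be checked numerically. The sketch below is not the chapter's model code. It uses hypothetical data for eight DMUs A to H with one input and one output (chosen so that A and H share the smallest input), and it relies on the fact that in this single-input, single-output case some optimal solution of the input-oriented VRS model uses at most two peers, so enumerating single DMUs and pairs of DMUs suffices. It also illustrates the adjustment described in the note on negative values for the output side: because the weights λ sum to one, adding the same constant to every output leaves the output constraint, and hence θ, unchanged. The names `DMUS`, `vrs_input_efficiency` and `shift_outputs` are made up for this illustration.

```python
from itertools import combinations

# Hypothetical (input, output) data, NOT taken from the chapter.
# A and H share the smallest input, as assumed in the note.
DMUS = {
    "A": (2.0, 1.0), "B": (3.0, 4.0), "C": (5.0, 6.0), "D": (8.0, 7.0),
    "E": (4.0, 3.0), "F": (6.0, 5.0), "G": (7.0, 4.0), "H": (2.0, 2.0),
}


def vrs_input_efficiency(name, dmus):
    """Input-oriented VRS score theta for one DMU (one input, one output)."""
    x_o, y_o = dmus[name]
    # Candidate 1: any single DMU producing at least y_o (lambda_j = 1).
    candidates = [x for x, y in dmus.values() if y >= y_o]
    # Candidate 2: a convex combination of two DMUs producing exactly y_o.
    for first, second in combinations(dmus.values(), 2):
        (x_lo, y_lo), (x_hi, y_hi) = sorted([first, second], key=lambda d: d[1])
        if y_lo < y_o < y_hi:
            t = (y_o - y_lo) / (y_hi - y_lo)  # weight on the higher-output DMU
            candidates.append(x_lo + t * (x_hi - x_lo))
    # theta is the smallest attainable input relative to the DMU's own input.
    return min(candidates) / x_o


def shift_outputs(dmus, constant):
    """Add the same constant to every DMU's output."""
    return {name: (x, y + constant) for name, (x, y) in dmus.items()}


scores = {name: vrs_input_efficiency(name, DMUS) for name in DMUS}
shifted = shift_outputs(DMUS, 10.0)
shifted_scores = {name: vrs_input_efficiency(name, shifted) for name in shifted}
```

With these made-up numbers, H obtains θ = 1 because no convex combination of DMUs can use less input than the minimum, which is the argument made in the note.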
References
Ali, A., & Seiford, L. (1993). Computational accuracy and infinitesimals in data envelopment analysis. INFOR: Information Systems and Operational Research, 31(4), 290–297.
Allen, R., Athanassopoulos, A., Dyson, R., & Thanassoulis, E. (1997). Weight restrictions and value judgements in data envelopment analysis: Evolution, development and future directions. Annals of Operations Research, 73(0), 13–34.
Banker, R., Charnes, A., & Cooper, W. (1984). Some models for estimating technical and scale inefficiencies in data envelopment analysis. Management Science, 30(9), 1078–1092.
Banker, R., & Morey, R. (1986a). Efficiency analysis for exogenously fixed inputs and outputs. Operations Research, 34(4), 513–521.
Banker, R., & Morey, R. (1986b). The use of categorical variables in data envelopment analysis. Management Science, 32(12), 1613–1627.
Barr, R., Durchholz, M., & Seiford, L. (2000). Peeling the DEA onion: Layering and rank-ordering DMUs using tiered DEA. Dallas, TX: Southern Methodist University.
Charnes, A., & Cooper, W. (1962). Programming with linear fractional functionals. Naval Research Logistics Quarterly, 9(3–4), 181–186.
Charnes, A., Cooper, W., & Rhodes, E. (1978). Measuring the efficiency of decision making units. European Journal of Operational Research, 2(6), 429–444.
Coelli, T., Rao, D., O’Donnell, C., & Battese, G. (2005). An introduction to efficiency and productivity analysis (Chap. 6). New York: Springer Science & Business Media.
Cooper, W., Seiford, L., & Tone, K. (2007). Data envelopment analysis: A comprehensive text with models, applications, references and DEA-solver software (Chap. 2). New York: Springer.
Dyson, R., et al. (2001). Pitfalls and protocols in DEA. European Journal of Operational Research, 132(2), 245–259.
Emrouznejad, A., & Yang, G.-I. (2017). A survey and analysis of the first 40 years of scholarly literature in DEA: 1978–2016. Socio-Economic Planning Sciences, 61, 4–8.
Hosseinzadeh Lotfi, F., et al. (2008). An MOLP based procedure for finding efficient units in DEA models. Central European Journal of Operations Research, 17(1), 1–11.
Leleu, H. (2006). A linear programming framework for free disposal hull technologies and cost functions: Primal and dual models. European Journal of Operational Research, 168(2), 340–344.
Lovell, C., & Rouse, A. (2003). Equivalent standard DEA models to provide super-efficiency scores. Journal of the Operational Research Society, 54(1), 101–108.
Olesen, O., & Petersen, N. (2016). Stochastic data envelopment analysis—A review. European Journal of Operational Research, 251(1), 2–21.
Priddey, H., & Harton, K. (2010). Comparing the efficiency of stores at New Zealand post. In Proceedings of the 45th Annual Conference of the ORSNZ, Auckland.
Seiford, L. (1990). Models, extensions, and applications of data envelopment analysis. Computers, Environment, and Urban Systems, 14(2), 171–175.
Seiford, L., & Zhu, J. (1999). An investigation of returns to scale in data envelopment analysis. Omega, 27(1), 1–11.
Yougbaré, J., & Teghem, J. (2007). Relationships between Pareto optimality in multi-objective 0–1 linear programming and DEA efficiency. European Journal of Operational Research, 183(2), 608–617.
Yu, G., Wei, Q., Brockett, P., & Zhou, L. (1996). Construction of all DEA efficient surfaces of the production possibility set under the generalized data envelopment analysis model. European Journal of Operational Research, 95(3), 491–510.
Acknowledgements
The authors thank the Auckland Medical Research Foundation, which partially supported the development of the open-source software package pyDEA as part of project 1115021 Knowledge-based radiotherapy treatment planning.
The authors also thank NZ Post for letting us use their data for the presented case.
Appendix: Models from Sect. 7 in pyDEA
In the following we provide instructions and screenshots of pyDEA settings to illustrate how the different DEA models for the case study are solved. The description is most detailed for Model 1; for the other models only the differences are shown. All instructions are current at the time the chapter was written.
1.1 Installing and Starting pyDEA
Windows Operating System
- Python version between 3.2 and 3.5 (at the time of writing) must be installed.
- Open a command window (cmd.exe).
- To find out which version of Python is installed, type in the command window:
    py --version
  For example:
    C:\Users\andrea>py --version
    Python 3.5.4rc1
  In the example, the Python version is 3.5.
- To install pyDEA, type the following command, where 3.x is replaced by your version of Python (see previous point):
    py -3.x -m pip install pyDEA
- After successful installation, to run pyDEA type the following command in the command window:
    py -m pyDEA.main_gui
Linux Operating System
- Python version between 3.2 and 3.5 (at the time of writing) must be installed.
- Open a terminal.
- In Linux, Python 2.x (if installed) is usually available via the command
    python2
  whereas Python 3.x (if installed) is usually available via the command
    python3
  The generic python command may point to either version:
    python
- To find out which version of Python is installed, type in the terminal:
    python --version
  For example:
    andrea@computer:~$ python --version
    Python 2.7.12
  In the example, python maps to version 2.7, whereas python3 maps to version 3.5:
    andrea@computer:~$ python3 --version
    Python 3.5.2
- To install pyDEA, type the following command. If python maps to Python 3.x, use pip to install (or use the next option with pip3):
    pip install pyDEA
  Otherwise (if python maps to Python 2.x and you have python3 installed), use pip3 to install:
    pip3 install pyDEA
- After successful installation, to run pyDEA type the following command in the terminal:
    python3 -m pyDEA.main_gui
- Alternatively, simply type “pyDEA”, for example:
    andrea@computer:~$ pyDEA
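The version requirement stated for both operating systems (Python between 3.2 and 3.5 at the time of writing) can also be checked from within Python. The following is a minimal sketch; the bounds are copied from the text above and may no longer apply to current pyDEA releases, and the function name `is_supported` is made up for this illustration.

```python
import sys

# Bounds as stated in the instructions (at the time of writing).
SUPPORTED_MIN = (3, 2)
SUPPORTED_MAX = (3, 5)


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version lies within the stated range."""
    major_minor = tuple(version_info[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if __name__ == "__main__":
    print("Running Python %d.%d" % tuple(sys.version_info[:2]))
    print("Within the range stated in the text:", is_supported())
```

For the two interpreters in the Linux example, `is_supported((2, 7, 12))` is false and `is_supported((3, 5, 2))` is true.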
1.1.1 Using pyDEA
Starting pyDEA brings up the pyDEA main window, as shown in Fig. 15. The main window has a Data section (left part of the window) and a parameter section (right part of the window).
Input data can be in csv, xls and xlsx format. The “load” button brings up a dialogue to browse to the location of the input file, and select it, as shown in Fig. 16.
If the Excel file contains more than one worksheet a dialogue allows selection of the appropriate worksheet containing the data, see Fig. 17.
Having loaded the data, it is displayed in the left data section of the screen, as shown in Fig. 4.
1.1.2 Model 1
Start pyDEA and choose the input data set, as explained above. Choose the inputs and outputs for Model 1 by selecting “input” or “output” for the corresponding columns in the data, as shown in Fig. 18. Note that the customer satisfaction score is stored as “MysteryShop” in the data set.
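Before loading a file into pyDEA it can be helpful to check the data against the strict positivity requirement mentioned in the notes. The sketch below uses only the Python standard library and does not call pyDEA. The sample data and the column names `Store`, `Staff` and `Revenue` are hypothetical; only `MysteryShop` is a column name taken from the text, and the assignment of columns to inputs and outputs is an assumption made for illustration.

```python
import csv
import io

# Hypothetical sample data; only the column name "MysteryShop" comes from the text.
SAMPLE_CSV = """Store,Staff,Revenue,MysteryShop
1,4.0,120.5,0.82
2,6.5,150.0,0.91
3,3.0,-10.0,0.77
"""

# Assumed roles for illustration, mirroring the "input"/"output" choice in pyDEA.
ROLES = {"Staff": "input", "Revenue": "output", "MysteryShop": "output"}


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per store."""
    return list(csv.DictReader(io.StringIO(text)))


def non_positive_cells(rows, roles):
    """List (store, column) pairs whose value is zero or negative."""
    problems = []
    for row in rows:
        for column in roles:
            if float(row[column]) <= 0:
                problems.append((row["Store"], column))
    return problems


rows = load_rows(SAMPLE_CSV)
problems = non_positive_cells(rows, ROLES)
```

Here `problems` flags the negative revenue of store 3, which would need to be adjusted before running a model.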
The parameters for a particular DEA model are chosen in the parameter section on the right side of the pyDEA window as highlighted in Fig. 15. Since Model 1 is an input-oriented VRS model, these two options are selected in pyDEA. We could solve either the Envelopment or the Multiplier form of DEA, but we keep the default Envelopment form here, see Fig. 19.
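For reference, the input-oriented VRS model in envelopment form can be written as follows for the DMU i under evaluation. This generalises the single-input, single-output constraints quoted in the note on model (4) to several inputs and outputs; the indices r (inputs) and s (outputs) are notation assumed here and may differ from the chapter's own.

```latex
\[
\begin{aligned}
\min_{\theta,\lambda}\quad & \theta \\
\text{s.t.}\quad & \sum_{j} \lambda_j x_{rj} \le \theta\, x_{ri} && \text{for every input } r, \\
& \sum_{j} \lambda_j y_{sj} \ge y_{si} && \text{for every output } s, \\
& \sum_{j} \lambda_j = 1, \qquad \lambda_j \ge 0 \ \text{for all } j .
\end{aligned}
\]
```

Dropping the convexity constraint \( \sum_j \lambda_j = 1 \) gives the corresponding CRS model.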
Once all selections are made the “Run” button (Fig. 15) computes DEA results, which are then displayed on tab “Solution” on the left side of the screen, see Fig. 20. The asterisk next to “Solution” indicates that solutions have not yet been saved.
There are several solution displays available, such as “Peers”, “PeerCount”, “InputOutputWeights”, etc., as shown in Fig. 20. They can be explored in pyDEA; they can be selected, copied and pasted; or they can be saved by clicking the “Save solution” button. A “Save As” dialogue opens, allowing the user to browse to a destination folder and to type in the output file name, as shown in Fig. 21. All solution display options will be stored as separate worksheets in an Excel file.
Section 7 discusses the efficiency scores as shown under “EfficiencyScores” in Fig. 20, some of which are included in Table 3.
1.1.3 Model 2
Input and output choices for Model 2 are shown in Fig. 22.
Model 2 is run with the same parameter choices as Model 1, see Fig. 19. Efficiency scores and virtual weight data, as reported in Table 4, are displayed as part of the solution under “WeightedData”, see Fig. 23.
Model 2 is also run with categorical variables. To do this, the categories, which are originally rural, satellite urban and urban, need to be re-coded, as pyDEA requires a numerical coding where category 1 is the least favourable category (rural stores in the case study), followed by category 2 (satellite urban) and category 3 (urban). Column “LocationCategory” contains this numerical coding. pyDEA will first consider only DMUs with location category 1, then consider categories 1 and 2 (but assess only efficiency scores of category 2 stores), and finally all categories (assessing efficiency of category 3 stores). Inputs and outputs are chosen as before (Fig. 22), and the categorical variable is chosen under parameter “Options”, as in Fig. 24. Only column names that were not already selected as input or output can be chosen here.
The resulting efficiency scores are now for the categorical model, and categories are also displayed as part of the results, as shown in Fig. 25.
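The effect of the categorical variable can be pictured as a restriction of the reference set. The sketch below assumes, as in the categorical model of Banker and Morey (1986b), that a store is compared only with stores in its own or a less favourable category; it illustrates that idea and is not pyDEA's implementation. The store names and category assignments are hypothetical.

```python
# Hypothetical stores; 1 = rural, 2 = satellite urban, 3 = urban.
LOCATION_CATEGORY = {"S1": 1, "S2": 1, "S3": 2, "S4": 3, "S5": 2}


def reference_set(dmu, categories):
    """Stores that may act as peers for `dmu` under the categorical model."""
    own = categories[dmu]
    return sorted(name for name, cat in categories.items() if cat <= own)


# A rural store is compared with rural stores only,
# whereas an urban store is compared with all stores.
rural_reference = reference_set("S1", LOCATION_CATEGORY)
satellite_reference = reference_set("S3", LOCATION_CATEGORY)
urban_reference = reference_set("S4", LOCATION_CATEGORY)
```

A DEA model for a given store would then be solved with λ restricted to the stores in its reference set.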
1.1.4 Model 3
Model 3 considers the subset of stores without special banking facilities. The easiest way to select this subset is to prepare a separate Excel worksheet in the input data file which only contains the corresponding 95 stores. This is loaded as described above, and the correct worksheet is chosen (Fig. 17). Inputs and outputs are selected as explained for Models 1 and 2, and parameter options are also as for Models 1 and 2 (Fig. 19).
The virtual weights reported in Table 6 are from the “WeightedData” solution display, as shown in Fig. 26. The VRS model weight \( v_0 \) is shown in the solution as column “VRS”.
Solution display options “Peers” and “PeerCount” give rise to the discussion of peer stores presented for Model 3 in Sect. 7. As shown in Fig. 27, the peers of each DMU are shown, with associated peer weights, under “Peers”.
“PeerCount” is shown in Fig. 28 and summarised in Table 7. For each DMU, the benchmark stores are shown together with their corresponding peer weight λ. In Fig. 28 we mainly see the peer weight associated with Store 6, which acts as peer for many other stores. When a store is efficient, it is its own only peer, with a peer weight of 1; examples are stores 6, 13 and 15 in Fig. 28. At the end of the PeerCount tab display, the total number of times a store acts as peer for other stores is listed as “Peer count”.
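The peer count described above can be reproduced from a table of peer weights. The following sketch uses hypothetical peer weights (not the case-study results) and assumes that a store's own entry is not counted, following the wording "for other stores"; whether pyDEA counts in exactly this way is not confirmed here.

```python
from collections import Counter

# Hypothetical peer weights: PEERS[dmu][peer] is the lambda of `peer` for `dmu`.
PEERS = {
    "Store 3": {"Store 6": 0.7, "Store 13": 0.3},
    "Store 4": {"Store 6": 1.0},
    "Store 6": {"Store 6": 1.0},
    "Store 13": {"Store 13": 1.0},
}


def peer_count(peers, tol=1e-9):
    """Count how often each store acts as a peer for OTHER stores."""
    counts = Counter()
    for dmu, weights in peers.items():
        for peer, weight in weights.items():
            if peer != dmu and weight > tol:
                counts[peer] += 1
    return counts


def self_peered(peers, tol=1e-9):
    """Stores whose only peer is themselves with weight 1 (efficient stores)."""
    return {
        dmu for dmu, weights in peers.items()
        if list(weights) == [dmu] and abs(weights[dmu] - 1.0) <= tol
    }


counts = peer_count(PEERS)
efficient = self_peered(PEERS)
```

In this made-up example Store 6 acts as peer for two other stores and Store 13 for one.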
The targets listed in Table 8 can be found in pyDEA under “Targets”. Targets for stores 3, 4 and 5 are shown in Fig. 29. The targets in Table 8 were shown as percentage changes, whereas targets in pyDEA are given in absolute terms.
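Converting absolute targets into percentage changes is simple arithmetic: the change is the difference between target and actual value, relative to the actual value. The sketch below uses hypothetical column names and numbers, not values from Table 8.

```python
def percentage_change(actual, target):
    """Change from the actual value to the target, in percent of the actual."""
    return 100.0 * (target - actual) / actual


# Hypothetical actual values and absolute targets for one store.
actual = {"Staff": 5.0, "Revenue": 200.0}
target = {"Staff": 4.0, "Revenue": 200.0}

changes = {name: percentage_change(actual[name], target[name]) for name in actual}
```

A negative value indicates a reduction (here a 20% reduction of the hypothetical input `Staff`), and zero means the store already meets its target for that measure.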
Finally, weight restrictions are added in pyDEA via the weights editor in the parameter section. A screenshot of the weights editor is shown in Fig. 30. The weights editor has separate sections for absolute, virtual and ratio weights. The two virtual weight restrictions requiring the virtual weight of the customer satisfaction score to be between 0.10 and 0.30 are shown in Fig. 30. Weight restrictions are entered as free text, based on the name of the input or output column and the restriction, such as “>= 0.1”. The “Validate weight restrictions” button checks weight restrictions for typos. If there is an input error, the corresponding weight will be highlighted in red, see for instance Fig. 31. Care needs to be taken with weight restrictions as they may render DEA models infeasible, which the editor does not check.
The resulting solution now respects the weight restriction. This is best seen on the WeightedData display, where the weights for MysteryShopYTD are now between 0.10 and 0.30 (Fig. 32).
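A check of this kind can also be done outside pyDEA on exported virtual weights. The sketch below parses restrictions written in a simplified form, a column name followed by ">=" or "<=" and a number; this syntax is an assumption for illustration and is not pyDEA's own parser. The virtual weights are hypothetical.

```python
import re

# Simplified syntax assumed here: "<column name> >= <number>" or "<= <number>".
PATTERN = re.compile(r"^\s*(\w+)\s*(>=|<=)\s*(\d+(?:\.\d+)?)\s*$")


def parse_restriction(text):
    """Split a restriction such as 'MysteryShopYTD >= 0.1' into its parts."""
    match = PATTERN.match(text)
    if match is None:
        raise ValueError("cannot parse weight restriction: %r" % text)
    name, op, bound = match.groups()
    return name, op, float(bound)


def violations(virtual_weights, restrictions, tol=1e-9):
    """Return (DMU, restriction) pairs where a virtual weight breaks a bound."""
    broken = []
    for text in restrictions:
        name, op, bound = parse_restriction(text)
        for dmu, weights in virtual_weights.items():
            value = weights[name]
            ok = value >= bound - tol if op == ">=" else value <= bound + tol
            if not ok:
                broken.append((dmu, text))
    return broken


RESTRICTIONS = ["MysteryShopYTD >= 0.1", "MysteryShopYTD <= 0.3"]
# Hypothetical virtual weights, not results from the case study.
VIRTUAL_WEIGHTS = {
    "Store 3": {"MysteryShopYTD": 0.10},
    "Store 4": {"MysteryShopYTD": 0.30},
    "Store 5": {"MysteryShopYTD": 0.45},
}

broken = violations(VIRTUAL_WEIGHTS, RESTRICTIONS)
```

For a solution that respects the restrictions, as described above, the list of violations would be empty; the hypothetical Store 5 is included only to show what a violation looks like.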
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Raith, A., Rouse, P., Seiford, L.M. (2019). Benchmarking Using Data Envelopment Analysis: Application to Stores of a Post and Banking Business. In: Huber, S., Geiger, M., de Almeida, A. (eds) Multiple Criteria Decision Making and Aiding. International Series in Operations Research & Management Science, vol 274. Springer, Cham. https://doi.org/10.1007/978-3-319-99304-1_1
DOI: https://doi.org/10.1007/978-3-319-99304-1_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99303-4
Online ISBN: 978-3-319-99304-1
eBook Packages: Business and Management, Business and Management (R0)