Abstract
Software Engineering (SE) experiments are traditionally analyzed with statistical tests (e.g., t-tests and ANOVAs) that assume equally spread data across groups (i.e., the homogeneity of variances assumption). In SE, differences in variance across groups are seen not as an opportunity to gain insight into technology performance but as a hindrance to data analysis. We study the role of variance in mature experimental disciplines such as medicine, illustrate by means of simulation the extent to which variance can be informative about technology performance, and analyze a real-life industrial experiment on Test-Driven Development (TDD) in which variance may affect how desirable the technology is. Evaluating technologies based on means alone, as is traditionally done in SE, may be misleading. Technologies with which developers achieve similar performance (i.e., technologies with smaller variances) may be more suitable when the aim is to minimize the risk of adopting them in practice.
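The point about means and variances can be illustrated with a minimal simulation sketch. It is not the simulation reported in the paper: the two technologies, their means and standard deviations, the sample size, the normality assumption, and the "unacceptable performance" threshold are all hypothetical values chosen for illustration.

```python
import random
import statistics


def simulate_scores(mean, sd, n, rng):
    """Draw n hypothetical developer performance scores from a normal distribution."""
    return [rng.gauss(mean, sd) for _ in range(n)]


def share_below(scores, threshold):
    """Fraction of developers whose score falls below the threshold."""
    return sum(score < threshold for score in scores) / len(scores)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
n = 10_000               # hypothetical number of simulated developers per technology

# Two hypothetical technologies: same expected performance, different spread.
tech_a = simulate_scores(mean=50.0, sd=5.0, n=n, rng=rng)
tech_b = simulate_scores(mean=50.0, sd=20.0, n=n, rng=rng)

threshold = 30.0  # hypothetical score below which performance is unacceptable

mean_a, mean_b = statistics.fmean(tech_a), statistics.fmean(tech_b)
sd_a, sd_b = statistics.stdev(tech_a), statistics.stdev(tech_b)
risk_a, risk_b = share_below(tech_a, threshold), share_below(tech_b, threshold)

print(f"A: mean={mean_a:.2f} sd={sd_a:.2f} share below threshold={risk_a:.4f}")
print(f"B: mean={mean_b:.2f} sd={sd_b:.2f} share below threshold={risk_b:.4f}")
```

Under these assumptions, a comparison of means would find the two technologies practically indistinguishable, while the share of developers falling below the threshold differs markedly. That difference is the kind of adoption risk a mean-only analysis would miss.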
Acknowledgments
This research was developed with the support of the Spanish Ministry of Science and Innovation project TIN2014-60490-P.
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Santos, A., Oivo, M., Juristo, N. (2018). Moving Beyond the Mean: Analyzing Variance in Software Engineering Experiments. In: Kuhrmann, M., et al. Product-Focused Software Process Improvement. PROFES 2018. Lecture Notes in Computer Science, vol 11271. Springer, Cham. https://doi.org/10.1007/978-3-030-03673-7_13
DOI: https://doi.org/10.1007/978-3-030-03673-7_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03672-0
Online ISBN: 978-3-030-03673-7
eBook Packages: Computer Science, Computer Science (R0)