Predicting Regression Test Failures Using Genetic Algorithm-Selected Dynamic Performance Analysis Metrics

  • Michael Mayo
  • Simon Spacey
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8084)


A novel framework for predicting regression test failures is proposed. The basic principle embodied in the framework is to use performance analysis tools to capture the runtime behaviour of a program as it executes each test in a regression suite. The performance information is then used to build a dynamically predictive model of test outcomes. Our framework is evaluated using a genetic algorithm for dynamic metric selection in combination with state-of-the-art machine learning classifiers. We show that if a program is modified and some tests subsequently fail, then it is possible to predict with considerable accuracy which of the remaining tests will also fail. These predictions can be used to help prioritise tests in time-constrained testing environments.


regression testing, test failure prediction, program analysis, machine learning, genetic metric selection
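The idea summarised in the abstract (record dynamic metrics per test, let a genetic algorithm choose a subset of those metrics, then train a classifier to predict pass or fail) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the data are synthetic, a simple nearest-centroid classifier stands in for the classifiers evaluated in the paper, and all function names and parameter values (`make_data`, `centroid_accuracy`, `select_metrics`, population size, mutation rate) are hypothetical.

```python
import random

def make_data(rng, n_tests=60, n_metrics=8):
    """Synthetic per-test metric vectors (assumption: only metrics 0 and 1 relate to failure)."""
    rows, labels = [], []
    for k in range(n_tests):
        fails = k % 2  # alternate outcomes so both classes are always present
        row = [rng.gauss(0.0, 1.0) for _ in range(n_metrics)]
        if fails:
            row[0] += 3.0
            row[1] -= 3.0
        rows.append(row)
        labels.append(fails)
    return rows, labels

def centroid_accuracy(mask, train, test):
    """Accuracy of a nearest-centroid classifier that only sees metrics where mask is 1."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # no metrics selected: treat as useless
    (xtr, ytr), (xte, yte) = train, test
    cents = {}
    for c in (0, 1):
        members = [x for x, y in zip(xtr, ytr) if y == c]
        cents[c] = [sum(x[i] for x in members) / len(members) for i in idx]
    correct = 0
    for x, y in zip(xte, yte):
        def dist(c):
            return sum((x[i] - m) ** 2 for i, m in zip(idx, cents[c]))
        correct += int(min((0, 1), key=dist) == y)
    return correct / len(xte)

def select_metrics(train, valid, n_metrics, rng, pop_size=20, generations=15):
    """Genetic search over bitmasks: tournament selection, one-point crossover, bit-flip mutation."""
    def fitness(mask):
        # Validation accuracy with a small penalty per selected metric
        return centroid_accuracy(mask, train, valid) - 0.01 * sum(mask)
    pop = [[rng.randint(0, 1) for _ in range(n_metrics)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        nxt = pop[:2]  # elitism: carry the two best masks forward unchanged
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n_metrics)
            child = a[:cut] + b[cut:]
            child = [bit ^ int(rng.random() < 0.05) for bit in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

rng = random.Random(0)
rows, labels = make_data(rng)
train = (rows[:30], labels[:30])
valid = (rows[30:45], labels[30:45])
held_out = (rows[45:], labels[45:])
best = select_metrics(train, valid, 8, rng)
acc = centroid_accuracy(best, train, held_out)
print("selected metric mask:", best)
print("held-out accuracy:", acc)
```

The fitness function here scores a candidate metric subset by validation accuracy, with a small penalty for each selected metric so that smaller subsets are preferred. In a real setting the rows would come from a performance analysis tool run over each regression test rather than from a random generator.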





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Michael Mayo (1)
  • Simon Spacey (1)
  1. Waikato University, Hamilton, New Zealand
