Regression Test Selection for Distributed Software Histories

  • Milos Gligoric
  • Rupak Majumdar
  • Rohan Sharma
  • Lamyaa Eloussi
  • Darko Marinov
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8559)


Regression test selection analyzes incremental changes to a codebase and chooses to run only those tests whose behavior may be affected by the latest changes in the code. By focusing on a small subset of all the tests, the testing process runs faster and can be more tightly integrated into the development process. Existing techniques for regression test selection consider two versions of the code at a time, effectively assuming a development process where changes to the code occur in a linear sequence.
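The two-version setting described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each test has a recorded set of files it depends on and that the change between the two versions is given as a set of modified files; the names `select_tests`, `deps`, and `changed` are hypothetical and do not come from the paper.

```python
from typing import Dict, Set


def select_tests(deps: Dict[str, Set[str]], changed: Set[str]) -> Set[str]:
    """Return the tests whose recorded dependencies overlap the changed files."""
    return {test for test, files in deps.items() if files & changed}


# Hypothetical dependency map: test name -> files the test exercised.
deps = {
    "test_parser": {"parser.py", "lexer.py"},
    "test_lexer": {"lexer.py"},
    "test_cli": {"cli.py"},
}

# Hypothetical change between the old and the new version.
changed = {"lexer.py"}

selected = select_tests(deps, changed)
print(sorted(selected))  # ['test_lexer', 'test_parser']
```

Under these assumptions the selection is safe in the sense used above: any test that touches a changed file is kept, and only tests with no overlap (here `test_cli`) are skipped.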

Modern development processes that use distributed version-control systems are more complex. Software version histories are generally modeled as directed graphs: in addition to versions that follow one another linearly, multiple versions can be related by other commands such as branch, merge, rebase, cherry-pick, and revert. This paper describes a regression test-selection technique for software developed using modern distributed version-control systems. Because the technique models branch and merge commands directly, it computes safe test sets that can be substantially smaller than those obtained by applying previous techniques to a linearization of the software history.
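To make the graph view concrete, the sketch below represents a small history as a map from each version to its parent versions and computes the lowest common ancestors of two branch heads, a query that commonly arises when reasoning about merges. It is an illustration under stated assumptions, not the paper's algorithm: the commit names, the `parents` map, and both helper functions are hypothetical, and the quadratic ancestor check is chosen for clarity rather than speed.

```python
from typing import Dict, List, Set


def ancestors(parents: Dict[str, List[str]], commit: str) -> Set[str]:
    """Return the commit itself plus every commit reachable via parent edges."""
    seen: Set[str] = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def lowest_common_ancestors(parents: Dict[str, List[str]], a: str, b: str) -> Set[str]:
    """Return common ancestors of a and b that are not proper ancestors
    of any other common ancestor."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        c
        for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    }


# Hypothetical history: r0 -> r1, then two branches (m1, b1) merged in "merge".
parents = {
    "r0": [],
    "r1": ["r0"],
    "m1": ["r1"],
    "b1": ["r1"],
    "merge": ["m1", "b1"],
}

print(lowest_common_ancestors(parents, "m1", "b1"))  # {'r1'}
```

In this hypothetical history the merge commit has two parents, which is exactly the situation a purely linear, two-version view cannot represent directly.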

We evaluate our technique on software histories of several large open-source projects. The results are encouraging: our technique obtained an average of 10.89× reduction in the number of tests over an existing technique while still selecting all tests whose behavior may differ.


Regression Test; Test Selection; Software Maintenance; Lower Common Ancestor; History Graph



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Milos Gligoric (1)
  • Rupak Majumdar (2)
  • Rohan Sharma (1)
  • Lamyaa Eloussi (1)
  • Darko Marinov (1)
  1. University of Illinois at Urbana-Champaign, USA
  2. Max Planck Institute for Software Systems, Germany
