Abstract
Continuous Integration (CI) is a cornerstone of modern quality assurance, providing on-demand builds (compilation and tests) of code changes or software releases. Yet, existing CI systems do little to help developers interpret build results, in particular when facing build inflation. Build inflation arises when each code change must be built on dozens of combinations (configurations) of runtime environments (REs), operating systems (OSes), and hardware architectures (HAs). A code change C1 sent to the CI system may introduce programming faults that cause all of these builds to fail, while a change C2 introducing a new library dependency might cause only one particular build configuration to fail. Consequently, the single build failure due to C2 will be “hidden” among the dozens of build failures due to C1 when the CI system reports the build results. We call this phenomenon build inflation because it may bias developers’ interpretation of build results by “hiding” certain types of faults.
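The hiding effect described above can be sketched with a toy build matrix. All package names, versions, and faults below are hypothetical and chosen only to illustrate the imbalance; CPAN Testers involves far more configurations:

```python
from itertools import product

# Hypothetical build matrix: runtime environments (REs) x operating systems (OSes).
perl_versions = ["5.8", "5.10", "5.12", "5.14"]
oses = ["linux", "freebsd", "darwin", "mswin32"]

def build(change, perl, os):
    """Simulated build outcome: C1 carries a programming fault that fails
    on every configuration; C2 adds a dependency that is missing on one
    specific configuration only."""
    if change == "C1":
        return "FAIL (syntax error)"
    if change == "C2" and (perl, os) == ("5.8", "mswin32"):
        return "FAIL (missing module)"
    return "PASS"

results = {(c, p, o): build(c, p, o)
           for c in ("C1", "C2")
           for p, o in product(perl_versions, oses)}

failures = [key for key, outcome in results.items()
            if outcome.startswith("FAIL")]

# A flat report lists 17 red builds; the single configuration-specific
# failure of C2 is easily overlooked among the 16 failures of C1.
print(sum(1 for f in failures if f[0] == "C1"))  # 16
print(sum(1 for f in failures if f[0] == "C2"))  # 1
```

A flat list of failed builds gives both changes the same visual weight per failure, which is exactly why C2's environment-specific fault gets drowned out.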
In this paper, we study build inflation through a large-scale analysis of the relationship between REs, OSes, and build failures across 30 million builds of the CPAN repository on the CPAN Testers package-level CI system. We show that the builds of Perl packages may fail differently on different REs, different OSes, and any combination thereof. Thus, the results provided by CPAN Testers require filtering and selection to identify real trends of build failures among the many failures. A manual analysis of 791 build failures shows that dependency faults (missing modules) and programming faults (undefined values) are the main reasons for failures, with dependency faults being easier to fix. We conclude with recommendations for practitioners and researchers on interpreting build results, as well as for tool builders, who should improve the scheduling of builds and the reporting of build failures.
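One simple filtering strategy in the spirit of these findings (a sketch over made-up CPAN Testers-style report tuples, not the paper's actual analysis) is to aggregate failures per build configuration before reading individual results, separating faults that fail everywhere from configuration-specific ones:

```python
from collections import defaultdict

# Hypothetical reports: (package, perl_version, os, status).
# Package names and data are invented for illustration.
reports = [
    ("Foo-Bar", "5.14", "linux",   "FAIL"),
    ("Foo-Bar", "5.14", "freebsd", "FAIL"),
    ("Foo-Bar", "5.8",  "linux",   "FAIL"),
    ("Baz-Qux", "5.8",  "mswin32", "FAIL"),
    ("Baz-Qux", "5.8",  "linux",   "PASS"),
    ("Baz-Qux", "5.14", "linux",   "PASS"),
]

# For each package, compare the set of failing configurations with the
# set of all configurations it was built on.
per_package = defaultdict(lambda: {"fail": set(), "all": set()})
for pkg, perl, os, status in reports:
    per_package[pkg]["all"].add((perl, os))
    if status == "FAIL":
        per_package[pkg]["fail"].add((perl, os))

for pkg, cfgs in sorted(per_package.items()):
    if cfgs["fail"] == cfgs["all"]:
        verdict = "fails everywhere: likely a programming fault"
    elif len(cfgs["fail"]) == 1:
        verdict = "fails on one configuration: likely environment-specific"
    elif cfgs["fail"]:
        verdict = "mixed failures"
    else:
        verdict = "passes"
    print(pkg, "->", verdict)
```

Grouping by configuration rather than counting raw failures is one way a dashboard could surface a single dependency fault that a flat list would bury.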
Notes
In this paper, we use the term “package” in its usual sense while Perl developers talk about “distribution”.
Most of the vectors did not contain any build failure, which is expected.
The models are not useful to predict build failures in practice because they only include OSes and REs and ignore other factors. However, they are useful to validate the extent to which OSes and REs alone explain build failures, i.e., to validate the strength of the link between build configurations and build failures.
Acknowledgements
Part of this work was funded by the NSERC Discovery Grant and Canada Research Chair programs.
Additional information
Communicated by: Romain Robbes
Cite this article
Zolfagharinia, M., Adams, B. & Guéhéneuc, YG. A study of build inflation in 30 million CPAN builds on 13 Perl versions and 10 operating systems. Empir Software Eng 24, 3933–3971 (2019). https://doi.org/10.1007/s10664-019-09709-6