Empirical Software Engineering, Volume 24, Issue 1, pp 287–328

Syntax, predicates, idioms — what really affects code complexity?

  • Shulamyt Ajami
  • Yonatan Woodbridge
  • Dror G. Feitelson


Abstract

Program comprehension concerns the ability to understand code written by others. But not all code is the same. We use an experimental platform fashioned as an online game-like environment to measure how quickly and accurately 220 professional programmers can interpret code snippets with similar functionality but different structures; snippets that take longer to understand or produce more errors are considered harder. The results indicate, inter alia, that for loops are significantly harder than ifs, that some but not all negations make a predicate harder, and that loops counting down are slightly harder than loops counting up. This demonstrates how the effect of syntactic structures, different ways to express predicates, and the use of known idioms can be measured empirically, and that syntactic structures are not necessarily the most important factor. We also found that the metrics of time to understanding and errors made are not necessarily equivalent: loops counting down took slightly longer, but loops with unusual bounds caused many more errors. By amassing many more empirical results like these it may be possible to derive better code complexity metrics than we have today, and also to better appreciate their limitations.
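The kinds of structural variation studied can be illustrated with functionally equivalent snippets. The sketch below is hypothetical (the function names and code are illustrative, not the actual experimental stimuli): each pair computes the same result but differs along one of the dimensions the abstract mentions, a loop counting down versus counting up, and a negated versus a positively phrased predicate.

```python
def sum_up(arr):
    # Loop counting up -- the conventional, "easier" variant.
    total = 0
    for i in range(len(arr)):
        total += arr[i]
    return total

def sum_down(arr):
    # Loop counting down -- functionally identical, but found
    # slightly harder to interpret in the experiments.
    total = 0
    for i in range(len(arr) - 1, -1, -1):
        total += arr[i]
    return total

def count_positives(arr):
    # Predicate phrased positively.
    n = 0
    for x in arr:
        if x > 0:
            n += 1
    return n

def count_positives_negated(arr):
    # The same predicate phrased with a negation -- one of the
    # variations whose effect on comprehension was measured.
    n = 0
    for x in arr:
        if not (x <= 0):
            n += 1
    return n
```

Pairs like these hold functionality fixed so that any difference in time-to-answer or error rate can be attributed to the structural variation alone.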


Keywords: Code complexity · Program understanding · Gamification



Acknowledgements

Many thanks to Micha Mandel for his help with the statistical analysis, and to the anonymous reviewers for their comments and suggestions.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Science, The Hebrew University, Jerusalem, Israel
  2. Department of Statistics, The Hebrew University, Jerusalem, Israel
