Empirical Software Engineering, Volume 24, Issue 1, pp 417–443

Shorter identifier names take longer to comprehend

  • Johannes C. Hofmeister
  • Janet Siegmund
  • Daniel V. Holt


Developers spend the majority of their time reading code, a process in which identifier names play a key role. Although many identifier naming styles exist, they often lack an empirical basis and it is not clear whether short or long identifier names facilitate comprehension. In this paper, we investigate the effect of different identifier naming styles (single letters, abbreviations, and words) on program comprehension. We conducted an experimental study with 72 professional C# developers who had to locate defects in source code snippets. We used a within-subjects design, such that each developer worked with all three versions of identifier naming styles, and we measured the time it took them to find a defect. We found that word identifiers led to a 19% increase in speed to find defects compared to meaningless single letters and abbreviations, but we did not find a difference between letters and abbreviations. The results of our study suggest that code is more difficult to comprehend when it contains only letters and abbreviations as identifier names. Words as identifier names facilitate program comprehension and may help to save costs and improve software quality.
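To make the three naming styles concrete, the sketch below writes one small function three times, changing only the identifier names. It is purely illustrative: the study used C# snippets, and the function, its names, and the choice of Python here are assumptions made for this example, not material from the experiment.

```python
# Hypothetical illustration of the three identifier naming styles.
# None of these functions come from the study's materials.

def f(a, b):
    # Style 1: single-letter identifiers
    c = 0
    for d in a:
        if d > b:
            c += 1
    return c


def cnt_abv_thr(vals, thr):
    # Style 2: abbreviated identifiers
    cnt = 0
    for val in vals:
        if val > thr:
            cnt += 1
    return cnt


def count_above_threshold(values, threshold):
    # Style 3: full-word identifiers
    count = 0
    for value in values:
        if value > threshold:
            count += 1
    return count
```

All three functions behave identically. Only the names differ, which is the kind of variable a within-subjects comparison of naming styles manipulates.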


Identifier names, Program comprehension, Professional C# developers, Psychology, Defect detection, Software quality



This work has been supported by the DFG grant SI 2045/2-1. Janet Siegmund’s work is further funded by the Bavarian State Ministry of Education, Science and the Arts in the framework of the Centre Digitisation.Bavaria (ZD.B).

Compliance with Ethical Standards

This study was performed in accordance with the ethical standards of the Department of Psychology, Heidelberg University, Germany.

Conflict of interests

The authors declare that they have no conflict of interest.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Johannes C. Hofmeister (1, corresponding author)
  • Janet Siegmund (1)
  • Daniel V. Holt (2)

  1. University of Passau, Passau, Germany
  2. Heidelberg University, Heidelberg, Germany
