Empirical Software Engineering

, Volume 20, Issue 2, pp 549–575 | Cite as

Detecting and refactoring code smells in spreadsheet formulas

  • Felienne Hermans
  • Martin Pinzger
  • Arie van Deursen


Spreadsheets are used extensively in business processes around the world and just like software, spreadsheets are changed throughout their lifetime causing understandability and maintainability issues. This paper adapts known code smells to spreadsheet formulas. To that end we present a list of metrics by which we can detect smelly formulas; a visualization technique to highlight these formulas in spreadsheets and a method to automatically suggest refactorings to resolve smells. We implemented the metrics, visualization and refactoring suggestions techniques in a prototype tool and evaluated our approach in three studies. Firstly, we analyze the EUSES spreadsheet corpus, to study the occurrence of the formula smells. Secondly, we analyze ten real life spreadsheets, and interview the spreadsheet owners about the identified smells. Finally, we generate refactoring suggestions for those ten spreadsheets and study the implications. The results of these evaluations indicate that formula smells are common, that they can reveal real errors and weaknesses in spreadsheet formulas and that in simple cases they can be refactored.


Spreadsheets Code smells Refactoring 


  1. Abraham R, Erwig M (2005) How to communicate unit error messages in spreadsheets. In: Proceedings of WEUSE ’05, pp 1–5Google Scholar
  2. Abraham R, Erwig M (2006) Inferring templates from spreadsheets. In: Proceedings of ICSE ’06, pp 182–191Google Scholar
  3. Abraham R, Erwig M (2007) Ucheck: a spreadsheet type checker for end users. J Vis Lang Comput 18:71–95. doi: 10.1016/j.jvlc.2006.06.001 CrossRefGoogle Scholar
  4. Alves TL, Ypma C, Visser J (2010) Deriving metric thresholds from benchmark data. In: Proceedings of ICSM ’10. IEEE Computer Society, pp 1–10Google Scholar
  5. Ayalew Y, Clermont M, Mittermeir RT (2000) Detecting errors in spreadsheets. In: Proceedings of EuSpRIG ’00, pp 51–62Google Scholar
  6. Badame S, Dig D (2012) Refactoring meets spreadsheet formulas. In: Proceedings of ICSM ’12, pp 399–409Google Scholar
  7. Bregar A (2004) Complexity metrics for spreadsheet models. In: Proceedings of EuSpRIG ’04, p 9Google Scholar
  8. Chambers C, Erwig M (2009) Automatic detection of dimension errors in spreadsheets. J Vis Lang Comput 20:269–283CrossRefGoogle Scholar
  9. Cunha J, Fernandes JP, Mendes J, Hugo Pacheco JS (2012) Towards a catalog of spreadsheet smells. In: Proceeding of ICCSA’12. LNCSGoogle Scholar
  10. Cunha J, Fernandes JP, Peixoto C, Saraiva J (2012) A quality model for spreadsheets. In: Proceedings of QUATIC ’12, pp 231–236Google Scholar
  11. Cunha J, Saraiva J, Visser J (2009) Discovery-based edit assistance for spreadsheets. In: Proceedings of VL/HCC ’09, pp 233–237. IEEEGoogle Scholar
  12. Erwig M (2009) Software engineering for spreadsheets. IEEE Softw 26:25–30. doi: 10.1109/MS.2009.140 CrossRefGoogle Scholar
  13. Fisher M, Rothermel G (2005) The EUSES spreadsheet corpus: a shared resource for supporting experimentation with spreadsheet dependability mechanisms. In: Proceedings of WEUSE ’05, pp 47–51Google Scholar
  14. Fowler M (1999) Refactoring: improving the design of existing code. Addison-Wesley Longman Publishing Co., Inc., BostonGoogle Scholar
  15. Hendry DG, Green TRG (1994) Creating, comprehending and explaining spreadsheets: a cognitive interpretation of what discretionary users think of the spreadsheet model. Int J Human-Computer Stud 40(6):1033–1065CrossRefGoogle Scholar
  16. Hermans F, Pinzger M, van Deursen A (2010) Automatically extracting class diagrams from spreadsheets. In: Proceedings of ECOOP ’10, pp 52–75Google Scholar
  17. Hermans F, Pinzger M, van Deursen A (2011) Breviz: spreadsheet visualization and quality analysis. In: Proceedings of EuSpRIG ’11, pp 63–72Google Scholar
  18. Hermans F, Pinzger M, van Deursen A (2012a) Detecting and visualizing inter-worksheet smells in spreadsheets. In: Proceedings of ICSE ’12, pp 441–451Google Scholar
  19. Hermans F, Pinzger M, van Deursen A (2012b) Detecting code smells in spreadsheet formulas. In: Proceedings of ICSM ’12, pp 409–418Google Scholar
  20. Hermans F, Pinzger M, van Deursen A (2012c) Measuring spreadsheet formula understandability. In: Proceedings of EuSpRIG ’12Google Scholar
  21. Hodnigg K, Mittermeir R (2008) Metrics-based spreadsheet visualization: support for focused maintenance. In: Proceedings of EuSpRIG ’08, p 16Google Scholar
  22. Hole S, McPhee D, Lohfink A (2009) Mining spreadsheet complexity data to classify end user developers. In: Proceedings of ICDM ’09. CSREA Press, pp 573–579Google Scholar
  23. Kamiya T, Kusumoto S, Inoue K (2002) Ccfinder: a multilinguistic token-based code clone detection system for large scale source code. TSE 28(7):654–670Google Scholar
  24. Lanza M, Marinescu R, Ducasse S (2005) Object-oriented metrics in practice. Springer-Verlag, New York, Inc., SecaucusGoogle Scholar
  25. Marinescu R (2001) Detecting design flaws via metrics in object-oriented systems. In: Proceedings of TOOLS ’01. IEEE Computer Society, pp 173–182Google Scholar
  26. Moha N, Gueheneuc YG, Duchien L, Le Meur AF (2010) Decor: a method for the specification and detection of code and design smells. TSE 36(1):20–36CrossRefGoogle Scholar
  27. Nardi B, Miller J (1990) The spreadsheet interface: a basis for end user programming. In: Proceeding of the IFIP conference on human-computer interaction (INTERACT). North-Holland, pp 977–983Google Scholar
  28. Olbrich S, Cruzes DS, Basili V, Zazworka N (2009) The evolution and impact of code smells: a case study of two open source systems. In: Proceedings of ESEM ’09, pp 390–400Google Scholar
  29. Panko R (2006) Facing the problem of spreadsheet errors. Decis Line 37(5):8–10Google Scholar
  30. Panko RR (1998) What we know about spreadsheet errors. J End User Comput 10(2):15–21CrossRefGoogle Scholar
  31. Powell S, Baker K, Lawson B (2009) Errors in operational spreadsheets: a review of the state of the art. In: Proceedings of HICCS ’09. IEEE Computer Society, pp 1–8Google Scholar
  32. Raffensperger J (2009) New guidelines for spreadsheets. Int J Bus Econ 2:141–154Google Scholar
  33. Rajalingham K, Chadwick D, Knight B, Edwards D (2000) Quality control in spreadsheets: a software engineering-based approach to spreadsheet development. In: Proceedings HICSS ’00, pp 133–143Google Scholar
  34. Wettel R, Lanza M (2008) Visually localizing design problems with disharmony maps. In: Proceedings of SoftVis ’08, pp 155–164Google Scholar
  35. Wettel R, Lanza M, Robbes R (2011) Software systems as cities: a controlled experiment. In: Proceedings of ICSE ’11, pp 551–560Google Scholar
  36. Winston W (2001) Executive education opportunities. OR/MS Today 28(4)Google Scholar

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Felienne Hermans
    • 1
  • Martin Pinzger
    • 1
  • Arie van Deursen
    • 1
  1. 1.Delft University of TechnologyDelftThe Netherlands

Personalised recommendations