Empirical Software Engineering, Volume 19, Issue 6, pp 1617–1664

Automating extract class refactoring: an improved method and its evaluation

  • Gabriele Bavota
  • Andrea De Lucia
  • Andrian Marcus
  • Rocco Oliveto
Article

Abstract

During software evolution, the internal structure of a system undergoes continuous modification. These changes push the source code away from its original design, often reducing its quality, including class cohesion. In this paper we propose a method for automating the Extract Class refactoring. The proposed approach analyzes structural and semantic relationships between the methods in a class to identify chains of strongly related methods. The identified method chains are used to define new classes with higher cohesion than the original class, while preserving the overall coupling between the new classes and the classes interacting with the original class. The approach was first assessed in an artificial scenario in order to calibrate its parameters; the same data was used to compare the new approach with previous work. It was then empirically evaluated on real Blobs from existing open source systems to assess how good and useful software engineers consider the proposed refactoring solutions, and how well the proposed refactorings approximate those performed by the original developers. We found that the new approach outperforms a previously proposed approach and that developers find the proposed solutions useful in guiding refactorings.
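The abstract does not spell out how the method chains are computed. As a minimal sketch only, assume that a combined structural and semantic similarity in [0, 1] is already available for pairs of methods, that weak links below a threshold are discarded, and that each remaining connected group of methods forms one chain. The function name `method_chains`, the parameter `min_coupling`, and the sample methods and scores are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple


def method_chains(methods: List[str],
                  similarity: Dict[Tuple[str, str], float],
                  min_coupling: float) -> List[Set[str]]:
    """Return groups of methods linked by similarities >= min_coupling.

    Assumption (not from the paper): a chain is a connected component of
    the undirected graph left after dropping edges below the threshold.
    """
    # Keep only the edges that survive the threshold.
    neighbours: Dict[str, Set[str]] = {m: set() for m in methods}
    for (a, b), weight in similarity.items():
        if weight >= min_coupling:
            neighbours[a].add(b)
            neighbours[b].add(a)

    chains: List[Set[str]] = []
    seen: Set[str] = set()
    for start in methods:
        if start in seen:
            continue
        # Depth-first traversal collects every method reachable from start.
        chain: Set[str] = set()
        stack = [start]
        while stack:
            current = stack.pop()
            if current in chain:
                continue
            chain.add(current)
            stack.extend(neighbours[current] - chain)
        seen |= chain
        chains.append(chain)
    return chains


# Hypothetical class with two responsibilities: file access and drawing.
methods = ["open", "read", "close", "render", "resize"]
similarity = {
    ("open", "read"): 0.8,
    ("read", "close"): 0.7,
    ("render", "resize"): 0.9,
    ("close", "render"): 0.1,  # weak link, dropped at min_coupling = 0.5
}
chains = method_chains(methods, similarity, min_coupling=0.5)
```

With these invented scores, the weak link between `close` and `render` is cut and the class splits into two candidate classes. Lowering `min_coupling` below 0.1 keeps every link and yields a single chain, which suggests why such a threshold would need calibration.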

Keywords

Extract class refactoring, Cohesion, Coupling, Graph clustering algorithms

Notes

Acknowledgements

We would like to thank all the students who participated in our studies. We would also like to thank the anonymous reviewers for their careful reading of our manuscript and their high-quality feedback. Their detailed comments helped us to substantially revise, extend, and improve the original version of this paper. Andrian Marcus was supported in part by grants from the US National Science Foundation (CCF-0845706 and CCF-1017263).

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Gabriele Bavota (1)
  • Andrea De Lucia (1)
  • Andrian Marcus (2)
  • Rocco Oliveto (3)
  1. Software Engineering Lab, University of Salerno, Fisciano, Italy
  2. SEVERE Group, Department of Computer Science, Wayne State University, Detroit, USA
  3. Department of Bioscience and Territory, University of Molise, Pesche, Italy
