Automating extract class refactoring: an improved method and its evaluation

Bavota, Gabriele; De Lucia, Andrea; Marcus, Andrian; Oliveto, Rocco

doi:10.1007/s10664-013-9256-x

Published: 04 May 2013

Empirical Software Engineering, Volume 19, pages 1617–1664 (2014)

Gabriele Bavota¹, Andrea De Lucia¹, Andrian Marcus² & Rocco Oliveto³

Abstract

During software evolution, the internal structure of a system undergoes continuous modification. These changes push the source code away from its original design, often reducing its quality, including class cohesion. In this paper we propose a method for automating the Extract Class refactoring. The proposed approach analyzes structural and semantic relationships between the methods in a class to identify chains of strongly related methods. The identified method chains are used to define new classes with higher cohesion than the original class, while preserving the overall coupling between the new classes and the classes interacting with the original class. The approach was first assessed in an artificial scenario in order to calibrate its parameters; the same data was used to compare the new approach with previous work. It was then empirically evaluated on real Blobs from existing open source systems in order to assess how good and useful software engineers consider the proposed refactoring solutions, and how well the proposed refactorings approximate refactorings done by the original developers. We found that the new approach outperforms a previously proposed approach and that developers find the proposed solutions useful in guiding refactorings.
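
To make the idea of "chains of strongly related methods" concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a single combined similarity score per method pair is already available, that pairs below a cut-off are treated as unrelated, and that a chain is a connected component of the remaining graph. The function name `method_chains`, the `threshold` parameter, and all method names and scores are hypothetical.

```python
from collections import defaultdict


def method_chains(similarity, threshold):
    """Group methods into chains (illustrative assumption): connected
    components of the graph that keeps only the method pairs whose
    similarity score is at least `threshold`."""
    methods = sorted({m for pair in similarity for m in pair})

    # Keep only the "strong" relationships as undirected edges.
    neighbours = defaultdict(set)
    for (a, b), score in similarity.items():
        if score >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Depth-first traversal collects each connected component once.
    chains, seen = [], set()
    for start in methods:
        if start in seen:
            continue
        seen.add(start)
        stack, chain = [start], []
        while stack:
            current = stack.pop()
            chain.append(current)
            for other in neighbours[current]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        chains.append(sorted(chain))
    return chains


# Hypothetical scores for a class that mixes file handling and drawing.
similarity = {
    ("open_file", "read_line"): 0.8,
    ("read_line", "close_file"): 0.7,
    ("open_file", "draw_chart"): 0.1,
    ("draw_chart", "set_color"): 0.9,
    ("close_file", "set_color"): 0.05,
}
chains = method_chains(similarity, threshold=0.5)
```

With these made-up scores the file-handling methods and the drawing methods end up in two separate chains, each of which would be a candidate for an extracted class. How the similarity is computed and how very short chains are handled are left open here, since the abstract does not specify them.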

Notes

These approaches do not support Extract Class refactoring.
http://www.eclipse.org/.
http://www.jetbrains.com/idea/features/refactoring.html.
If a private field needs to be shared by two or more of the extracted classes, the implementation of the needed getter and/or setter methods is left to the developer.
It is worth noting that, while the general experimental design is the same, the Max Flow-Min Cut approach (Bavota et al. 2011) was evaluated on artificial Blobs created by merging only two classes, as it is only able to split a Blob into two classes.
The complete results achieved with all possible combinations of parameters can be found in Bavota et al. (2012).
The interested reader can find the interaction plots for all systems in our online appendix (Bavota et al. 2012).
All students voluntarily took part in the study.
In the number of methods we do not count the constructors (both pre- and post-refactoring) or any getter and setter methods that would be added after the refactoring. In this way the sum of the methods of the extracted classes is equal to the number of methods of the Blob class.
The interested reader can find the results by the Max Flow-Min Cut approach for each Blob in our online appendix (Bavota et al. 2012).
A fine-grained analysis of the scores assigned by the students is reported in our online appendix (Bavota et al. 2012).
To avoid bias in the experiment, none of the authors was involved in this evaluation.
None of the 50 students involved in the user study reported in Section 5 was involved in this experiment.
This data was provided by the subjects when sending their results to us.

References

Abadi A, Ettinger R, Feldman YA (2009) Fine slicing for advanced method extraction. In: 3rd workshop on refactoring tools
Abdeen H, Ducasse S, Sahraoui HA, Alloui I (2009) Automatic package coupling and cycle minimization. In: Proceedings of the 16th working conference on reverse engineering. IEEE CS Press, Lille, pp 103–112
Anquetil N, Fourrier C, Lethbridge TC (1999) Experiments with clustering as a software remodularization method. In: Proceedings of the 6th working conference on reverse engineering. IEEE CS Press, Atlanta, GA, pp 235–255
Arisholm E, Sjoberg D (2004) Evaluating the effect of a delegated versus centralized control style on the maintainability of object-oriented software. IEEE Trans Softw Eng 30(8):521–534
Atkinson DC, King T (2005) Lightweight detection of program refactorings. In: Proceedings of the 12th Asia-Pacific software engineering conference. IEEE CS Press, Taipei, pp 663–670
Baeza-Yates R, Ribeiro-Neto B (1999) Modern information retrieval. Addison-Wesley
Basili VR, Briand L, Melo WL (1995) A validation of object-oriented design metrics as quality indicators. IEEE Trans Softw Eng 22(10):751–761
Bavota G, De Lucia A, Oliveto R (2011) Identifying extract class refactoring opportunities using structural and semantic cohesion measures. J Syst Softw 84:397–414
Bavota G, De Lucia A, Marcus A, Oliveto R (2010) A two-step technique for extract class refactoring. In: Proceedings of 25th IEEE international conference on automated software engineering, pp 151–154
Bavota G, De Lucia A, Marcus A, Oliveto R (2012) Automating extract class refactoring: an improved approach and its evaluation. Online appendix https://dl.dropbox.com/u/20652688/emseappendix.zip
Binkley AB, Schach SR (1998) Validation of the coupling dependency metric as a predictor of run-time failures and maintenance measures. In: Proceedings of the 20th international conference on software engineering. Kyoto, Japan, pp 452–455
Bodhuin T, Canfora G, Troiano L (2007) SORMASA: a tool for suggesting model refactoring actions by metrics-led genetic algorithm. In: Proceedings of 1st workshop on refactoring tools. Berlin, Germany, pp 23–24
Briand LC, Wuest J, Lounis H (1999a) Using coupling measurement for impact analysis in object-oriented systems. In: Proceedings of the 15th IEEE international conference on software maintenance. IEEE Press, Oxford, pp 475–482
Briand LC, Wüst J, Ikonomovski SV, Lounis H (1999b) Investigating quality factors in object-oriented designs: an industrial case study. In: Proceedings of the 21st international conference on software engineering. ACM Press, Los Angeles, CA, pp 345–354
Brown WJ, Malveau RC, Brown WH, McCormick III HW, Mowbray TJ (1998) Anti patterns: refactoring software, architectures, and projects in crisis, 1st edn. John Wiley and Sons
Canfora G, Cimitile A, De Lucia A, Di Lucca GA (2001) Decomposing legacy systems into objects: an eclectic approach. Inf Softw Technol 43(6):401–412
Casais E (1992) An incremental class reorganization approach. In: Proceedings of the 6th European conference on object-oriented programming. Utrecht, the Netherlands, pp 114–132
Chidamber SR, Kemerer CF (1994) A metrics suite for object oriented design. IEEE Trans Softw Eng 20(6):476–493
Christl A, Koschke R, Storey MA (2007) Automated clustering to support the reflexion method. Inf Softw Technol 49(3):255–274
Cohen J (1988) Statistical power analysis for the behavioral sciences, 2nd edn. Lawrence Erlbaum Associates
Conover WJ (1998) Practical nonparametric statistics, 3rd edn. Wiley
Cormen TH, Leiserson CE, Rivest RL, Stein C (2001) Introduction to algorithms, 2nd edn, chap 26 (maximum flow). MIT Press and McGraw-Hill
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing by latent semantic analysis. J Am Soc Inf Sci 41(6):391–407
van Deursen A, Kuipers T (1999) Identifying objects using cluster and concept analysis. In: Proceedings of the 21st international conference on software engineering. ACM Press, Los Angeles, CA, pp 246–255
Du Bois B, Demeyer S, Verelst J (2004) Refactoring—improving coupling and cohesion of existing code. In: Proceedings of 11th working conference on reverse engineering. IEEE CS Press, Delft, pp 144–151
Fokaefs M, Tsantalis N, Chatzigeorgiou A, Sander J (2009) Decomposing object-oriented class modules using an agglomerative clustering technique. In: Proceedings of the 25th international conference on software maintenance. Edmonton, Canada, pp 93–101
Fowler M (1999) Refactoring: improving the design of existing code. Addison-Wesley
Girard JF, Koschke R (2000) A comparison of abstract data types and objects recovery techniques. Sci Comput Program 36(2–3):149–181
Gui G, Scott PD (2006) Coupling and cohesion measures for evaluation of component reusability. In: Proceedings of the 5th international workshop on mining software repositories. ACM Press, Shanghai, pp 18–21
Gyimóthy T, Ferenc R, Siket I (2005) Empirical validation of object-oriented metrics on open source software for fault prediction. IEEE Trans Softw Eng 31(10):897–910
Joshi P, Joshi RK (2009) Concept analysis for class cohesion. In: Proceedings of the 13th European conference on software maintenance and reengineering. Kaiserslautern, Germany, pp 237–240
Khomh F, Vaucher S, Guéhéneuc YG, Sahraoui H (2009) A bayesian approach for the detection of code and design smells. In: Proceedings of the 9th international conference on quality software. IEEE CS Press, Hong Kong, pp 305–314
Koschke R, Canfora G, Czeranski J (2006) Revisiting the delta ic approach to component recovery. Sci Comput Program 60(2):171–188
Kuhn A, Ducasse S, Gîrba T (2007) Semantic clustering: identifying topics in source code. Inf Softw Technol 49(3):230–243
Lee Y, Liang B, Wu S, Wang F (1995) Measuring the coupling and cohesion of an object-oriented program based on information flow. In: Proceedings of the international conference on software quality. Maribor, Slovenia, pp 81–90
Li W, Henry S (1993) Maintenance metrics for the object oriented paradigm. In: Proceedings of the first international software metrics symposium, pp 52–60
Liu Y, Poshyvanyk D, Ferenc R, Gyimóthy T, Chrisochoides N (2009) Modelling class cohesion as mixtures of latent topics. In: Proceedings of the 25th IEEE international conference on software maintenance. IEEE Press, Edmonton, pp 233–242
Maletic JI, Marcus A (2001) Supporting program comprehension using semantic and structural information. In: Proceedings of the 23rd international conference on software engineering. IEEE CS Press, Toronto, ON, pp 103–112
Marcus A, Poshyvanyk D, Ferenc R (2008) Using the conceptual cohesion of classes for fault prediction in object-oriented systems. IEEE Trans Softw Eng 34(2):287–300
Marinescu R (2004) Detection strategies: metrics-based rules for detecting design flaws. In: Proceedings of the 20th IEEE international conference on software maintenance. IEEE Computer Society, Washington, DC, pp 350–359
Maruyama K, Shima K (1999) Automatic method refactoring using weighted dependence graphs. In: Proceedings of 21st international conference on software engineering. ACM Press, Los Alamitos, CA, pp 236–245
Mens T, Tourwe T (2004) A survey of software refactoring. IEEE Trans Softw Eng 30(2):126–139
Moha N, Gueheneuc YG, Duchien L, Le Meur AF (2010) Decor: a method for the specification and detection of code and design smells. IEEE Trans Softw Eng 36(1):20–36
Moore I (1996) Automatic inheritance hierarchy restructuring and method refactoring. In: Proceedings of 11th ACM SIGPLAN conference on object-oriented programming, systems, languages, and applications. ACM Press, San Jose, CA, pp 235–250
O’Keeffe M, O’Cinneide M (2006) Search-based software maintenance. In: Proceedings of 10th European conference on software maintenance and reengineering. IEEE CS Press, Bari, pp 249–260
Olbrich S, Cruzes DS, Basili V, Zazworka N (2009) The evolution and impact of code smells: a case study of two open source systems. In: Proceedings of the 2009 3rd international symposium on empirical software engineering and measurement, ESEM ’09, pp 390–400
Oliveto R, Gethers M, Bavota G, Poshyvanyk D, De Lucia A (2011) Identifying method friendships to remove the feature envy bad smell (nier track). In: 33rd IEEE/ACM international conference on software engineering—NIER Track. ACM Press, Hawaii, USA, pp 820–823
Oppenheim AN (1992) Questionnaire design, interviewing and attitude measurement. Pinter Publishers
Poshyvanyk D, Marcus A, Ferenc R, Gyimóthy T (2009) Using information retrieval based coupling measures for impact analysis. Empir Software Eng 14(1):5–32
Praditwong K, Harman M, Yao X (2011) Software module clustering as a multi-objective search problem. IEEE Trans Softw Eng 37(2):264–282
Prete K, Rachatasumrit N, Sudan N, Kim M (2010) Template-based reconstruction of complex refactorings. In: 26th IEEE international conference on software maintenance (ICSM 2010). IEEE Computer Society, Timisoara, 12–18 September 2010, pp 1–10
Sartipi K, Kontogiannis K (2001) Component clustering based on maximal association. In: Proceedings of the 8th working conference on reverse engineering. Stuttgart, Germany, pp 103–114
Seng O, Bauer M, Biehl M, Pache G (2005) Search-based improvement of subsystem decompositions. In: Proceedings of the genetic and evolutionary computation conference. ACM Press, Washington, DC, pp 1045–1051
Seng O, Stammel J, Burkhart D (2006) Search-based determination of refactorings for improving the class structure of object-oriented systems. In: Proceedings of the genetic and evolutionary computation conference. Seattle, Washington, USA, pp 1909–1916
Simon F, Steinbr F, Lewerentz C (2001) Metrics based refactoring. In: Proceedings of the 5th European conference on software maintenance and reengineering. IEEE CS Press, Lisbon, pp 30–38
Stevens W, Myers G, Constantine L (1974) Structured design. IBM Syst J 13(2):115–139
Stewart KJ, Darcy DP, Daniel SL (2006) Opportunities and challenges applying functional data analysis to the study of open source software evolution. Stat Sci 21(2):167–178
Tahvildari L, Kontogiannis K (2003) A metric-based approach to enhance design quality through meta-pattern transformation. In: Proceedings of the 7th European conference on software maintenance and reengineering. Benevento, Italy, pp 183–192
Tonella P (2001) Concept analysis for module restructuring. IEEE Trans Softw Eng 27(4):351–363
Trifu A, Marinescu R (2005) Diagnosing design problems in object oriented systems. In: Proceedings of the 12th working conference on reverse engineering. IEEE Press, Pittsburgh, PA, pp 155–164
Tsantalis N, Chatzigeorgiou A (2009) Identification of move method refactoring opportunities. IEEE Trans Softw Eng 35(3):347–367
Wen Z, Tzerpos V (2004) An effectiveness measure for software clustering algorithms. In: Proceedings of the 12th IEEE international workshop on program comprehension, IWPC ’04. IEEE Computer Society, pp 194–203
Wiggerts TA (1997) Using clustering algorithms in legacy systems remodularization. In: Proceedings of the 4th working conference on reverse engineering. IEEE CS Press, Amsterdam, pp 33–43
WRT (2011) 2011 International Workshop on Refactoring Tools. http://refactoring.info/WRT11. Accessed 22 April 2013

Acknowledgements

We would like to thank all the students who participated in our studies. We would also like to thank the anonymous reviewers for their careful reading of our manuscript and high-quality feedback. Their detailed comments have helped us to substantially revise, extend, and improve the original version of this paper. Andrian Marcus was supported in part by grants from the US National Science Foundation (CCF-0845706 and CCF-1017263).

Author information

Authors and Affiliations

Software Engineering Lab, University of Salerno, Via ponte don Melillo, 84084, Fisciano, SA, Italy
Gabriele Bavota & Andrea De Lucia
SEVERE Group, Department of Computer Science, Wayne State University, 5057 Woodward Ave, Suite 14101.1, Detroit, MI, 48202, USA
Andrian Marcus
Department of Bioscience and Territory, University of Molise, C. da Fonte Lappone, 86090, Pesche, IS, Italy
Rocco Oliveto

Corresponding author

Correspondence to Rocco Oliveto.

Additional information

Communicated by: Arie van Deursen

About this article

Cite this article

Bavota, G., De Lucia, A., Marcus, A. et al. Automating extract class refactoring: an improved method and its evaluation. Empir Software Eng 19, 1617–1664 (2014). https://doi.org/10.1007/s10664-013-9256-x

Issue Date: December 2014
