
Journal of Computer Science and Technology, Volume 33, Issue 6, pp 1320–1336

Search-Based Cost-Effective Software Remodularization

  • Rim Mahouachi
Regular Paper

Abstract

Software modularization is a technique used to divide a software system into independent modules (packages) that are expected to be cohesive and loosely coupled. However, as software systems evolve over time to meet new requirements, their modularizations become complex and gradually lose their quality. Thus, it is challenging to automatically optimize the distribution of classes among packages, a task also known as remodularization. To alleviate this issue, we introduce a new approach that optimizes software modularization by moving classes to more suitable packages. In addition to improving design quality and preserving semantic coherence, our approach treats the refactoring effort as an objective in itself while optimizing software modularization. We adapt the Elitist Non-dominated Sorting Genetic Algorithm (NSGA-II) of Deb et al. to find the best sequence of refactorings that 1) maximizes structural quality, 2) maximizes the semantic cohesiveness of packages (evaluated by a semantic measure based on WordNet), and 3) minimizes the refactoring effort. We report the results of an evaluation of our approach on open-source projects, and we show that our proposal is able to produce a coherent and useful sequence of recommended refactorings, both in terms of quality metrics and from the developers’ point of view.
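The three objectives named in the abstract can be illustrated with a minimal sketch of Pareto dominance, the comparison on which NSGA-II's non-dominated sorting is built. This is not the paper's implementation: the `Candidate` class, its field names, and the scores below are hypothetical, and the full algorithm additionally ranks successive fronts, applies crowding distance, and evolves the population with crossover and mutation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """A candidate refactoring sequence with hypothetical objective scores."""
    name: str
    structural_quality: float  # objective 1: to be maximized
    semantic_cohesion: float   # objective 2: to be maximized
    effort: float              # objective 3: to be minimized


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on all objectives and better on at least one."""
    no_worse = (
        a.structural_quality >= b.structural_quality
        and a.semantic_cohesion >= b.semantic_cohesion
        and a.effort <= b.effort
    )
    strictly_better = (
        a.structural_quality > b.structural_quality
        or a.semantic_cohesion > b.semantic_cohesion
        or a.effort < b.effort
    )
    return no_worse and strictly_better


def pareto_front(population: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates (the first front)."""
    return [c for c in population if not any(dominates(o, c) for o in population)]


# Illustrative values only; they are not taken from the paper's evaluation.
population = [
    Candidate("s1", 0.80, 0.60, 12.0),
    Candidate("s2", 0.70, 0.55, 15.0),  # worse than s1 on all three objectives
    Candidate("s3", 0.60, 0.70, 5.0),   # lower quality, but more cohesive and cheaper
    Candidate("s4", 0.80, 0.60, 20.0),  # same scores as s1, but more effort
]
front_names = [c.name for c in pareto_front(population)]
print(front_names)
```

In this toy population, `s1` and `s3` are incomparable: neither is better on every objective, so both represent legitimate trade-offs between quality and effort that a developer could choose from.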

Keywords

remodularization, search-based software engineering, refactoring effort, multi-objective optimization, semantics dependency


Supplementary material

11390_2018_1892_MOESM1_ESM.pdf (300 kb)
ESM 1 (PDF 299 kb)

References

  [1] Lehman M M. On understanding laws, evolution, and conservation in the large-program life cycle. Journal of Systems and Software, 1984, 1: 213-221.
  [2] Eick S G, Graves T L, Karr A F, Marron J S, Mockus A. Does code decay? Assessing the evidence from change management data. IEEE Transactions on Software Engineering, 2001, 27(1): 1-12.
  [3] Lanza M, Marinescu R. Object-oriented Metrics in Practice: Using Software Metrics to Characterize, Evaluate, and Improve the Design of Object-Oriented Systems. Springer-Verlag Berlin Heidelberg, 2006.
  [4] Fowler M, Beck K, Brant J, Opdyke W, Roberts D. Refactoring: Improving the Design of Existing Code. Addison-Wesley Professional, 1999.
  [5] Harman M, Hierons R M, Proctor M. A new representation and crossover operator for search-based optimization of software modularization. In Proc. the 4th Annual Conference on Genetic and Evolutionary Computation, July 2002, pp.1351-1358.
  [6] Mitchell B S, Mancoridis S. On the automatic modularization of software systems using the bunch tool. IEEE Transactions on Software Engineering, 2006, 32(3): 193-208.
  [7] Seng O, Bauer M, Biehl M, Pache G. Search-based improvement of subsystem decompositions. In Proc. the 7th Annual Conference on Genetic and Evolutionary Computation, June 2005, pp.1045-1051.
  [8] Bavota G, de Lucia A, Marcus A, Oliveto R. Software remodularization based on structural and semantic metrics. In Proc. the 17th Working Conference on Reverse Engineering, October 2010, pp.195-204.
  [9] Harman M, Tratt L. Pareto optimal search based refactoring at the design level. In Proc. the 9th Annual Conference on Genetic and Evolutionary Computation, July 2007, pp.1106-1113.
  [10] Bavota G, Carnevale F, de Lucia A, di Penta M, Oliveto R. Putting the developer in-the-loop: An interactive GA for software re-modularization. In Proc. the 4th International Symposium on Search Based Software Engineering, September 2012, pp.75-89.
  [11] Bavota G, de Lucia A, Marcus A, Oliveto R. Using structural and semantic measures to improve software modularization. Empirical Software Engineering, 2013, 18(5): 901-932.
  [12] Bavota G, Gethers M, Oliveto R, Poshyvanyk D, de Lucia A. Improving software modularization via automated analysis of latent topics and dependencies. ACM Transactions on Software Engineering and Methodology, 2014, 23(1): Article No. 4.
  [13] Mkaouer M W, Kessentini M, Shaout A, Koligheu P, Bechikh S, Deb K, Ouni A. Many-objective software remodularization using NSGA-III. ACM Transactions on Software Engineering and Methodology, 2015, 24(3): Article No. 17.
  [14] Abdeen H, Ducasse S, Sahraoui H, Alloui I. Automatic package coupling and cycle minimization. In Proc. the 16th Working Conference on Reverse Engineering, October 2009, pp.103-112.
  [15] Palomba F, Tufano M, Bavota G, Oliveto R, Marcus A, Poshyvanyk D, de Lucia A. Extract package refactoring in ARIES. In Proc. the 37th IEEE/ACM International Conference on Software Engineering, Volume 2, May 2015, pp.669-672.
  [16] Doval D, Mancoridis S, Mitchell B S. Automatic clustering of software systems using a genetic algorithm. In Proc. the 9th International Workshop on Software Technology and Engineering Practice, September 1999, pp.73-81.
  [17] Paixao M, Harman M, Zhang Y, Yu Y. An empirical study of cohesion and coupling: Balancing optimization and disruption. IEEE Transactions on Evolutionary Computation, 2018, 22(3): 394-414.
  [18] Ouni A, Kessentini M, Sahraoui H, Inoue K, Deb K. Multicriteria code refactoring using search-based software engineering: An industrial case study. ACM Transactions on Software Engineering and Methodology, 2016, 25(3): Article No. 23.
  [19] Maqbool O, Babri H. Hierarchical clustering for software architecture recovery. IEEE Transactions on Software Engineering, 2007, 33(11): 759-780.
  [20] Candela I, Bavota G, Russo B, Oliveto R. Using cohesion and coupling for software remodularization: Is it enough? ACM Transactions on Software Engineering and Methodology, 2016, 25(3): Article No. 24.
  [21] Corazza A, di Martino S, Maggio V, Scanniello G. Investigating the use of lexical information for software system clustering. In Proc. the 15th European Conference on Software Maintenance and Reengineering, March 2011, pp.35-44.
  [22] Hall M, Khojaye M A, Walkinshaw N, McMinn P. Establishing the source code disruption caused by automated remodularisation tools. In Proc. the IEEE International Conference on Software Maintenance and Evolution, September 2014, pp.466-470.
  [23] Abdeen H, Sahraoui H, Shata O, Anquetil N, Ducasse S. Towards automatically improving package structure while respecting original design decisions. In Proc. the 20th Working Conference on Reverse Engineering, October 2013, pp.212-221.
  [24] Ouni A, Kessentini M, Sahraoui H, Boukadoum M. Maintainability defects detection and correction: A multiobjective approach. Automated Software Engineering, 2013, 20(1): 47-79.
  [25] Deb K, Pratap A, Agarwal S, Meyarivan T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 2002, 6(2): 182-197.
  [26] Praditwong K, Harman M, Yao X. Software module clustering as a multi-objective search problem. IEEE Transactions on Software Engineering, 2011, 37(2): 264-282.
  [27] Vallée-Rai R, Gagnon E, Hendren L, Lam P, Pominville P, Sundaresan V. Optimizing Java bytecode using the soot framework: Is it feasible? In Proc. the 9th International Conference on Compiler Construction, March 2000, pp.18-34.
  [28] Farrugia A. Vertex-partitioning into fixed additive induced-hereditary properties is NP-hard. The Electronic Journal of Combinatorics, 2004, 11(1): R46.
  [29] Jiang J J, Conrath D W. Semantic similarity based on corpus statistics and lexical taxonomy. In Proc. the 10th International Conference Research on Computational Linguistics, March 1997, pp.19-33.
  [30] Brooks R. Towards a theory of the comprehension of computer programs. International Journal of Man-Machine Studies, 1983, 18(6): 543-554.
  [31] Merlo E, McAdam I, de Mori R. Feed-forward and recurrent neural networks for source code informal information analysis. Journal of Software Maintenance: Research and Practice, 2003, 15(4): 205-244.
  [32] Caprile C, Tonella P. Nomen est omen: Analyzing the language of function identifiers. In Proc. the 6th Working Conference on Reverse Engineering, October 1999, pp.112-122.
  [33] Lawrie D, Morrell C, Feild H, Binkley D. What’s in a name? A study of identifiers. In Proc. the 14th IEEE International Conference on Program Comprehension, June 2006, pp.3-12.
  [34] Poshyvanyk D, Marcus A. The conceptual coupling metrics for object-oriented systems. In Proc. the 22nd IEEE International Conference on Software Maintenance, September 2006, pp.469-478.
  [35] Gethers M, Poshyvanyk D. Using relational topic models to capture coupling among classes in object-oriented software systems. In Proc. the 26th IEEE International Conference on Software Maintenance, September 2010, pp.1-10.
  [36] Arnaoudova V, Eshkevari L M, di Penta M, Oliveto R, Antoniol G, Guéhéneuc Y G. REPENT: Analyzing the nature of identifier renamings. IEEE Transactions on Software Engineering, 2014, 40(5): 502-532.
  [37] Arnaoudova V, di Penta M, Antoniol G. Linguistic antipatterns: What they are and how developers perceive them. Empirical Software Engineering, 2016, 21(1): 104-158.
  [38] Seco N, Veale T, Hayes J. An intrinsic information content metric for semantic similarity in WordNet. In Proc. the 16th European Conference on Artificial Intelligence, August 2004, pp.1089-1090.
  [39] Budanitsky A, Hirst G. Semantic distance in WordNet: An experimental, application-oriented evaluation of five measures. In Proc. Workshop on WordNet and Other Lexical Resources, Second Meeting of the North American Chapter of the Association for Computational Linguistics, Volume 2, June 2001, pp.24-29.
  [40] Lin D. An information-theoretic definition of similarity. In Proc. the 15th International Conference on Machine Learning, July 1998, pp.296-304.
  [41] Resnik P. Using information content to evaluate semantic similarity in a taxonomy. In Proc. the 14th International Joint Conference on Artificial Intelligence, August 1995, pp.448-453.
  [42] Deb K, Jain H. An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, part I: Solving problems with box constraints. IEEE Transactions on Evolutionary Computation, 2014, 18(4): 577-601.
  [43] Wen Z, Tzerpos V. An effectiveness measure for software clustering algorithms. In Proc. the 12th IEEE International Workshop on Program Comprehension, July 2004, pp.194-203.
  [44] Kuhn A, Ducasse S, Gîrba T. Semantic clustering: Identifying topics in source code. Information & Software Technology, 2007, 49(3): 230-243.
  [45] Sahraoui H A, Godin R, Miceli T. Can metrics help to bridge the gap between the improvement of OO design quality and its automation? In Proc. the 8th International Conference on Software Maintenance, October 2000, pp.154-162.
  [46] Kessentini M, Mahaouachi R, Ghedira K. What you like in design use to correct bad-smells. Software Quality Journal, 2013, 21(4): 551-571.
  [47] Bavota G, Oliveto R, Gethers M, Poshyvanyk D, de Lucia A. Methodbook: Recommending move method refactorings via relational topic models. IEEE Transactions on Software Engineering, 2014, 40(7): 671-694.
  [48] Tsantalis N, Chatzigeorgiou A. Identification of move method refactoring opportunities. IEEE Transactions on Software Engineering, 2009, 35(3): 347-367.
  [49] Oliveto R, Gethers M, Bavota G, Poshyvanyk D, de Lucia A. Identifying method friendships to remove the feature envy bad smell: NIER track. In Proc. the 33rd International Conference on Software Engineering, May 2011, pp.820-823.
  [50] Lee J, Lee D, Kim D K, Park S. A semantic-based approach for detecting and decomposing god classes. arXiv: 1204.1967, 2012. https://arxiv.org/pdf/1204.1967.pdf, Sept. 2018.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Complex Outstanding Systems Modeling Optimization and Supervision Laboratory, National School of Computer Science, University of Manouba, Manouba, Tunisia
