Abstract
This paper studies the complex network structure of software design networks. In a software design network, each node is a class (a specific part of a piece of software) and each link represents a software code-related dependency between two classes. This work provides two main contributions. First, we reveal how typical software networks exhibit a structure very similar to other real-world networks: they are sparse, scale-free and have low average node-to-node-distances. In addition, we demonstrate how various distance, network clustering and assortativity metrics can provide important insights for software engineers related to software design decisions, coupling and inter-package relationships. Second, we propose a novel network-driven method to automatically determine the role of a software class, a frequently encountered problem by software engineers trying to understand a large-scale software system. We use a role taxonomy from literature which defines six so-called archetypes of software classes, which, once assigned to a class, can provide useful insights for engineers. In this paper we train and validate a model that is able to automatically assess which of these archetypes a class belongs to. Experiments on three unique high quality network datasets of large real-world software systems demonstrate how network features are able to realize high accuracy models for this task.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsNotes
- 1.
Version 5.304, obtained from https://github.com/k9mail/k-9/releases.
- 2.
Version 5.6, obtained from https://sourceforge.net/projects/sweethome3d.
- 3.
Version 3.1.0, obtained from https://github.com/mars-sim/mars-sim.
- 4.
The constructed feature sets can be found at https://github.com/Xavyr-R/thesis-labelled-files/tree/master/inputfeatures.
References
Barabási, A.L.: Network Science. Cambridge University Press, Cambridge (2016)
Brandes, U.: A faster algorithm for betweenness centrality. J. Math. Sociol. 25(2), 163–177 (2001)
Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: Smote: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16(1), 321–357 (2002)
Chong, C.Y., Lee, S.P.: Analyzing maintainability and reliability of object-oriented software using weighted complex network. J. Syst. Softw. 110, 28–53 (2015)
Concas, G., Marchesi, M., Murgia, A., Tonelli, R.: An empirical study of social networks metrics in object-oriented software. Adv. Softw. Eng. 2010, 4 (2010)
Dragan, N., Collard, M.L., Maletic, J.I.: Automatic identification of class stereotypes. In: Proceedings of the IEEE International Conference on Software Maintenance, pp. 1–10 (2010)
Fowler, M.: UML Distilled: A Brief Guide to the Standard Object Modeling Language. Addison-Wesley Professional (2004)
Freeman, L.C.: Centrality in social networks conceptual clarification. Soc. Netw. 1(3), 215–239 (1978)
Genero, M., Piattini, M., Calero, C.: A survey of metrics for UML class diagrams. J. Object Technol. 4(9), 59–92 (2005)
Hagberg, A., Swart, P., S Chult, D.: Exploring network structure, dynamics, and function using network. Technical Report, Los Alamos National Lab. (2008)
Larman, C.: Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development (3rd Edition), 3rd edn. Prentice Hall PTR, Upper Saddle River (2004)
Menze, B.H., et al.: A comparison of random forest and its Gini importancefor the feature selection and classification of spectral data. BMC Bioinform. 10(1), 213 (2009)
Myers, C.R.: Software systems as complex networks: structure, function, and evolvability of software collaboration graphs. Phys. Rev. E 68(4), 046,116 (2003)
Pawlak, R., Monperrus, M., Petitprez, N., Noguera, C., Seinturier, L.: Spoon: a library for implementing analyses and transformations of java source code. Softw.: Pract. Exp. 46(9), 1155–1179 (2015)
Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Rademaker, X.T.: A network-driven feature construction approach for labelling software classes using machine learning. Technical Report, MSc thesis, Leiden University (2018)
Rifkin, R., Klautau, A.: In defense of one-vs-all classification. J. Mach. Learn. Res. 5, 101–141 (2004)
Tekin, U., Buzluca, F.: A graph mining approach for detecting identical design structures in object-oriented design models. Sci. Comput. Program. 95(P4), 406–425 (2014)
Thung, F., Lo, D., Osman, M.H., Chaudron, M.R.V.: Condensing class diagrams by analyzing design and network metrics using optimistic classification. In: Proceedings of the 22nd International Conference on Program Comprehension (2014)
Tsantalis, N., Chatzigeorgiou, A., Stephanides, G., Halkidis, S.T.: Design pattern detection using similarity scoring. IEEE Trans. Softw. Eng. 32(11), 896–909 (2006)
Wang, J., Ai, J., Yang, Y., Su, W.: Identifying key classes of object-oriented software based on software complex network. In: Proceedings of the 2nd IEEE International Conference on System Reliability and Safety, pp. 444–449 (2017)
Wirfs-Brock, R.J.: Characterizing classes. IEEE Softw. 23(2), 9–11 (2006)
Wirfs-Brock, R.J., Johnson, R.E.: Surveying current research in object-oriented design. Commun. ACM 33(9), 104–124 (1990)
Witten, I.H., Frank, E., Hall, M.A., Pal, C.J.: Data Mining: Practical Machine Learning Tools and Techniques, 4th edn. Morgan Kaufmann (2016)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Rademaker, X.T., Chaudron, M.R.V., Takes, F.W. (2019). Automatic Identification of Component Roles in Software Design Networks. In: Aiello, L., Cherifi, C., Cherifi, H., Lambiotte, R., Lió, P., Rocha, L. (eds) Complex Networks and Their Applications VII. COMPLEX NETWORKS 2018. Studies in Computational Intelligence, vol 813. Springer, Cham. https://doi.org/10.1007/978-3-030-05414-4_12
Download citation
DOI: https://doi.org/10.1007/978-3-030-05414-4_12
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05413-7
Online ISBN: 978-3-030-05414-4
eBook Packages: Intelligent Technologies and RoboticsIntelligent Technologies and Robotics (R0)