Abstract
This paper presents an exploratory study that applies three data analysis techniques (statistical analysis, data clustering, and visualization) to the ISBSG R12 data set. Both SPSS and RapidMiner (RM) are used to conduct the analysis. While the main advantage of statistical analysis is the summarization of data, it loses the overall behavior of the data, particularly the view of outlier values. The study applied two statistical techniques using SPSS: correlation analysis and the general linear model with multiple variables. The statistical analysis showed highly significant relationships between and among the selected variables. In the data mining area, clustering and visualization were carried out with both SPSS and RM. For the selected variables, the number of clusters was determined after several runs, in an attempt to split the one large cluster into several sub-clusters. Finally, the visualization technique demonstrates how charts can show concentrations and trends. Statistical analysis found a high correlation between speed of delivery and manpower delivery rate, and between the independent factors of industry type and development methodology and the dependent variable of defect density. The clustering process highlighted the importance of variables related to work effort and defects in forming the clusters. The visualization charts revealed an inverse nonlinear relationship between analysis and design effort as a share of total effort and speed of delivery on one side, and total defects delivered on the other. Overall, multiple views of data analytics are needed to arrive at a clear and consistent understanding of the underlying behavior of the data in a complex data set such as ISBSG.
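The two numeric steps named in the abstract, correlation analysis and clustering with the number of clusters chosen over several runs, can be illustrated with a minimal Python sketch. This is not the SPSS or RapidMiner procedure used in the study: the data are synthetic stand-ins (the ISBSG R12 records are not reproduced here), the variable names `speed` and `rate` are hypothetical, and plain k-means is an assumed choice of clustering algorithm.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def assign(points, centers):
    """Group each point with its nearest center (Euclidean distance)."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, iters=100, seed=0):
    """Plain k-means; returns centers, clusters, and within-cluster SSE."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct points as initial centers
    for _ in range(iters):
        clusters = assign(points, centers)
        updated = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)  # keep old center if cluster is empty
        ]
        if updated == centers:  # converged
            break
        centers = updated
    clusters = assign(points, centers)
    sse = sum(
        math.dist(p, centers[i]) ** 2 for i, cl in enumerate(clusters) for p in cl
    )
    return centers, clusters, sse


# Synthetic stand-in data (assumption): 200 hypothetical projects in which
# the second variable depends linearly on the first, plus Gaussian noise.
rng = random.Random(42)
speed = [rng.uniform(1.0, 20.0) for _ in range(200)]
rate = [2.0 * s + rng.gauss(0.0, 1.0) for s in speed]
print(f"Pearson r = {pearson(speed, rate):.3f}")

# Several runs with different k, comparing cluster sizes and SSE by eye.
points = list(zip(speed, rate))
for k in range(2, 6):
    _, clusters, sse = kmeans(points, k)
    print(k, sorted(len(c) for c in clusters), round(sse, 1))
```

Comparing cluster sizes and the within-cluster sum of squares across values of `k` is one simple way to judge whether a dominant cluster has been split into meaningful sub-clusters; the criterion actually applied in the study is not stated in the abstract.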
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Alkhatib, G., Al-Sarayrah, K., Abram, A. (2020). Exploring ISBSG R12 Dataset Using Multi-data Analytics. In: Bouhlel, M., Rovetta, S. (eds) Proceedings of the 8th International Conference on Sciences of Electronics, Technologies of Information and Telecommunications (SETIT’18), Vol.1. SETIT 2018. Smart Innovation, Systems and Technologies, vol 146. Springer, Cham. https://doi.org/10.1007/978-3-030-21005-2_13
DOI: https://doi.org/10.1007/978-3-030-21005-2_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21004-5
Online ISBN: 978-3-030-21005-2
eBook Packages: Intelligent Technologies and Robotics (R0)