Abstract
Logistic regression (LR) and naïve Bayes (NB), both widely used to predict fault-proneness, assume linear additivity and independence of the predictors, respectively, and these assumptions often do not hold in practice. We therefore propose a Bayesian network (BN) model that incorporates data mining techniques as an integrative approach. Compared with LR and NB, a BN provides a flexible modeling framework that avoids these assumptions. Using static metrics such as Chidamber and Kemerer’s (C-K) suite and complexity as predictors, we compared the performance of LR, NB and BN models for class-level fault-proneness prediction across five successive releases of Rhino, an open-source implementation of JavaScript developed using an agile process. Based on cross-validation and independent tests on successive versions, we conclude that the proposed BN can achieve better prediction than LR and NB for agile software, owing to its flexible modeling framework and its incorporation of multiple sophisticated learning algorithms.
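To make the independence assumption concrete, the following minimal sketch compares the posterior that naïve Bayes computes with the one read directly from joint counts when two metrics are fully dependent. The data, the metric names (high_coupling, high_complexity) and the function names are hypothetical and chosen only for illustration; they are not taken from Rhino or from the paper's models.

```python
from fractions import Fraction

# Hypothetical toy data, NOT taken from Rhino: each row is
# (high_coupling, high_complexity, faulty) for one class.
# The two metrics are deliberately identical, i.e. fully dependent.
ROWS = (
    [(1, 1, 1)] * 6 + [(0, 0, 1)] * 2 +   # 8 faulty classes
    [(1, 1, 0)] * 4 + [(0, 0, 0)] * 8     # 12 fault-free classes
)


def exact_posterior(rows, x1, x2):
    """P(faulty | x1, x2) estimated directly from the joint counts."""
    matching = [r for r in rows if r[0] == x1 and r[1] == x2]
    return Fraction(sum(r[2] for r in matching), len(matching))


def naive_bayes_posterior(rows, x1, x2):
    """P(faulty | x1, x2) under the naive Bayes independence assumption."""
    scores = {}
    for label in (0, 1):
        in_class = [r for r in rows if r[2] == label]
        prior = Fraction(len(in_class), len(rows))
        # Each metric is treated as independent given the class label.
        p_x1 = Fraction(sum(r[0] == x1 for r in in_class), len(in_class))
        p_x2 = Fraction(sum(r[1] == x2 for r in in_class), len(in_class))
        scores[label] = prior * p_x1 * p_x2
    return scores[1] / (scores[0] + scores[1])


if __name__ == "__main__":
    print("from joint counts:", exact_posterior(ROWS, 1, 1))
    print("naive Bayes      :", naive_bayes_posterior(ROWS, 1, 1))
```

In this toy setting the second metric carries no new information, yet naïve Bayes counts it as fresh evidence and reports 27/35 instead of the 3/5 given by the joint counts. A model that is allowed to represent the dependence between the two metrics does not have to double count in this way.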
Acknowledgements
This research is partly supported by the Hong Kong CERG grant PolyU5225/08E, NSFC grant 1171344/D010703, and MOST grants 2012CB955503 and 2011AA120305–1.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Li, L., Leung, H. (2014). Bayesian Prediction of Fault-Proneness of Agile-Developed Object-Oriented System. In: Hammoudi, S., Cordeiro, J., Maciaszek, L., Filipe, J. (eds) Enterprise Information Systems. ICEIS 2013. Lecture Notes in Business Information Processing, vol 190. Springer, Cham. https://doi.org/10.1007/978-3-319-09492-2_13
DOI: https://doi.org/10.1007/978-3-319-09492-2_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09491-5
Online ISBN: 978-3-319-09492-2
eBook Packages: Computer Science, Computer Science (R0)