Software Defect Prediction Using Augmented Bayesian Networks

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 614)

Abstract

Defect prediction models are built with various machine learning algorithms to identify defects before release, which helps focus software testing and reduce its cost. The Naïve Bayes classifier is among the best-performing classification techniques in defect prediction. It assumes that the features are conditionally independent given the class, yet in defect prediction some features are not. This raises the question of whether relaxing the conditional independence assumption improves classifier performance. We build Bayesian Network structures using three classes of structure-learning algorithms: score-based, constraint-based, and hybrid. We propose an approach to augment these Bayesian Network structures with a class node. The resulting Bayesian Network classifiers, along with Random Forests, Logistic Regression, and Naïve Bayes classifiers, are then evaluated using measures such as AUC and H-measure. We observe that the RSMAX2 and Grow-Shrink classifiers (after augmentation) perform consistently better in defect prediction.
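
The abstract does not spell out the augmentation procedure, so the following is only a minimal sketch of one common convention for augmented Bayesian classifiers: the class node is added as a parent of every feature node, on top of whatever feature-to-feature edges the structure-learning step produced. The feature names (`loc`, `complexity`, `churn`), the class name `defective`, and both helper functions are hypothetical and are not taken from the paper.

```python
from typing import Dict, Set

# A DAG is stored as a map from each node to the set of its parents.
ParentMap = Dict[str, Set[str]]


def augment_with_class(parents: ParentMap, class_node: str) -> ParentMap:
    """Return a copy of the structure in which class_node is a parent of every feature.

    Assumption: this mirrors the usual "augmented naive Bayes" layout; the
    paper's own procedure may differ.
    """
    if class_node in parents:
        raise ValueError(f"{class_node!r} is already a node in the structure")
    augmented = {node: set(ps) | {class_node} for node, ps in parents.items()}
    augmented[class_node] = set()  # the class node itself has no parents
    return augmented


def is_acyclic(parents: ParentMap) -> bool:
    """Check that the parent map describes a DAG (Kahn-style peeling of root nodes)."""
    remaining = {node: set(ps) for node, ps in parents.items()}
    ready = [node for node, ps in remaining.items() if not ps]
    visited = 0
    while ready:
        node = ready.pop()
        visited += 1
        for child, ps in remaining.items():
            if node in ps:
                ps.remove(node)
                if not ps:
                    ready.append(child)
    return visited == len(remaining)


# Hypothetical learned structure over three code metrics: loc -> complexity.
learned: ParentMap = {"loc": set(), "complexity": {"loc"}, "churn": set()}
augmented = augment_with_class(learned, "defective")
```

Because the class node only gains outgoing edges, augmenting an acyclic structure in this way cannot introduce a cycle, and the original feature-to-feature dependencies are preserved.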



Author information

Corresponding author

Correspondence to Lalita Bhanu Murthy Neti.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Muthukumaran, K., Srinivas, S., Malapati, A., Neti, L.B.M. (2018). Software Defect Prediction Using Augmented Bayesian Networks. In: Abraham, A., Cherukuri, A., Madureira, A., Muda, A. (eds) Proceedings of the Eighth International Conference on Soft Computing and Pattern Recognition (SoCPaR 2016). SoCPaR 2016. Advances in Intelligent Systems and Computing, vol 614. Springer, Cham. https://doi.org/10.1007/978-3-319-60618-7_28

  • DOI: https://doi.org/10.1007/978-3-319-60618-7_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-60617-0

  • Online ISBN: 978-3-319-60618-7

  • eBook Packages: Engineering, Engineering (R0)
