Empirical Assessment of LR- and ANN-Based Fault Prediction Techniques

  • Bindu Goel
  • Yogesh Singh
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 27)


Because of our reliance on software systems, dynamic dependability assessment is needed to ensure that these systems perform as specified under various conditions. One approach is to dynamically assess modules through software fault prediction. Software fault prediction, together with inspection and testing, remains a prevalent method of assuring software quality. Quality models built from software metrics can predict whether a software module will be fault-prone. Applying these models helps focus quality improvement efforts on the modules that are likely to be faulty during operation, making cost-effective use of software testing and enhancement resources. In this paper, a statistical model, logistic regression (LR), and a machine learning approach, artificial neural networks (ANN), are investigated for predicting fault proneness. We evaluate the two predictor models on three main components: a single data sample, a common evaluation parameter, and cross-validation. The study shows that ANN techniques perform better than LR, but that LR, being a simpler technique, is also a good quality indicator.
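The evaluation protocol described above (one data sample, one common evaluation parameter, cross-validation) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the common evaluation parameter is the area under the ROC curve, which the abstract does not state, and the function names (`auc`, `stratified_folds`, `fit_logistic`, `cross_validated_auc`), the single "metric" feature, and the data values are all hypothetical.

```python
import math
import random

def auc(labels, scores):
    """Area under the ROC curve: the fraction of (faulty, fault-free) pairs
    in which the faulty module gets the higher score (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one module of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(y, k, seed=0):
    """Split indices into k folds, keeping both classes present in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        members = [i for i, yi in enumerate(y) if yi == label]
        rng.shuffle(members)
        for position, i in enumerate(members):
            folds[position % k].append(i)
    return folds

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Plain logistic regression trained by batch gradient descent.
    Returns a function that maps one feature vector to a fault probability."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return lambda x: sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def cross_validated_auc(fit, X, y, k=5, seed=0):
    """Mean AUC over k held-out folds for any `fit(X, y) -> scorer` function."""
    folds = stratified_folds(y, k, seed)
    fold_aucs = []
    for i, test in enumerate(folds):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        model = fit([X[j] for j in train], [y[j] for j in train])
        fold_aucs.append(auc([y[j] for j in test], [model(X[j]) for j in test]))
    return sum(fold_aucs) / k

# Hypothetical data: one scaled metric per module; one faulty module (0.42)
# overlaps the fault-free range so the classes are not perfectly separable.
X = [[i / 20] for i in range(10)] + [[0.42]] + [[i / 20] for i in range(11, 20)]
y = [0] * 10 + [1] * 10
lr_auc = cross_validated_auc(fit_logistic, X, y)
print(f"LR cross-validated AUC: {lr_auc:.2f}")
```

Because `cross_validated_auc` accepts any `fit` function with the same signature, a neural network trainer could be passed in place of `fit_logistic` and scored on the same folds with the same metric, which is what keeps the comparison between the two predictors like-for-like.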


Receiver Operating Characteristic Curve, Artificial Neural Network Model, Artificial Neural Network Technique, Bayesian Regularization, Conditional Number



Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. University School of Information Technology, Guru Gobind Singh Indraprastha University, Kashmere Gate, India
