Abstract
Statistical debugging aims to identify faulty predicates that have a strong effect on program failure. In this paper, predicates are fitted into a linear regression model so that the vertical effect of predicates on each other and on the program termination status is taken into account. Prior approaches have considered predicates only in isolation. The proposed approach is a two-step procedure that combines hierarchical clustering with the Lasso regression method. Hierarchical clustering builds a tree structure of correlated predicates, and the Lasso method is then applied to the clusters at specified levels of the tree. This makes the method scalable with respect to program size. Unlike other statistical methods, which do not provide any context for the failure, the predicates in the group reported by this method can be used as the bug signature. The method has been evaluated on two well-known test suites, Space and Siemens. The experimental results show the accuracy and precision of the approach compared with similar techniques.
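The two-step idea can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the distance 1 - |correlation|, the average-linkage rule, the cut threshold, the use of per-cluster averages as regression features, and the penalty value `lam` are all assumptions chosen for illustration. Predicates are first grouped by correlation, then a Lasso fit (plain coordinate descent) on one feature per cluster selects the group most associated with failure.

```python
def avg(values):
    values = list(values)
    return sum(values) / len(values)

def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    ma, mb = avg(a), avg(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0

def cluster_predicates(cols, threshold):
    """Average-linkage agglomerative clustering on distance 1 - |corr|.
    Merging stops once the closest pair of clusters exceeds `threshold`,
    which corresponds to cutting the tree at one level."""
    clusters = [[j] for j in range(len(cols))]
    def linkage(c1, c2):
        return avg(1.0 - abs(pearson(cols[i], cols[j])) for i in c1 for j in c2)
    while len(clusters) > 1:
        d, a, b = min((linkage(clusters[a], clusters[b]), a, b)
                      for a in range(len(clusters))
                      for b in range(a + 1, len(clusters)))
        if d > threshold:
            break
        merged = clusters.pop(b)      # b > a, so index a is unaffected
        clusters[a].extend(merged)
    return clusters

def soft_threshold(z, g):
    return z - g if z > g else z + g if z < -g else 0.0

def lasso(features, y, lam, sweeps=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1 on centered data."""
    n = len(y)
    y_mean = avg(y)
    resid = [v - y_mean for v in y]
    cols = []
    for c in features:
        m = avg(c)
        cols.append([v - m for v in c])
    beta = [0.0] * len(cols)
    for _ in range(sweeps):
        for j, c in enumerate(cols):
            norm = sum(v * v for v in c) / n
            if norm == 0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(v * (r + beta[j] * v) for v, r in zip(c, resid)) / n
            new = soft_threshold(rho, lam) / norm
            delta = new - beta[j]
            if delta:
                resid = [r - delta * v for r, v in zip(resid, c)]
                beta[j] = new
    return beta

# Hypothetical toy data: 64 runs, 6 Boolean predicates (1 = observed true).
runs = range(64)
predicates = [
    [i & 1 for i in runs],                            # p0: tied to the failure
    [(i & 1) ^ (i % 16 == 5) for i in runs],          # p1: near-duplicate of p0
    [(i >> 1) & 1 for i in runs],                     # p2: unrelated to failure
    [((i >> 1) & 1) ^ (i % 16 == 10) for i in runs],  # p3: near-duplicate of p2
    [(i >> 2) & 1 for i in runs],                     # p4: unrelated
    [(i >> 3) & 1 for i in runs],                     # p5: unrelated
]
failed = [(i & 1) ^ (i % 32 == 7) for i in runs]      # 1 = run failed

# Step 1: group correlated predicates.
clusters = cluster_predicates(predicates, threshold=0.5)

# Step 2: one averaged feature per cluster, then Lasso against the outcome.
features = [[avg(predicates[j][i] for j in members) for i in runs]
            for members in clusters]
beta = lasso(features, failed, lam=0.05)

# The cluster with the largest coefficient serves as the candidate bug signature.
best = max(range(len(clusters)), key=lambda k: abs(beta[k]))
signature = clusters[best]
print(clusters, beta, signature)
```

In this toy setup the selected group contains both the predicate tied to the failure and its correlated neighbour, which is the sense in which a cluster can provide context that a single ranked predicate does not.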
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Parsa, S., Asadi-Aghbolaghi, M., Vahidi-Asl, M. (2011). Statistical Debugging Using a Hierarchical Model of Correlated Predicates. In: Deng, H., Miao, D., Lei, J., Wang, F.L. (eds) Artificial Intelligence and Computational Intelligence. AICI 2011. Lecture Notes in Computer Science, vol 7002. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23881-9_32
DOI: https://doi.org/10.1007/978-3-642-23881-9_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23880-2
Online ISBN: 978-3-642-23881-9
eBook Packages: Computer Science, Computer Science (R0)