
Empirical Software Engineering, Volume 23, Issue 5, pp 2655–2694

Studying software logging using topic models

  • Heng Li
  • Tse-Hsun (Peter) Chen
  • Weiyi Shang
  • Ahmed E. Hassan

Abstract

Software developers insert logging statements in their source code to record important runtime information; such logged information is valuable for understanding system usage in production and for debugging system failures. However, providing proper logging statements remains a manual and challenging task. Missing an important logging statement may increase the difficulty of debugging a system failure, while too much logging can increase system overhead and mask the truly important information. Intuitively, the actual functionality of a software component is one of the major drivers behind logging decisions. For instance, a method maintaining network communications is more likely to be logged than getters and setters. In this paper, we used the automatically computed topics of a code snippet to approximate its functionality. We studied the relationship between the topics of a code snippet and the likelihood of that snippet being logged (i.e., containing a logging statement). Our driving intuition is that certain topics in the source code are more likely to be logged than others. To validate our intuition, we conducted a case study on six open source systems, and we found that i) there exists a small number of “log-intensive” topics that are more likely to be logged than other topics; ii) each pair of the studied systems shares 12% to 62% common topics, and the likelihood of logging such common topics has a statistically significant correlation of 0.35 to 0.62 among all the studied systems; and iii) our topic-based metrics help explain the likelihood of a code snippet being logged, providing an improvement of 3% to 13% on AUC and 6% to 16% on balanced accuracy over a set of baseline metrics that capture the structural information of a code snippet. Our findings highlight that topics contain valuable information that can help guide and drive developers’ logging decisions.

Keywords

  • Software logging
  • Topic model
  • Mining software repositories

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Heng Li (1)
  • Tse-Hsun (Peter) Chen (2)
  • Weiyi Shang (2)
  • Ahmed E. Hassan (1)

  1. Software Analysis and Intelligence Lab (SAIL), Queen’s University, Kingston, Canada
  2. Department of Computer Science and Software Engineering, Concordia University, Montreal, Canada
