Classifying Python Code Comments Based on Supervised Learning

  • Jingyi Zhang
  • Lei XuEmail author
  • Yanhui Li
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11242)


Code comments can provide a great data source for understanding programmer’s needs and underlying implementation. Previous work has illustrated that code comments enhance the reliability and maintainability of the code, and engineers use them to interpret their code as well as help other developers understand the code intention better. In this paper, we studied comments from 7 python open source projects and contrived a taxonomy through an iterative process. To clarify comments characteristics, we deploy an effective and automated approach using supervised learning algorithms to classify code comments according to their different intentions. With our study, we find that there does exist a pattern across different python projects: Summary covers about 75% of comments. Finally, we conduct an evaluation on the behaviors of two different supervised learning classifiers and find that Decision Tree classifier is more effective on accuracy and runtime than Naive Bayes classifier in our research.


Code comments classification Supervised learning Python 


  1. 1.
    Arafati, O., Riehle, D.: The comment density of open source software code. In: 2009 31st International Conference on Software Engineering - Companion Volume, pp. 195–198, May 2009.
  2. 2.
    Fjeldstad, R.K., Hamlen, W.T.: Application program maintenance study: report to our respondents. In: Proceedings GUIDE, vol. 48, April 1983Google Scholar
  3. 3.
    Fluri, B., Wursch, M., Gall, H.C.: Do code and comments co-evolve? On the relation between source code and comment changes. In: Proceedings of the 14th Working Conference on Reverse Engineering, WCRE 2007, pp. 70–79. IEEE Computer Society, Washington, DC (2007)Google Scholar
  4. 4.
    Lidwell, W., Holden, K., Butler, J.: Universal Principles of Design, Revised and Updated: 125 Ways to Enhance Usability, Influence Perception, Increase Appeal, Make Better Design Decisions. Rockport Publishers, Beverly (2010)Google Scholar
  5. 5.
    Nurvitadhi, E., Leung, W.W., Cook, C.: Do class comments aid Java program understanding? In: 33rd Annual Frontiers in Education, FIE 2003, vol. 1, pp. T3C-13–T3C-17, November 2003.
  6. 6.
    Howden, W.E.: Comments analysis and programming errors. IEEE Trans. Softw. Eng. 16(1), 72–81 (1990)CrossRefGoogle Scholar
  7. 7.
    Pascarella, L., Bacchelli, A.: Classifying code comments in Java open-source software systems. In: 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR), pp. 227–237, May 2017Google Scholar
  8. 8.
    Steidl, D., Hummel, B., Juergens, E.: Quality analysis of source code comments. In: 2013 21st International Conference on Program Comprehension (ICPC), pp. 83–92, May 2013.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. 1.School of Management and EngineeringNanjing UniversityNanjingChina
  2. 2.Department of Computer Science and TechnologyNanjing UniversityNanjingChina

Personalised recommendations