Classifying Python Code Comments Based on Supervised Learning

  • Conference paper
Web Information Systems and Applications (WISA 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11242)

Abstract

Code comments are a rich data source for understanding programmers' needs and the underlying implementation. Previous work has shown that code comments enhance the reliability and maintainability of code, and that engineers use them both to explain their own code and to help other developers understand its intent. In this paper, we study comments from 7 Python open source projects and derive a taxonomy through an iterative process. To characterize the comments, we apply an automated approach that uses supervised learning algorithms to classify code comments according to their intentions. Our study finds a consistent pattern across the Python projects: the Summary category covers about 75% of comments. Finally, we evaluate two supervised learning classifiers and find that, in our study, the Decision Tree classifier outperforms the Naive Bayes classifier in both accuracy and runtime.
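To make the idea of classifying comments by intention concrete, the following is a minimal sketch using only the Python standard library. It is not the authors' pipeline, which this page does not describe: the `Todo` label, the six training comments, and the names `extract_comments` and `TinyNaiveBayes` are all hypothetical, and only the `Summary` category name comes from the abstract. The sketch pulls `#` comments out of source text with `tokenize` and labels them with a small bag-of-words Naive Bayes model; the Decision Tree classifier that the paper compares against is not shown.

```python
import io
import math
import tokenize
from collections import Counter, defaultdict


def extract_comments(source):
    """Return the text of every '#' comment in a Python source string."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string.lstrip("#").strip()
            for tok in tokens if tok.type == tokenize.COMMENT]


class TinyNaiveBayes:
    """Multinomial Naive Bayes over bag-of-words with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        words = text.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n / total)
            for w in words:
                score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical hand-labeled training comments (illustrative only).
train = [
    ("return the parsed config", "Summary"),
    ("compute the checksum of the file", "Summary"),
    ("load the user record from the cache", "Summary"),
    ("todo handle unicode input", "Todo"),
    ("todo remove this workaround later", "Todo"),
    ("fixme todo retry on timeout", "Todo"),
]
model = TinyNaiveBayes().fit([t for t, _ in train], [l for _, l in train])

source = "x = 1  # todo remove this hack\n# compute the total of the list\ny = x + 1\n"
comments = extract_comments(source)
predictions = [model.predict(c) for c in comments]
for comment, label in zip(comments, predictions):
    print(f"{label}: {comment}")
```

With a labeled sample like this, the share of comments per category can be estimated by counting predicted labels over a whole project, which is the kind of measurement behind the reported proportion of Summary comments. A real study would need far more training data and a held-out evaluation set.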

Supported by the Natural Science Foundation of Jiangsu Province of China (Grant No. BK20140611), the Natural Science Foundation of China (Grant Nos. 61272080, 61403187).

Notes

  1. http://textblob.readthedocs.io/en/dev/.

Author information

Corresponding author

Correspondence to Lei Xu.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, J., Xu, L., Li, Y. (2018). Classifying Python Code Comments Based on Supervised Learning. In: Meng, X., Li, R., Wang, K., Niu, B., Wang, X., Zhao, G. (eds) Web Information Systems and Applications. WISA 2018. Lecture Notes in Computer Science, vol 11242. Springer, Cham. https://doi.org/10.1007/978-3-030-02934-0_4

  • DOI: https://doi.org/10.1007/978-3-030-02934-0_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-02933-3

  • Online ISBN: 978-3-030-02934-0

  • eBook Packages: Computer Science, Computer Science (R0)
