Abstract
Annotating programs with natural language comments is a common programming practice that increases the readability of code. While researchers have addressed specific tasks such as comment quality analysis or classification, there is no integrated knowledge representation based on comments to aid program comprehension. We propose Comment-Mine, a semantic search architecture that extracts knowledge related to the design, implementation and evolution of software in the form of a knowledge graph. This supports program comprehension and various comment analysis tasks. We manually annotate concepts for 5600 comments extracted from 672 C/C++ files/projects crawled from code repositories like GitHub. Comment-Mine extracts 38,992 concepts, of which 79.8% are correct when validated against the manual annotations.
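To make the idea of a comment-derived knowledge graph concrete, the following is a minimal sketch and not the Comment-Mine implementation. It assumes comments can be pulled from C/C++ source with a regular expression, and it uses a hypothetical keyword table (`CONCEPT_KEYWORDS`) and a hypothetical `mentions` relation to turn each comment into subject-predicate-object triples. All names here are illustrative assumptions.

```python
import re

# Matches /* block */ and // line comments in C/C++ source.
# Simplification: comment-like text inside string literals is not skipped.
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)

# Hypothetical keyword-to-concept table, for illustration only.
CONCEPT_KEYWORDS = {
    "sort": "algorithm",
    "fix": "evolution",
    "bug": "evolution",
    "buffer": "data-structure",
}


def extract_comments(source: str) -> list[str]:
    """Return comment texts with delimiters stripped and whitespace collapsed."""
    comments = []
    for match in COMMENT_RE.finditer(source):
        text = match.group(0)
        # Drop the leading "//" or the surrounding "/*" and "*/".
        text = text[2:] if text.startswith("//") else text[2:-2]
        comments.append(" ".join(text.split()))
    return comments


def build_graph(source: str) -> set[tuple[str, str, str]]:
    """Build (comment node, 'mentions', concept) triples from the comments."""
    triples = set()
    for index, comment in enumerate(extract_comments(source)):
        node = f"comment:{index}"
        for word in re.findall(r"[a-z]+", comment.lower()):
            concept = CONCEPT_KEYWORDS.get(word)
            if concept is not None:
                triples.add((node, "mentions", concept))
    return triples


if __name__ == "__main__":
    sample = "int x; // fix overflow bug\n/* sort the\n buffer */\nint y;"
    for triple in sorted(build_graph(sample)):
        print(triple)
```

A set of triples is used because it maps directly onto a graph store and can be queried by pattern matching; a real pipeline would replace the keyword lookup with proper natural language analysis.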
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Majumdar, S., Papdeja, S., Das, P.P., Ghosh, S.K. (2020). Comment-Mine—A Semantic Search Approach to Program Comprehension from Code Comments. In: Chaki, R., Cortesi, A., Saeed, K., Chaki, N. (eds) Advanced Computing and Systems for Security. Advances in Intelligent Systems and Computing, vol 1136. Springer, Singapore. https://doi.org/10.1007/978-981-15-2930-6_3
DOI: https://doi.org/10.1007/978-981-15-2930-6_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-2929-0
Online ISBN: 978-981-15-2930-6
eBook Packages: Intelligent Technologies and Robotics (R0)