Abstract
The availability of automated refactoring tools in modern development environments allows programmers to refactor their code with ease. Such tools, however, enable developers to inadvertently create code clones that quickly diverge in form but not in meaning. Furthermore, in the hands of those looking to confuse plagiarism-detection tools, automated refactoring may be abused to avoid discovery of copied code.
We present Cider, an algorithm that can detect code clones regardless of various refactorings that may have been applied to some of the copies but not to others. Most significant is the ability to discover interprocedural clones, where parts of one copy have been extracted to separate methods. We evaluated Cider on several open-source Java projects, attempting to detect interprocedural clones between successive versions of each project. Interprocedural clones were detected in all evaluated projects, demonstrating the pervasive nature of the problem. Compared to a manual assessment, Cider performed well in terms of both recall and precision.
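To make the notion of an interprocedural clone concrete, the following is a minimal sketch of two copies of the same computation, one of which has had part of its body extracted into a separate method. The class and method names (`RefactoredCloneExample`, `totalOriginal`, `totalRefactored`, `sum`) are hypothetical and chosen for illustration only; the code does not come from the paper and does not show how Cider itself works.

```java
// Hypothetical illustration of an interprocedural clone.
// Not taken from the paper and not part of Cider.
final class RefactoredCloneExample {

    // Copy A: the whole computation sits in a single method body.
    static int totalOriginal(int[] prices, int taxPercent) {
        int sum = 0;
        for (int p : prices) {
            sum += p;
        }
        return sum + sum * taxPercent / 100;
    }

    // Copy B: the same computation after an Extract Method refactoring
    // moved the summation loop into the helper sum().
    static int totalRefactored(int[] prices, int taxPercent) {
        int subtotal = sum(prices);
        return subtotal + subtotal * taxPercent / 100;
    }

    // Helper produced by the extraction; it holds the loop from copy A.
    static int sum(int[] values) {
        int result = 0;
        for (int v : values) {
            result += v;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] prices = {100, 200};
        // Both copies compute the same value for the same input.
        System.out.println(totalOriginal(prices, 10));
        System.out.println(totalRefactored(prices, 10));
    }
}
```

The two `total` methods differ in form (one loop inline, one call to a helper, different variable names) but not in meaning, so a detector that compares method bodies in isolation would likely miss the match.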
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shomrat, M., Feldman, Y.A. (2013). Detecting Refactored Clones. In: Castagna, G. (eds) ECOOP 2013 – Object-Oriented Programming. ECOOP 2013. Lecture Notes in Computer Science, vol 7920. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39038-8_21
DOI: https://doi.org/10.1007/978-3-642-39038-8_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39037-1
Online ISBN: 978-3-642-39038-8
eBook Packages: Computer Science (R0)