Abstract
The easy availability of information and code on the Internet has led to a sharp rise in plagiarism. Plagiarism can be broadly classified into two categories: text/image and source code. Because plagiarism detection tools are freely available, people are learning how to evade them with simple tricks. This paper reviews detection techniques and plagiarism detection tools, and briefly discusses the similarity measures currently available. It covers traditional techniques for detecting source code plagiarism, such as Measure of Software Similarity (MOSS) and JPlag, a popular plagiarism detection tool for Java source code, as well as alternative approaches based on parse trees, program dependency graphs, and machine learning, along with the advantages and disadvantages of each technique. It also includes a brief comparison of the similarity measures and evaluation techniques used by the various approaches.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
Cite this paper
Pandit, A.A., Toksha, G. (2020). Review of Plagiarism Detection Technique in Source Code. In: Singh Tomar, G., Chaudhari, N.S., Barbosa, J.L.V., Aghwariya, M.K. (eds) International Conference on Intelligent Computing and Smart Communication 2019. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-0633-8_38
DOI: https://doi.org/10.1007/978-981-15-0633-8_38
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-0632-1
Online ISBN: 978-981-15-0633-8
eBook Packages: Intelligent Technologies and Robotics (R0)