Abstract
We report on an exploratory analysis of the forms of plagiarism observable in mathematical publications, which we identified by investigating editorial notes from zbMATH. While most cases we encountered were simple copies of earlier work, we also identified several forms of disguised plagiarism. We investigated 11 cases in detail and evaluated how well current plagiarism detection systems identify them. Moreover, we describe the steps required to discover these and similar, as yet undetected cases in the future.
Notes
- 8. As of 2019-03-19; see https://zbmath.org/about/#id_4 for updated numbers.
- 10. We use zbMATH identifiers for referring to cases throughout the paper. The identifiers resolve via, e.g., https://zbmath.org/1191.35223 to documents accessible without subscription.
- 11. See https://zbmath.org/general-help/ for the details of the search syntax.
- 14. For example: http://cs231n.stanford.edu/reports/2017/pdfs/815.pdf.
- 15. A value above 0.2 is considered suspicious.
- 16. https://www.hyplag.org/ (user: cicm@hyplag.org, password: cicm2019).
Acknowledgements
This work was supported by the German Research Foundation (DFG grant GI-1259-1).
A List of Documents with Noticeable Content Reuse
1360.53021, 1357.30013, 1353.39029, 1353.30019, 1345.15011, 1359.62073, 1356.01026, 1337.16003, 1354.47018, 1340.90030, 1360.47003, 1345.92082, 1318.46035, 1400.34041, 1388.42037, 1388.42036, 1343.65150, 1330.35490, 1322.93076, 1321.81036, 1307.65177, 1308.81133, 1358.47017, 1309.65163, 1304.57008, 1325.47059, 1301.16002, 1293.65167, 1359.62055, 1291.30077, 1323.65125, 1328.47074, 1328.47073, 1299.65168, 1295.35151, 1294.35189, 1290.26023, 1282.91334, 1279.91096, 1281.35058, 1306.90186, 1311.90164, 1273.91086, 1287.81012, 1386.18011, 1301.45006, 1290.18001, 1278.68235, 1278.53033, 1271.54029, 1271.39024, 1266.65214, 1266.33002, 1264.34048, 1342.34118, 1266.30001, 1265.39016, 1264.81239, 1257.11089, 1250.78038, 1246.90035, 1246.90034, 1250.78039, 1252.68177, 1234.34034, 1364.47004, 1399.35153, 1274.76184, 1252.83109, 1288.49015, 1249.60023, 1250.47059, 1231.83033, 1227.34015, 1219.30004, 1236.58009, 1230.46033, 1213.60020, 1211.34093, 1211.34092, 1295.91090, 1242.49079, 1234.60021, 1262.11083, 1221.81113, 1234.60020, 1211.46021, 1217.34137, 1211.11127, 1203.06007, 1203.06006, 1212.49026, 1193.35074, 1191.35223, 1253.60034, 1235.37020, 1186.54007, 1189.35123, 1188.16002, 1183.37156, 1371.91006, 1371.91005, 1258.74210, 1257.78018, 1192.34093, 1195.55004, 1212.60016, 1201.60017, 1184.20030, 1176.91147, 1173.90327, 1206.34097, 1177.35217, 1170.34353, 1173.34354, 1279.90096, 1153.91544, 1189.35124, 1177.35218, 1175.86006, 1162.30319, 1170.42304, 1165.35336, 1162.30309, 1153.86318, 1154.94319, 1250.49003, 1166.47308, 1153.91523, 1155.26016, 1157.05036, 1162.83357, 1139.81335, 1213.35364, 1169.46304, 1169.42310, 1144.81475, 1141.90010, 1231.93121, 1132.14304, 1144.46044, 1134.60382, 1129.83326, 06921286.
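The notes state that zbMATH identifiers resolve via URLs such as https://zbmath.org/1191.35223. The following is a minimal sketch of how the list above could be turned into such URLs. The function names and the Zbl number pattern (four digits, a dot, five digits) are assumptions made for illustration; the final entry of the list, 06921286, does not follow that pattern and is deliberately left unresolved because the source does not say how it resolves.

```python
import re

# Base URL as given in the note on zbMATH identifiers.
ZBMATH_BASE = "https://zbmath.org/"

# Assumption: a Zbl number is four digits, a dot, and five digits,
# as in every dotted entry of the list (e.g. "1191.35223").
ZBL_NUMBER = re.compile(r"\d{4}\.\d{5}")


def parse_identifiers(text: str) -> list[str]:
    """Split a comma-separated identifier list and drop the closing period."""
    tokens = (token.strip().rstrip(".") for token in text.split(","))
    return [token for token in tokens if token]


def is_zbl_number(identifier: str) -> bool:
    """Return True if the identifier matches the assumed Zbl number pattern."""
    return ZBL_NUMBER.fullmatch(identifier) is not None


def zbmath_url(identifier: str) -> str:
    """Build the zbMATH URL for a Zbl number (hypothetical helper)."""
    if not is_zbl_number(identifier):
        raise ValueError(f"not a Zbl number: {identifier!r}")
    return ZBMATH_BASE + identifier


if __name__ == "__main__":
    # A short excerpt of the list, including its differently formatted last entry.
    sample = "1360.53021, 1357.30013, 1191.35223, 06921286."
    for identifier in parse_identifiers(sample):
        if is_zbl_number(identifier):
            print(zbmath_url(identifier))
        else:
            print(f"{identifier}: not a Zbl number, left unresolved")
```

Keeping the pattern check separate from URL construction means entries in an unexpected format are reported rather than silently turned into links that may not resolve.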
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Schubotz, M., Teschke, O., Stange, V., Meuschke, N., Gipp, B. (2019). Forms of Plagiarism in Digital Mathematical Libraries. In: Kaliszyk, C., Brady, E., Kohlhase, A., Sacerdoti Coen, C. (eds) Intelligent Computer Mathematics. CICM 2019. Lecture Notes in Computer Science, vol 11617. Springer, Cham. https://doi.org/10.1007/978-3-030-23250-4_18
DOI: https://doi.org/10.1007/978-3-030-23250-4_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-23249-8
Online ISBN: 978-3-030-23250-4
eBook Packages: Computer Science, Computer Science (R0)