Forms of Plagiarism in Digital Mathematical Libraries

  • Conference paper
Intelligent Computer Mathematics (CICM 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11617)

Abstract

We report on an exploratory analysis of the forms of plagiarism observable in mathematical publications, which we identified by investigating editorial notes from zbMATH. While most cases we encountered were simple copies of earlier work, we also identified several forms of disguised plagiarism. We investigated 11 cases in detail and evaluated how well current plagiarism detection systems identify them. Moreover, we describe the steps required to discover these and other, so far undetected, cases in the future.

Notes

  1. https://www.crunchbase.com/organization/iparadigms-inc.

  2. https://www.turnitin.com.

  3. https://www.turnitin.com/products/ithenticate.

  4. https://www.crossref.org/services/similarity-check/.

  5. http://www.retractionwatch.com.

  6. http://www.vroniplag.wikia.com.

  7. http://dlmf.nist.gov/LaTeXML/.

  8. As of 2019-03-19; see https://zbmath.org/about/#id_4 for updated numbers.

  9. https://zbmath.org.

  10. We use zbMATH identifiers for referring to cases throughout the paper. The identifiers resolve via, e.g., https://zbmath.org/1191.35223 to documents accessible without subscription.

  11. See https://zbmath.org/general-help/ for the details of the search syntax.

  12. https://grobid.readthedocs.io.

  13. https://pypi.org/project/pdfminer/.

  14. For example: http://cs231n.stanford.edu/reports/2017/pdfs/815.pdf.

  15. A value above 0.2 is considered suspicious.

  16. https://www.hyplag.org/ (user: cicm@hyplag.org, password: cicm2019).

  17. https://www.emis.de/misc/articles/ext05526289.html.

  18. https://www.scirp.org/journal/PaperInformation.aspx?PaperID=3820.

  19. https://link.springer.com/article/10.1007%2FBF02463791.

  20. https://doi.org/10.1016/j.camwa.2011.01.043.

  21. https://math.berkeley.edu/~kwray/papers/string_theory.pdf.

  22. https://www.cs.rit.edu/~crohme2019/index.html.

Acknowledgements

This work was supported by the German Research Foundation (DFG grant GI-1259-1).

Author information

Corresponding author

Correspondence to Moritz Schubotz.

A List of Documents with Noticeable Content Reuse

1360.53021, 1357.30013, 1353.39029, 1353.30019, 1345.15011, 1359.62073, 1356.01026, 1337.16003, 1354.47018, 1340.90030, 1360.47003, 1345.92082, 1318.46035, 1400.34041, 1388.42037, 1388.42036, 1343.65150, 1330.35490, 1322.93076, 1321.81036, 1307.65177, 1308.81133, 1358.47017, 1309.65163, 1304.57008, 1325.47059, 1301.16002, 1293.65167, 1359.62055, 1291.30077, 1323.65125, 1328.47074, 1328.47073, 1299.65168, 1295.35151, 1294.35189, 1290.26023, 1282.91334, 1279.91096, 1281.35058, 1306.90186, 1311.90164, 1273.91086, 1287.81012, 1386.18011, 1301.45006, 1290.18001, 1278.68235, 1278.53033, 1271.54029, 1271.39024, 1266.65214, 1266.33002, 1264.34048, 1342.34118, 1266.30001, 1265.39016, 1264.81239, 1257.11089, 1250.78038, 1246.90035, 1246.90034, 1250.78039, 1252.68177, 1234.34034, 1364.47004, 1399.35153, 1274.76184, 1252.83109, 1288.49015, 1249.60023, 1250.47059, 1231.83033, 1227.34015, 1219.30004, 1236.58009, 1230.46033, 1213.60020, 1211.34093, 1211.34092, 1295.91090, 1242.49079, 1234.60021, 1262.11083, 1221.81113, 1234.60020, 1211.46021, 1217.34137, 1211.11127, 1203.06007, 1203.06006, 1212.49026, 1193.35074, 1191.35223, 1253.60034, 1235.37020, 1186.54007, 1189.35123, 1188.16002, 1183.37156, 1371.91006, 1371.91005, 1258.74210, 1257.78018, 1192.34093, 1195.55004, 1212.60016, 1201.60017, 1184.20030, 1176.91147, 1173.90327, 1206.34097, 1177.35217, 1170.34353, 1173.34354, 1279.90096, 1153.91544, 1189.35124, 1177.35218, 1175.86006, 1162.30319, 1170.42304, 1165.35336, 1162.30309, 1153.86318, 1154.94319, 1250.49003, 1166.47308, 1153.91523, 1155.26016, 1157.05036, 1162.83357, 1139.81335, 1213.35364, 1169.46304, 1169.42310, 1144.81475, 1141.90010, 1231.93121, 1132.14304, 1144.46044, 1134.60382, 1129.83326, 06921286.
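
The identifiers listed above can be turned into links using the URL pattern given in the notes (https://zbmath.org/ followed by the identifier). The following is a minimal sketch under stated assumptions, not part of the original paper: the function name `zbmath_urls` and the regular expression are hypothetical, and the assumption that the eight-digit identifier at the end of the list resolves through the same pattern is not confirmed by the source.

```python
import re
from typing import List

# Base URL taken from the paper's note on how identifiers resolve.
ZBMATH_BASE = "https://zbmath.org/"

# Assumed identifier shapes (not specified by the paper): numbers such as
# "1191.35223", or plain eight-digit numbers such as "06921286".
_ID_PATTERN = re.compile(r"\d{4}\.\d{5}|\d{8}")


def zbmath_urls(listing: str) -> List[str]:
    """Turn a comma-separated identifier listing into zbMATH URLs."""
    return [ZBMATH_BASE + ident for ident in _ID_PATTERN.findall(listing)]


if __name__ == "__main__":
    sample = "1360.53021, 1357.30013, 1191.35223, 06921286."
    for url in zbmath_urls(sample):
        print(url)
```

Extracting identifiers with a pattern rather than splitting on commas keeps the trailing period of the list from leaking into the last URL.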

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Schubotz, M., Teschke, O., Stange, V., Meuschke, N., Gipp, B. (2019). Forms of Plagiarism in Digital Mathematical Libraries. In: Kaliszyk, C., Brady, E., Kohlhase, A., Sacerdoti Coen, C. (eds) Intelligent Computer Mathematics. CICM 2019. Lecture Notes in Computer Science, vol 11617. Springer, Cham. https://doi.org/10.1007/978-3-030-23250-4_18

  • DOI: https://doi.org/10.1007/978-3-030-23250-4_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-23249-8

  • Online ISBN: 978-3-030-23250-4

  • eBook Packages: Computer Science, Computer Science (R0)
