Proposal for an Evaluation Framework for Compliance Checkers for Long-Term Digital Preservation

  • Nicola Ferro
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 701)


In this paper, we discuss the problem of how to model and evaluate tools that allow memory institutions to check the conformance of documents with respect to their reference standards in order to ensure their appropriateness for long-term preservation. In particular, we propose to model the conformance checking problem as a classification task and to evaluate it as a multi-classification problem using a Cranfield-like approach.
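As a rough illustration of what evaluating a conformance checker as a multi-class classifier against a ground truth could look like, the following minimal sketch computes per-class precision and recall and their macro averages. The class labels (`conformant`, `non_conformant`, `not_applicable`), the document identifiers, and the function names are hypothetical and chosen only for this example; they are not taken from the paper, which may define the classes and measures differently.

```python
from collections import Counter


def per_class_scores(ground_truth, predictions):
    """Return {label: (precision, recall)} for each class label.

    ground_truth and predictions map a document id to a class label.
    A document the checker gave no answer for counts as a miss for its
    true class and as a false positive for no class.
    """
    labels = sorted(set(ground_truth.values()) | set(predictions.values()))
    tp, fp, fn = Counter(), Counter(), Counter()
    for doc_id, true_label in ground_truth.items():
        predicted = predictions.get(doc_id)
        if predicted == true_label:
            tp[true_label] += 1
        else:
            fn[true_label] += 1
            if predicted is not None:
                fp[predicted] += 1
    scores = {}
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        scores[label] = (precision, recall)
    return scores


def macro_average(scores):
    """Unweighted mean of precision and of recall over all classes."""
    n = len(scores)
    macro_p = sum(p for p, _ in scores.values()) / n
    macro_r = sum(r for _, r in scores.values()) / n
    return macro_p, macro_r


# Hypothetical ground truth (assessor judgments) and checker output.
ground_truth = {
    "doc1": "conformant",
    "doc2": "non_conformant",
    "doc3": "conformant",
    "doc4": "not_applicable",
}
predictions = {
    "doc1": "conformant",
    "doc2": "conformant",
    "doc3": "conformant",
    "doc4": "not_applicable",
}

scores = per_class_scores(ground_truth, predictions)
macro_p, macro_r = macro_average(scores)
```

Macro averaging weights every class equally, which matters when non-conformant documents are rare in the test collection; a micro average would instead weight every document equally.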



The reported work has been partially supported by the PREFORMA project, as part of the Seventh Framework Programme of the European Commission, grant agreement no. 619568.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Information Engineering, University of Padua, Padua, Italy
