Abstract
This paper presents a way to use precision and recall measures in the total absence of ground truth. We develop a probabilistic interpretation of both measures and show that, provided a sufficient number of data sources is available, it offers a viable performance measure for comparing methods when no ground truth is available. We also show the limitations of the approach when a systematic bias is present in all compared methods, and show that it nevertheless maintains a very high level of overall coherence and stability. The approach opens broader perspectives: it can be extended to handle partial or unreliable ground truth, as well as levels of prior confidence in the methods it aims to compare.
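The abstract does not spell out the estimator, so the following is only an illustrative sketch of the general idea and not the authors' method. It assumes binary outputs per item and uses a leave-one-out majority vote over the other methods as a stand-in for the missing ground truth; the function name `consensus_precision_recall` and the voting rule are hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple


def consensus_precision_recall(
    outputs: Dict[str, List[int]],
) -> Dict[str, Tuple[float, float]]:
    """Score each method against a proxy truth built from the other methods.

    `outputs` maps a method name to its binary decisions (0 or 1), one per item.
    ASSUMPTION: the proxy truth for an item is the strict majority vote of all
    *other* methods. This is a simplification chosen for illustration.
    """
    names = list(outputs)
    n_items = len(outputs[names[0]])
    scores: Dict[str, Tuple[float, float]] = {}

    for name in names:
        others = [outputs[o] for o in names if o != name]
        # Proxy truth: 1 if strictly more than half of the other methods say 1.
        proxy = [
            1 if 2 * sum(o[i] for o in others) > len(others) else 0
            for i in range(n_items)
        ]
        pred = outputs[name]
        tp = sum(1 for p, t in zip(pred, proxy) if p == 1 and t == 1)
        fp = sum(1 for p, t in zip(pred, proxy) if p == 1 and t == 0)
        fn = sum(1 for p, t in zip(pred, proxy) if p == 0 and t == 1)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores[name] = (precision, recall)

    return scores


if __name__ == "__main__":
    # Hypothetical outputs of four methods on four items.
    demo = {
        "a": [1, 1, 0, 0],
        "b": [1, 1, 0, 0],
        "c": [1, 1, 0, 0],
        "d": [1, 0, 1, 0],
    }
    for method, (p, r) in consensus_precision_recall(demo).items():
        print(f"{method}: precision={p:.2f} recall={r:.2f}")
```

Under this simplified scheme, methods that all share the same error agree with one another and therefore all score perfectly, which is one way to picture the systematic-bias limitation the abstract mentions.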
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lamiroy, B., Sun, T. (2013). Computing Precision and Recall with Missing or Uncertain Ground Truth. In: Kwon, YB., Ogier, JM. (eds) Graphics Recognition. New Trends and Challenges. GREC 2011. Lecture Notes in Computer Science, vol 7423. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36824-0_15
DOI: https://doi.org/10.1007/978-3-642-36824-0_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36823-3
Online ISBN: 978-3-642-36824-0
eBook Packages: Computer Science (R0)