Abstract
This paper addresses performance evaluation in the presence of imprecise ground truth. The most common assumption in benchmarking is that the reference data is flawless. In previous work, we have shown that this assumption cannot be taken for granted and that, for perceptual interpretation problems, it is almost certainly wrong in all but the most trivial cases.
We present a statistical test that measures the confidence one can have in the results of a benchmark that ranks multiple algorithms. More specifically, it expresses the probability that the ranking is not respected given a certain level of error in the ground-truth data.
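To make the quantity above concrete, the following is a minimal Monte Carlo sketch of one way such a probability could be estimated. It is not the test developed in the paper. It assumes binary labels, two algorithms compared by their number of matches with the ground truth, and independent label flips at a fixed rate; the function name `ranking_flip_probability` and all of its parameters are hypothetical and chosen for illustration only.

```python
import random


def ranking_flip_probability(truth, pred_a, pred_b, error_rate,
                             trials=2000, seed=0):
    """Estimate how often the ranking of A versus B is not respected
    when each binary ground-truth label is flipped independently with
    probability `error_rate` (an assumed, simplistic noise model)."""
    rng = random.Random(seed)

    def matches(pred, ref):
        # Number of samples on which the prediction agrees with the reference.
        return sum(p == r for p, r in zip(pred, ref))

    # Ranking observed on the ground truth as given.
    base = matches(pred_a, truth) - matches(pred_b, truth)
    if base == 0:
        raise ValueError("algorithms are tied on the given ground truth")

    not_respected = 0
    for _ in range(trials):
        # Draw one perturbed version of the ground truth.
        noisy = [1 - t if rng.random() < error_rate else t for t in truth]
        diff = matches(pred_a, noisy) - matches(pred_b, noisy)
        # A tie or a reversal both count as the ranking not being respected.
        if diff * base <= 0:
            not_respected += 1
    return not_respected / trials


truth = [1, 0, 1, 1, 0, 1, 0, 0]
pred_a = [1, 0, 1, 1, 0, 1, 0, 1]  # one disagreement with truth
pred_b = [0, 0, 1, 0, 0, 1, 1, 0]  # three disagreements with truth
estimate = ranking_flip_probability(truth, pred_a, pred_b, error_rate=0.1)
```

With `error_rate=0` the perturbed ground truth equals the original, so the estimate is 0. Counting ties as "not respected" is a design choice of this sketch, and a different convention would change the estimate.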
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Lamiroy, B., Pierrot, P. (2017). Statistical Performance Metrics for Use with Imprecise Ground-Truth. In: Lamiroy, B., Dueire Lins, R. (eds) Graphic Recognition. Current Trends and Challenges. GREC 2015. Lecture Notes in Computer Science, vol 9657. Springer, Cham. https://doi.org/10.1007/978-3-319-52159-6_3
DOI: https://doi.org/10.1007/978-3-319-52159-6_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-52158-9
Online ISBN: 978-3-319-52159-6
eBook Packages: Computer Science (R0)