A New Monte Carlo-Based Error Rate Estimator
Estimating the error rate of a classifier is a key issue in machine learning. Such estimates are needed to compare classifiers or to tune the parameters of a parameterized classifier. Several methods have been proposed to estimate the error rate, most of which rely on partitioning the data set or drawing bootstrap samples from it. Error estimators can suffer from bias (deviation from the actual error rate), variance (sensitivity to the data set), or both. In this work, we propose an error rate estimator that fits a generative model and a posterior probability model to represent the underlying process that generates the data, and exploits these models in a Monte Carlo fashion to provide two biased estimators, whose best combination is determined by an iterative solution. We test our estimator against state-of-the-art estimators and show that it provides a reliable estimate in terms of mean square error.
Keywords: Error Rate, Mean Square Error, Gaussian Process, Bootstrap Sample, Posterior Probability Model
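The Monte Carlo idea in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator: the abstract does not specify the models, so the sketch assumes a simple generative model (one Gaussian per class plus class priors), draws synthetic labelled samples from it, and reports the fraction a given classifier mislabels. The function names `fit_gaussian_generative`, `sample_generative`, and `monte_carlo_error` are hypothetical, and the posterior-model estimator and the iterative combination step are not shown.

```python
import numpy as np


def fit_gaussian_generative(X, y):
    """Fit class priors and one Gaussian per class (an assumed, simple generative model)."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        mean = Xc.mean(axis=0)
        # A small ridge keeps the covariance matrix positive definite.
        cov = np.atleast_2d(np.cov(Xc, rowvar=False)) + 1e-6 * np.eye(X.shape[1])
        params[c] = (len(Xc) / len(X), mean, cov)
    return params


def sample_generative(params, n, rng):
    """Draw n labelled points: first a class from the priors, then a point from its Gaussian."""
    classes = list(params)
    priors = np.array([params[c][0] for c in classes])
    picks = rng.choice(len(classes), size=n, p=priors)
    dim = len(params[classes[0]][1])
    X = np.empty((n, dim))
    for i, c in enumerate(classes):
        mask = picks == i
        count = int(mask.sum())
        if count:
            _, mean, cov = params[c]
            X[mask] = rng.multivariate_normal(mean, cov, size=count)
    y = np.array(classes)[picks]
    return X, y


def monte_carlo_error(predict, params, n_samples=20000, seed=0):
    """Estimate the error rate of `predict` as its misclassification rate on model samples."""
    rng = np.random.default_rng(seed)
    X, y = sample_generative(params, n_samples, rng)
    return float(np.mean(predict(X) != y))
```

An estimate obtained this way is biased whenever the fitted generative model differs from the true data-generating process, which is consistent with the abstract's description of combining two biased estimators rather than relying on either one alone.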