Individualized Error Estimation for Classification and Regression Models

  • Conference paper
  • In: Challenges at the Interface of Data Analysis, Computer Science, and Optimization

Abstract

Estimating the error of classification and regression models is one of the most crucial tasks in machine learning. While the global error measures the overall quality of a model, local error estimates are even more interesting: on the one hand, they contribute to a better understanding of prediction models (where the model works well and where it does not); on the other hand, they provide powerful means to build successful ensembles that select the most appropriate model(s) for each region. In this paper we introduce an extremely localized error estimation, called individualized error estimation (IEE), which estimates the error of a prediction model \(M\) for each instance \(x\) individually. To solve this problem, we apply a meta model \(M^{*}\). We systematically investigate various combinations of elementary models \(M\) and meta models \(M^{*}\) on publicly available real-world data sets. Further, we illustrate the power of IEE in the context of time series classification: on 35 publicly available real-world time series data sets, we show that IEE can enhance state-of-the-art time series classification methods.
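To make the IEE scheme concrete, here is a minimal sketch in Python, assuming scikit-learn-style estimators. The concrete choices of \(M\) (a decision tree regressor), \(M^{*}\) (a k-NN regressor), and the data set are illustrative assumptions rather than the paper's prescribed setup: \(M\) is trained as usual, its absolute error is recorded for each held-out instance, and \(M^{*}\) is fit to predict that error for unseen instances.

    # Minimal IEE sketch (illustrative assumptions: the model and data
    # choices are not the paper's setup; the paper compares many (M, M*) pairs).
    import numpy as np
    from sklearn.datasets import load_diabetes
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.neighbors import KNeighborsRegressor

    X, y = load_diabetes(return_X_y=True)
    X_train, X_hold, y_train, y_hold = train_test_split(X, y, random_state=0)

    # 1. Train the elementary model M on the training split.
    M = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X_train, y_train)

    # 2. Observe M's error on each held-out instance individually.
    errors = np.abs(y_hold - M.predict(X_hold))

    # 3. Train the meta model M* to map an instance x to M's error on x.
    M_star = KNeighborsRegressor(n_neighbors=5).fit(X_hold, errors)

    # 4. For a new instance, M* yields an individualized error estimate.
    x_new = X_hold[:1]
    print("prediction of M:", M.predict(x_new)[0])
    print("estimated error of M on x:", M_star.predict(x_new)[0])

In an ensemble, such per-instance estimates can be compared across several elementary models so that, for each region of the instance space, the model with the lowest estimated error is selected.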

Notes

  1. Hubs are time series that appear most frequently as nearest neighbors of other time series. Denote the set of time series for which t is the nearest neighbor as \(N_t\). A hub t is a bad hub if its class label differs from the class labels of many time series in \(N_t\); see also Radovanovic et al. (2010) and the sketch below.
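For intuition, hub and bad-hub counts can be computed directly from a pairwise distance matrix. The sketch below is an illustrative reconstruction in the spirit of Radovanovic et al. (2010), not code from the paper; the dist matrix (e.g. DTW distances between all pairs of time series) and the labels array are assumed inputs.

    # Illustrative sketch: counting hub and bad-hub occurrences under 1-NN.
    import numpy as np

    def bad_hub_scores(dist, labels):
        n = len(labels)
        d = dist.astype(float).copy()
        np.fill_diagonal(d, np.inf)         # a series is not its own neighbor
        nn = d.argmin(axis=1)               # 1-nearest neighbor of each series
        hub = np.bincount(nn, minlength=n)  # |N_t|: how often t is a 1-NN
        bad = np.zeros(n, dtype=int)
        for i, t in enumerate(nn):
            if labels[i] != labels[t]:      # t misleads i: their labels disagree
                bad[t] += 1
        return hub, bad                     # t is a bad hub if bad[t] is large

A series with a large hub count and a large bad count is a bad hub: a frequent nearest neighbor whose label disagrees with many of the series it is closest to.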

References

  • Ding H, Trajcevski G, Scheuermann P, Wang X, Keogh E (2008) Querying and mining of time series data: Experimental comparison of representations and distance measures. Proc VLDB Endowment 1(2):1542–1552

  • Domeniconi C, Gunopulos D (2001) Adaptive nearest neighbor classification using support vector machines. Adv NIPS 14:665–672

  • Domeniconi C, Peng J, Gunopulos D (2002) Locally adaptive metric nearest-neighbor classification. IEEE Trans Pattern Anal Machine Intell 24(9):1281–1285

  • Duffy N, Helmbold D (2002) Boosting methods for regression. Mach Learn 47:153–200

  • Frank A, Asuncion A (2010) UCI machine learning repository. Tech. rep., University of California, School of Information and Computer Sciences, Irvine, URL http://archive.ics.uci.edu/ml

  • Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: An update. SIGKDD Explor 11(1):10–18

  • Hastie T, Tibshirani R (1996) Discriminant adaptive nearest neighbor classification. IEEE Trans Pattern Anal Mach Intell 18(6):607–616

  • Jain AK, Dubes RC, Chen CC (1987) Bootstrap techniques for error estimation. IEEE Trans Pattern Anal Mach Intell 5(9):606–633

  • Molinaro AM, Simon R, Pfeiffer RM (2005) Prediction error estimation: a comparison of resampling methods. Bioinformatics 21(15):3301–3307

  • Radovanovic M, Nanopoulos A, Ivanovic M (2010) Time-series classification in many intrinsic dimensions. In: Proc. 10th SIAM International Conference on Data Mining, SIAM, pp 677–688

  • Tsuda K, Rätsch G, Mika S, Müller KR (2001) Learning to predict the leave-one-out error of kernel based classifiers. In: Proc. ICANN 2001, LNCS 2130, pp 331–338

  • Xi X, Keogh E, Shelton C, Wei L, Ratanamahatana CA (2006) Fast time series classification using numerosity reduction. In: Proc. 23rd Int'l. Conf. on Machine Learning, ACM, pp 1033–1040

Author information

Correspondence to Krisztian Buza.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Buza, K., Nanopoulos, A., Schmidt-Thieme, L. (2012). Individualized Error Estimation for Classification and Regression Models. In: Gaul, W., Geyer-Schulz, A., Schmidt-Thieme, L., Kunze, J. (eds) Challenges at the Interface of Data Analysis, Computer Science, and Optimization. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24466-7_19
