PAGER: Parameterless, Accurate, Generic, Efficient kNN-Based Regression

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6262)

Abstract

The problem of regression is to estimate the value of a dependent numeric variable based on the values of one or more independent variables. Regression algorithms are used for prediction (including forecasting of time-series data), inference, hypothesis testing, and modeling of causal relationships. Although this problem has been studied extensively, most existing approaches are not generic, in that they require the user to make an intelligent guess about the form of the regression equation. In this paper we present a new regression algorithm, PAGER (Parameterless, Accurate, Generic, Efficient kNN-based Regression). PAGER is also simple and outlier-resilient. These features make PAGER an attractive alternative to existing approaches. Our experimental study compares PAGER with 12 other algorithms on 4 standard real datasets and shows that PAGER is more accurate than its competitors.
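For readers unfamiliar with the family of methods PAGER belongs to, the following is a minimal sketch of plain k-nearest-neighbour regression. It is not the PAGER algorithm, whose details are not given in this abstract. The function name `knn_predict`, the Euclidean distance, the unweighted mean, and the user-supplied `k` are all assumptions made for illustration.

```python
import math


def knn_predict(train_X, train_y, query, k=3):
    """Plain kNN regression (illustrative only, not PAGER).

    Predicts the dependent variable for `query` as the unweighted mean
    of the targets of the k training points closest to it.
    """
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each training point's Euclidean distance to the query with its target.
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / k


# Toy data: one independent variable, target equal to that variable.
train_X = [(0.0,), (1.0,), (2.0,), (10.0,)]
train_y = [0.0, 1.0, 2.0, 10.0]
prediction = knn_predict(train_X, train_y, (1.0,), k=3)
```

Note that no form of the regression equation is assumed here, which is the sense in which neighbour-based methods are generic. This sketch still requires the user to choose `k`; the abstract describes PAGER as parameterless but does not say how that is achieved.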

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Singh, H., Desai, A., Pudi, V. (2010). PAGER: Parameterless, Accurate, Generic, Efficient kNN-Based Regression. In: Bringas, P.G., Hameurlain, A., Quirchmayr, G. (eds) Database and Expert Systems Applications. DEXA 2010. Lecture Notes in Computer Science, vol 6262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15251-1_13

  • DOI: https://doi.org/10.1007/978-3-642-15251-1_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15250-4

  • Online ISBN: 978-3-642-15251-1

  • eBook Packages: Computer Science, Computer Science (R0)
