Abstract
We present a methodology that enables existing classification inductive learning systems to be used on regression problems. We achieve this by transforming a regression problem into a classification problem: the range of continuous values of the goal variable is mapped into a set of intervals that are then used as discrete classes. We provide several methods for discretizing the goal variable values, all based on an iterative search for the final set of discrete classes. The search is guided by an N-fold cross-validation estimate of the prediction error that results from using a given set of classes. We have carried out an extensive empirical evaluation of our discretization methods using C4.5 and CN2 on four real-world domains. The results of these experiments show the quality of our discretization methods compared to other existing methods.
Our method is independent of the particular classification inductive system used and is easily applied to other inductive algorithms. This generality makes our method a powerful tool that extends the applicability of a wide range of existing classification systems.
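The core idea of the abstract — mapping the continuous goal variable into intervals that serve as discrete classes, then searching over candidate class sets — can be illustrated with a minimal sketch. This is an illustrative reconstruction under simplifying assumptions, not the authors' algorithm: the paper guides the search with an N-fold cross-validation estimate of prediction error, whereas the toy criterion below is simply the in-sample error of replacing each value by its class representative. The function names (`equal_frequency_bins`, `classify`, `discretization_error`) are hypothetical.

```python
# Illustrative sketch of regression-by-classification via target discretization.
# NOT the paper's exact method: the search criterion here is in-sample
# reconstruction error rather than an N-fold cross-validated prediction error.

def equal_frequency_bins(values, k):
    """Split the target values into k classes of roughly equal size.
    Returns a list of (low, high, representative) triples, where the
    representative is the mean of the values falling in that interval."""
    ordered = sorted(values)
    n = len(ordered)
    bins = []
    for i in range(k):
        chunk = ordered[i * n // k:(i + 1) * n // k]
        if chunk:
            bins.append((chunk[0], chunk[-1], sum(chunk) / len(chunk)))
    return bins

def classify(value, bins):
    """Assign a continuous target value to its discrete class (bin index)."""
    for idx, (low, high, _) in enumerate(bins):
        if value <= high:
            return idx
    return len(bins) - 1

def discretization_error(values, bins):
    """Mean absolute error from predicting each value by the representative
    of its class -- the price paid for treating regression as classification."""
    return sum(abs(v - bins[classify(v, bins)][2]) for v in values) / len(values)

# Iterate over candidate numbers of classes; a finer discretization
# lowers this error, while in the paper cross-validation keeps the
# search from producing classes too small for the learner to model.
y = [0.5, 1.2, 1.9, 2.4, 3.1, 3.8, 4.6, 5.0, 6.3, 7.1]
for k in (2, 3, 5):
    bins = equal_frequency_bins(y, k)
    print(k, round(discretization_error(y, bins), 3))
```

In the paper's setting, each interval becomes a class label for a system such as C4.5 or CN2, and a numeric prediction is recovered from the representative value of the predicted class.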
References
Breiman, L., Friedman, J.H., Olshen, R.A. and Stone, C.J. (1984): Classification and Regression Trees. Wadsworth Int. Group, Belmont, California, USA.
Clark, P. and Niblett, T. (1988): The CN2 induction algorithm. In Machine Learning, 3, 261–283.
Dillon, W. and Goldstein, M. (1984): Multivariate Analysis. John Wiley & Sons, Inc.
Fayyad, U.M., and Irani, K.B. (1993): Multi-interval Discretization of Continuous-valued Attributes for Classification Learning. In Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI-93). Morgan Kaufmann Publishers.
Friedman, J. (1991): Multivariate Adaptive Regression Splines. In Annals of Statistics, 19:1.
Ginsberg, M. (1993): Essentials of Artificial Intelligence. Morgan Kaufmann Publishers.
Holland, J. (1992): Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control and artificial intelligence. MIT Press.
John, G.H., Kohavi, R. and Pfleger, K. (1994): Irrelevant features and the subset selection problem. In Machine Learning: Proceedings of the 11th International Conference. Morgan Kaufmann.
Kohavi, R. (1995): Wrappers for performance enhancement and oblivious decision graphs. PhD Thesis.
Langley, P. and Sage, S. (1994): Induction of selective Bayesian classifiers. In Proceedings of the 10th Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers.
Lee, C. and Shin, D. (1994): A context-sensitive Discretization of Numeric Attributes for classification learning. In Proceedings of the 11th European Conference on Artificial Intelligence (ECAI-94), Cohn, A.G. (ed.). John Wiley & Sons.
Michie, D., Spiegelhalter, D.J. and Taylor, C.C. (1994): Machine Learning, Neural and Statistical Classification. Ellis Horwood Series in Artificial Intelligence.
Mladenic, D. (1995): Automated model selection. In Mlnet workshop on Knowledge Level Modelling and Machine Learning. Heraklion, Crete, Greece.
Pazzani, M.J. (1995): Searching for dependencies in Bayesian classifiers. In Proceedings of the 5th International Workshop on Artificial Intelligence and Statistics. Ft. Lauderdale, FL.
Quinlan, J. R. (1993): C4.5: programs for machine learning. Morgan Kaufmann Publishers.
Quinlan, J.R. (1992): Learning with Continuous Classes. In Proceedings of the 5th Australian Joint Conference on Artificial Intelligence. Singapore: World Scientific.
Torgo, L. (1995): Data Fitting with Rule-based Regression. In Proceedings of the 2nd international workshop on Artificial Intelligence Techniques (AIT95), Zizka,J. and Brazdil,P. (eds.). Brno, Czech Republic.
van Laarhoven, P. and Aarts, E. (1987): Simulated Annealing: Theory and Applications. Kluwer Academic Publishers.
Weiss, S. and Indurkhya, N. (1993): Rule-based Regression. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pp. 1072–1078.
Weiss, S. and Indurkhya, N. (1995): Rule-based Machine Learning Methods for Functional Prediction. In Journal of Artificial Intelligence Research (JAIR), volume 3, pp. 383–403.
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
Cite this paper
Torgo, L., Gama, J. (1996). Regression by classification. In: Borges, D.L., Kaestner, C.A.A. (eds) Advances in Artificial Intelligence. SBIA 1996. Lecture Notes in Computer Science, vol 1159. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61859-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61859-1
Online ISBN: 978-3-540-70742-4