Abstract
Classification learning can be treated as a regression problem whose dependent variable consists of 0s and 1s. By reducing classification to the problem of finding numerical dependencies, we gain the opportunity to use the powerful regression methods implemented in the PolyAnalyst data mining system. The resulting regression functions can be regarded as fuzzy membership indicators for the recognized class. To obtain classification rules, we find for each of these functions the threshold value that minimizes the number of misclassified cases. We show that this approach handles the overfitting problem satisfactorily and gives results at least as good as those obtained by the most popular decision tree algorithms.
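The idea of regressing on a 0/1 target and then choosing a cut-off can be illustrated with a minimal sketch. It is not the PolyAnalyst procedure: a single-feature ordinary least squares fit stands in for the regression step, the helper names `fit_line` and `best_threshold` are hypothetical, and the toy data are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ~ a + b*x with a single feature.

    Stand-in for the regression step; the paper's own regression
    methods are not reproduced here.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def best_threshold(scores, labels):
    """Return (threshold, errors) minimizing misclassifications for the
    rule 'predict class 1 when score > threshold'."""
    distinct = sorted(set(scores))
    # Candidates: one value below all scores, the midpoints between
    # neighbouring distinct scores, and one value above all scores.
    candidates = [distinct[0] - 1.0]
    candidates += [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]
    candidates.append(distinct[-1] + 1.0)

    best = None
    for t in candidates:
        errors = sum((s > t) != bool(y) for s, y in zip(scores, labels))
        if best is None or errors < best[1]:
            best = (t, errors)
    return best


# Toy data (invented): one feature and 0/1 class labels.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

a, b = fit_line(xs, ys)
scores = [a + b * x for x in xs]   # regression output used as a membership score
threshold, errors = best_threshold(scores, ys)
print(f"threshold={threshold:.3f}, misclassified={errors}")
```

The fitted values are not restricted to the interval [0, 1], so they act only as a graded membership score; the classification rule comes from the threshold scan, which tries every cut-off that can change the predicted labels on the training data.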
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kiselev, M.V., Ananyan, S.M., Arseniev, S.B. (1997). Regression-based classification methods and their comparison with decision tree algorithms. In: Komorowski, J., Zytkow, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1997. Lecture Notes in Computer Science, vol 1263. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63223-9_113
DOI: https://doi.org/10.1007/3-540-63223-9_113
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63223-8
Online ISBN: 978-3-540-69236-2
eBook Packages: Springer Book Archive