Abstract

It is almost a tautology that, if we want to perform a discriminant analysis between two or more populations and are able to divide these populations and their training sets into homogeneous subsets, it will be more efficient to carry out the analysis on each subset and then combine the results. This stratification can be done using one or two variables that are highly correlated with the one we want to predict. Our point of view is slightly different: we build a classification tree on all the available variables. We first recall our earlier attempt (presented at IFCS 2002 in Kraków), which, on an example of predicting enterprise failure, yielded a gain of 5% in correctly classified data when the classical Fisher linear discriminant rule or logistic regression was applied with and without stratification. We then present a new method, still a classification tree, but based on a multivariate criterion and built in an agglomerative way, and we compare the two methods. Under the same conditions and on the same data set, the gain reaches 20%. Results are also presented when the methods are applied to test sets.
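The stratify-then-discriminate idea in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' algorithm: the paper grows the strata with a classification tree over all variables, whereas here the strata are simply assumed known, and Fisher's classical linear rule (the rule named in the abstract) is fitted within each stratum. All data, variable names, and function names below are synthetic assumptions for the sketch.

```python
import numpy as np

def fisher_lda_fit(X, y):
    """Fisher's linear discriminant for two classes labelled 0 and 1."""
    m0, m1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    Sw = np.cov(X[y == 0].T) + np.cov(X[y == 1].T)  # pooled within-class scatter
    w = np.linalg.solve(Sw, m1 - m0)                # discriminant direction
    b = -w @ (m0 + m1) / 2.0                        # threshold at the midpoint
    return w, b

def fisher_lda_predict(X, w, b):
    return (X @ w + b > 0).astype(int)

def stratified_discriminant(X, y, stratum):
    """Fit one Fisher rule per stratum and combine the predictions."""
    preds = np.empty(len(y), dtype=int)
    for s in np.unique(stratum):
        m = stratum == s
        w, b = fisher_lda_fit(X[m], y[m])
        preds[m] = fisher_lda_predict(X[m], w, b)
    return preds

# Synthetic example: two strata in which the class separation points in
# opposite directions, so one global rule fails while stratified rules work.
rng = np.random.default_rng(0)
n = 50
XA0 = rng.normal([0, 0], 1.0, (n, 2)); XA1 = rng.normal([4, 0], 1.0, (n, 2))
XB0 = rng.normal([4, 8], 1.0, (n, 2)); XB1 = rng.normal([0, 8], 1.0, (n, 2))
X = np.vstack([XA0, XA1, XB0, XB1])
y = np.array([0] * n + [1] * n + [0] * n + [1] * n)
stratum = np.array([0] * (2 * n) + [1] * (2 * n))

w, b = fisher_lda_fit(X, y)
global_acc = (fisher_lda_predict(X, w, b) == y).mean()
strat_acc = (stratified_discriminant(X, y, stratum) == y).mean()
print(f"global: {global_acc:.2f}  stratified: {strat_acc:.2f}")
```

In this contrived setting the global Fisher rule is near chance level, while the per-stratum rules recover almost all of the class structure; this is the mechanism behind the gains the abstract reports, with the important difference that the paper's strata are found by a tree rather than given in advance.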




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

Cite this paper

Rasson, JP., Pirçon, JY., Roland, F. (2005). Stratification Before Discriminant Analysis: A Must?. In: Baier, D., Wernecke, KD. (eds) Innovations in Classification, Data Science, and Information Systems. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-26981-9_7
