Two-Stage Multi-Sample Cluster Analysis as a General Approach to Discriminant Analysis

Eisenblätter, Dorothea; Bozdogan, Hamparsum

doi:10.1007/978-94-009-3977-6_6

Part of the book series: Theory and Decision Library (TDLB, volume 8)

Abstract

This paper introduces Two-Stage Multi-Sample Cluster Analysis (TSMSCA), i.e., the problem of grouping samples and improving upon homogeneity via reassigning individual objects, as a general approach to ‘classical’ discriminant analysis (DA).

Akaike’s Information Criterion (AIC) and Bozdogan’s CAIC are derived and used in TSMSCA to choose the best-fitting model and the best partition among all possible clustering alternatives. With this approach the dimension of the discriminant space is determined and, using a decision-tree classifier, the best lower-dimensional models are identified, yielding a hierarchy of efficient separation and assignment rules. At each step of the hierarchy, the classification performance of the best discriminant model is evaluated either by a cross-validation method or by the method of conditional clustering.
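
As a rough illustration of criterion-based partition selection, the sketch below scores every way of grouping a few samples and keeps the partition with the smallest criterion value. It is not the chapter's derivation: it assumes a univariate Gaussian model with one mean per cluster and a common variance, and it uses the standard forms AIC = -2 log L + 2k and CAIC = -2 log L + k(ln n + 1). The function names (`set_partitions`, `max_loglik`, `best_partition`) and the toy data are hypothetical.

```python
import math

def aic(loglik, k):
    """AIC = -2 log L + 2k, with k free parameters."""
    return -2.0 * loglik + 2.0 * k

def caic(loglik, k, n):
    """CAIC = -2 log L + k (ln n + 1), with n observations."""
    return -2.0 * loglik + k * (math.log(n) + 1.0)

def set_partitions(items):
    """Yield every partition of a list of sample names into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def max_loglik(samples, partition):
    """Maximized Gaussian log-likelihood (assumed model: one mean per
    cluster, common variance; the pooled variance must be positive)."""
    n = sum(len(samples[g]) for block in partition for g in block)
    ss = 0.0
    for block in partition:
        values = [x for g in block for x in samples[g]]
        mean = sum(values) / len(values)
        ss += sum((x - mean) ** 2 for x in values)
    var = ss / n  # maximum-likelihood estimate of the common variance
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)

def best_partition(samples, criterion="caic"):
    """Return (score, partition) minimizing the chosen criterion."""
    names = sorted(samples)
    n = sum(len(v) for v in samples.values())
    scored = []
    for partition in set_partitions(names):
        k = len(partition) + 1  # cluster means plus the common variance
        ll = max_loglik(samples, partition)
        score = caic(ll, k, n) if criterion == "caic" else aic(ll, k)
        scored.append((score, partition))
    return min(scored, key=lambda t: t[0])

# Hypothetical toy data: three samples, two of which look alike.
samples = {
    "A": [1.0, 1.2, 0.9, 1.1],
    "B": [1.05, 0.95, 1.15, 1.0],
    "C": [5.0, 5.2, 4.9, 5.1],
}
print(best_partition(samples, "aic"))
print(best_partition(samples, "caic"))
```

On these toy numbers both criteria should favor merging A and B while keeping C separate, since the extra mean parameter buys almost no likelihood. Exhaustive enumeration is only practical for a handful of samples, because the number of partitions grows as the Bell numbers.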

Cross-validation reassigns one object at a time based only on the tentatively updated model, whereas the conditional clustering method actually executes reassignments of objects via a transfer and swapping algorithm given the best discriminant model as the initial partition.
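
The contrast between the two evaluation methods can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the authors' procedure: objects are univariate, assignment is by nearest group mean, the transfer step uses the usual within-group sum-of-squares test for moving one object, and the swapping step is omitted. All function names and data are hypothetical. The leave-one-out routine only proposes labels and leaves the partition untouched, whereas the transfer routine actually moves objects.

```python
def mean(values):
    return sum(values) / len(values)

def leave_one_out_labels(groups):
    """For each object, refit the group means without it and record the
    nearest group. The partition itself is never modified."""
    proposed = {}
    for g, values in groups.items():
        for i, x in enumerate(values):
            means = {}
            for h, vals in groups.items():
                rest = vals[:i] + vals[i + 1:] if h == g else vals
                if rest:  # skip a group left empty by the removal
                    means[h] = mean(rest)
            proposed[(g, i)] = min(means, key=lambda h: (x - means[h]) ** 2)
    return proposed

def transfer_gain(x, src, dst):
    """Decrease in total within-group sum of squares if x moves src -> dst."""
    n_s, n_d = len(src), len(dst)
    removed = n_s / (n_s - 1) * (x - mean(src)) ** 2
    added = n_d / (n_d + 1) * (x - mean(dst)) ** 2
    return removed - added

def transfer_until_stable(groups, max_passes=100):
    """Execute single-object transfers until no move lowers the criterion.
    Assumes at least two non-empty groups; groups are never emptied."""
    groups = {g: list(v) for g, v in groups.items()}  # work on a copy
    for _ in range(max_passes):
        moved = False
        for g in groups:
            for x in list(groups[g]):
                if len(groups[g]) < 2:
                    break
                gains = {h: transfer_gain(x, groups[g], groups[h])
                         for h in groups if h != g}
                best = max(gains, key=gains.get)
                if gains[best] > 1e-12:
                    groups[g].remove(x)
                    groups[best].append(x)
                    moved = True
        if not moved:
            break
    return groups

# Hypothetical initial partition with one apparently misplaced object (4.8).
initial = {"A": [1.0, 1.1, 0.9, 4.8], "B": [5.0, 5.2, 4.9]}
print(leave_one_out_labels(initial)[("A", 3)])
print(transfer_until_stable(initial))
```

In this toy case the leave-one-out pass flags the object 4.8 as belonging with B but reports it only, while the transfer pass relocates it and then stops because no further move reduces the within-group sum of squares.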

Numerical examples on real data sets demonstrate the generality and versatility of the proposed approach.

References

Akaike, H. (1973). ‘Information Theory and an Extension of the Maximum Likelihood Principle’ in Second International Symposium on Information Theory, (B.N. Petrov and F. Csaki, editors). Akademiai Kiado: Budapest, 267–281.
Akaike, H. (1974). ‘A New Look at the Statistical Model Identification,’ IEEE Transactions on Automatic Control AC-19, 716–723.
Akaike, H. (1977). ‘On Entropy Maximization Principle’ in Proceedings on Applications of Statistics (P.R. Krishnaiah, editor). North-Holland: Amsterdam, 27–47.
Akaike, H. (1979). ‘A Bayesian Analysis of the Minimum AIC Procedure,’ Annals of the Institute of Statistical Mathematics (Part A) 30, 9–14.
Akaike, H. (1981). ‘Likelihood of a Model and Information Criteria,’ Journal of Econometrics 16, 3–14.
Andrews, D.F., and Herzberg, A.M. (1985). Data. A Collection of Problems from Many Fields for the Student and Research Worker. Springer: New York.
Banfield, C.F., and Bassill, L.C. (1977). ‘Algorithm AS 113: A Transfer Algorithm for Non-hierarchical Classification,’ Applied Statistics 26, 206–210.
Box, G.E.P. (1949). ‘A General Distribution Theory for a Class of Likelihood Criteria,’ Biometrika 36, 317–346.
Box, G.E.P., and Cox, D.R. (1964). ‘An Analysis of Transformations,’ (with discussion), Journal of the Royal Statistical Society (B) 26, 211–252.
Bozdogan, H. (1983). ‘Determining the Number of Component Clusters in the Standard Multivariate Normal Mixture Model Using Model-Selection Criteria,’ Technical Report No. UIC/DQM/A83-1, June 16, 1983, Army Research Office Contract DAAG29-82-K-0155, University of Illinois at Chicago, Box 4348, Chicago, Illinois 60680.
Bozdogan, H. (1984). ‘AIC-Replacements for Multivariate Multi-Sample Conventional Tests of Homogeneity Models,’ Technical Paper #4 in Statistics, Department of Mathematics, University of Virginia, Charlottesville, VA, 22903.
Bozdogan, H. (1986). ‘Multi-Sample Cluster Analysis as a General Alternative to Multiple Comparison Procedures,’ Bulletin of Informatics and Cybernetics Research Association of Statistical Sciences 22, 95–130.
Bozdogan, H. (1987). ‘Model Selection and Akaike’s Information Criterion (AIC): The General Theory and Its Analytical Extensions,’ (to appear in the Special Issue of Psychometrika).
Bozdogan, H., and Sclove, S.L. (1984). ‘Multi-Sample Cluster Analysis Using Akaike’s Information Criterion,’ Annals of the Institute of Statistical Mathematics (Part B) 36, 243–253.
Duran, B.S., and Odell, P.L. (1974). Cluster Analysis: A Survey. Springer: New York.
Eisenblätter, D. (1987). Two-Stage Multi-Sample Cluster Analysis, Ph.D. Thesis (anticipated), Seminar für Wirtschafts- und Sozialstatistik der Universität zu Köln.
Fahrmeir, L., and Hamerle, A., editors (1984). Multivariate statistische Verfahren, de Gruyter: Berlin.
Fisher, R.A. (1936). ‘The Use of Multiple Measurements in Taxonomic Problems,’ Annals of Eugenics 7, 179–188.
Ganesalingam, S., and McLachlan, G.J. (1979). ‘A Case Study of Two Clustering Methods Based on Maximum Likelihood,’ Statistica Neerlandica 33, 81–90.
Johnson, R.A., and Wichern, D. (1983). Applied Multivariate Statistical Analysis. Prentice Hall: New York.
Lachenbruch, P.A. (1975). Discriminant Analysis. Hafner Press: New York.
Lachenbruch, P.A., and Mickey, M.R. (1968). ‘Estimation of Error Rates in Discriminant Analysis,’ Technometrics 10, 1–11.
Mardia, K.V., Kent, J.T., and Bibby, J.M. (1979). Multivariate Analysis. Academic Press: New York.
Schwarz, G. (1978). ‘Estimating the Dimension of a Model,’ Annals of Statistics 6, 461–464.
Sclove, S.L. (1977). ‘Population Mixture Models and Clustering Algorithms,’ Communications in Statistics A 6, 417–434.
Sclove, S.L. (1983). ‘Application of the Conditional Population Mixture Model to Image Segmentation,’ IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-5, 428–433.
Seber, G.A. (1984). Multivariate Observations. Wiley: New York.
Späth, H. (1975). Cluster-Analyse-Algorithmen. Oldenbourg: München.
Späth, H. (1983). Cluster-Formation und -Analyse. Oldenbourg: München.
Symons, M.J. (1981). ‘Clustering Criteria and Multivariate Normal Mixtures,’ Biometrics 37, 35–43.
Titterington, D.M., Smith, A.F.M., and Makov, U.E. (1985). Statistical Analysis of Finite Mixture Distributions. Wiley: New York.
Wilks, S.S. (1932). ‘Certain Generalizations in the Analysis of Variance,’ Biometrika 24, 471–494.

Author information

Authors and Affiliations

Seminar für Wirtschafts- und Sozialstatistik, Universität zu Köln, Albertus-Magnus-Platz, 5000 Köln 41, Federal Republic of Germany
Dorothea Eisenblätter
Department of Mathematics, Math./Astronomy Building, University of Virginia, Charlottesville, Virginia, 22903, USA
Hamparsum Bozdogan

Editor information

Editors and Affiliations

Department of Mathematics, University of Virginia, Charlottesville, Virginia, USA
H. Bozdogan
Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, Ohio, USA
A. K. Gupta

About this chapter

Cite this chapter

Eisenblätter, D., Bozdogan, H. (1987). Two-Stage Multi-Sample Cluster Analysis as a General Approach to Discriminant Analysis. In: Bozdogan, H., Gupta, A.K. (eds) Multivariate Statistical Modeling and Data Analysis. Theory and Decision Library, vol 8. Springer, Dordrecht. https://doi.org/10.1007/978-94-009-3977-6_6

DOI: https://doi.org/10.1007/978-94-009-3977-6_6
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-010-8264-8
Online ISBN: 978-94-009-3977-6
eBook Packages: Springer Book Archive