Two-Stage Multi-Sample Cluster Analysis as a General Approach to Discriminant Analysis

Chapter in: Multivariate Statistical Modeling and Data Analysis

Part of the book series: Theory and Decision Library (TDLB, volume 8)

Abstract

This paper introduces Two-Stage Multi-Sample Cluster Analysis (TSMSCA), i.e., the problem of grouping samples and improving upon homogeneity via reassigning individual objects, as a general approach to ‘classical’ discriminant analysis (DA).

Akaike’s Information Criterion (AIC) and Bozdogan’s CAIC are derived and used in TSMSCA to choose the best-fitting model and the best partition among all possible clustering alternatives. With this approach the dimension of the discriminant space is determined, and, using a decision-tree classifier, the best lower-dimensional models are identified, yielding a hierarchy of efficient separation and assignment rules. At each step of the hierarchy, the classification performance of the best discriminant model is evaluated either by cross-validation or by conditional clustering.
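As a rough illustration of criterion-based selection among clustering alternatives, the sketch below scores a few candidate partitions with the standard forms AIC = -2 log L + 2k and CAIC = -2 log L + k(ln n + 1) and picks the minimizer of each. It is a minimal sketch, not the chapter's derivation: the partition labels, log-likelihood values, parameter counts, and sample size are hypothetical numbers chosen only for the example.

```python
import math

def aic(log_lik, k):
    """Akaike's Information Criterion: -2 log L + 2k."""
    return -2.0 * log_lik + 2.0 * k

def caic(log_lik, k, n):
    """Bozdogan's consistent AIC: -2 log L + k (ln n + 1)."""
    return -2.0 * log_lik + k * (math.log(n) + 1.0)

def best_model(candidates, n):
    """candidates maps a partition label to (maximized log-likelihood, number of parameters)."""
    scores = {name: (aic(ll, k), caic(ll, k, n))
              for name, (ll, k) in candidates.items()}
    best_by_aic = min(scores, key=lambda m: scores[m][0])
    best_by_caic = min(scores, key=lambda m: scores[m][1])
    return scores, best_by_aic, best_by_caic

# Hypothetical values for three ways of grouping three samples (assumed, not from the chapter).
candidates = {
    "{1}{2}{3}": (-250.0, 12),  # all samples kept separate
    "{1,2}{3}": (-252.0, 8),    # samples 1 and 2 merged
    "{1,2,3}": (-270.0, 4),     # all samples merged
}
scores, best_by_aic, best_by_caic = best_model(candidates, n=150)
print(best_by_aic, best_by_caic)
```

Both criteria trade goodness of fit against the number of estimated parameters; CAIC penalizes each parameter more heavily as the sample size grows, so it tends to favor coarser partitions than AIC.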

Cross-validation reassigns one object at a time based only on the tentatively updated model, whereas the conditional clustering method actually executes reassignments of objects via a transfer and swapping algorithm given the best discriminant model as the initial partition.
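The difference between the two evaluation schemes can be sketched in a few lines. The example below is an assumption-laden simplification: it uses nearest-centroid assignment with squared Euclidean distance and a within-group sum-of-squares criterion in place of the chapter's likelihood-based model, and it implements transfers only (no swaps). The function names and the toy data are hypothetical.

```python
def centroid(points):
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(len(points[0]))]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def loo_assignments(data, labels):
    """Cross-validation style: hold out one object, refit the centroids
    without it, and record the nearest group. The partition itself is never changed."""
    groups = sorted(set(labels))
    assigned = []
    for i, x in enumerate(data):
        cents = {}
        for g in groups:
            members = [p for j, (p, l) in enumerate(zip(data, labels))
                       if l == g and j != i]
            if members:
                cents[g] = centroid(members)
        assigned.append(min(cents, key=lambda g: sq_dist(x, cents[g])))
    return assigned

def wss(data, labels):
    """Total within-group sum of squares of a partition."""
    total = 0.0
    for g in set(labels):
        members = [p for p, l in zip(data, labels) if l == g]
        c = centroid(members)
        total += sum(sq_dist(p, c) for p in members)
    return total

def transfer(data, labels):
    """Conditional-clustering style: starting from the given partition,
    actually move objects between groups while the criterion improves."""
    labels = list(labels)
    groups = sorted(set(labels))
    improved = True
    while improved:
        improved = False
        for i in range(len(data)):
            if labels.count(labels[i]) == 1:
                continue  # never empty a group
            current = wss(data, labels)
            for g in groups:
                if g == labels[i]:
                    continue
                trial = labels[:i] + [g] + labels[i + 1:]
                if wss(data, trial) < current - 1e-12:
                    labels, improved = trial, True
                    current = wss(data, labels)
    return labels

# Toy data (hypothetical): the third object sits near group A but is labelled B.
data = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["A", "A", "B", "B", "B", "B"]
print(loo_assignments(data, labels))
print(transfer(data, labels))
```

In the first function every object is judged against a model fitted to the unchanged initial partition minus that object; in the second, each accepted move alters the partition that later objects are judged against.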

Numerical examples on real data sets demonstrate the generality and versatility of the proposed approach.

References

  • Akaike, H. (1973). ‘Information Theory and an Extension of the Maximum Likelihood Principle’ in Second International Symposium on Information Theory, (B.N. Petrov and F. Csaki, editors). Akademiai Kiado: Budapest, 267–281.

  • Akaike, H. (1974). ‘A New Look at the Statistical Model Identification,’ IEEE Transactions on Automatic Control AC-19, 716–723.

  • Akaike, H. (1977). ‘On Entropy Maximization Principle’ in Proceedings on Applications of Statistics (P.R. Krishnaiah, editor). North-Holland: Amsterdam, 27–47.

  • Akaike, H. (1979). ‘A Bayesian Analysis of the Minimum AIC Procedure,’ Annals of the Institute of Statistical Mathematics (Part A) 30, 9–14.

  • Akaike, H. (1981). ‘Likelihood of a Model and Information Criteria,’ Journal of Econometrics 16, 3–14.

  • Andrews, D.F., and Herzberg, A.M. (1985). Data. A Collection of Problems from Many Fields for the Student and Research Worker. Springer: New York.

  • Banfield, C.F., and Bassill, L.C. (1977). ‘Algorithm AS 113: A Transfer Algorithm for Non-hierarchical Classification,’ Applied Statistics 26, 206–210.

  • Box, G.E.P. (1949). ‘A General Distribution Theory for a Class of Likelihood Criteria,’ Biometrika 36, 317–346.

  • Box, G.E.P., and Cox, D.R. (1964). ‘An Analysis of Transformations,’ (with discussion), Journal of the Royal Statistical Society (B) 26, 211–252.

  • Bozdogan, H. (1983). ‘Determining the Number of Component Clusters in the Standard Multivariate Normal Mixture Model Using Model-Selection Criteria,’ Technical Report No. UIC/DQM/A83-1, June 16, 1983, Army Research Office Contract DAAG29-82-K-0155, University of Illinois at Chicago, Box 4348, Chicago, Illinois 60680.

  • Bozdogan, H. (1984). ‘AIC-Replacements for Multivariate Multi-Sample Conventional Tests of Homogeneity Models,’ Technical Paper #4 in Statistics, Department of Mathematics, University of Virginia, Charlottesville, VA, 22903.

  • Bozdogan, H. (1986). ‘Multi-Sample Cluster Analysis as a General Alternative to Multiple Comparison Procedures,’ Bulletin of Informatics and Cybernetics Research Association of Statistical Sciences 22, 95–130.

  • Bozdogan, H. (1987). ‘Model Selection and Akaike’s Information Criterion (AIC): The General Theory and Its Analytical Extensions,’ (to appear in the Special Issue of Psychometrika).

  • Bozdogan, H., and Sclove, S.L. (1984). ‘Multi-Sample Cluster Analysis Using Akaike’s Information Criterion,’ Annals of the Institute of Statistical Mathematics (Part B) 36, 243–253.

  • Duran, B.S., and Odell, P.L. (1974). Cluster Analysis: A Survey. Springer: New York.

  • Eisenblätter, D. (1987). Two-Stage Multi-Sample Cluster Analysis, Ph.D. Thesis (anticipated), Seminar für Wirtschafts- und Sozialstatistik der Universität zu Köln.

  • Fahrmeir, L., and Hamerle, A., editors (1984). Multivariate statistische Verfahren, de Gruyter: Berlin.

  • Fisher, R.A. (1936). ‘The Use of Multiple Measurements in Taxonomic Problems,’ Annals of Eugenics 7, 179–188.

  • Ganesalingam, S., and McLachlan, G.J. (1979). ‘A Case Study of Two Clustering Methods Based on Maximum Likelihood,’ Statistica Neerlandica 33, 81–90.

  • Johnson, R.A., and Wichern, D. (1983). Applied Multivariate Statistical Analysis. Prentice Hall: New York.

  • Lachenbruch, P.A. (1975). Discriminant Analysis. Hafner Press: New York.

  • Lachenbruch, P.A., and Mickey, M.R. (1968). ‘Estimation of Error Rates in Discriminant Analysis,’ Technometrics 10, 1–11.

  • Mardia, K.V., Kent, J.T., and Bibby, J.M. (1979). Multivariate Analysis. Academic Press: New York.

  • Schwarz, G. (1978). ‘Estimating the Dimension of a Model,’ Annals of Statistics 6, 461–464.

  • Sclove, S.L. (1977). ‘Population Mixture Models and Clustering Algorithms,’ Communications in Statistics A 6, 417–434.

  • Sclove, S.L. (1983). ‘Application of the Conditional Population Mixture Model to Image Segmentation,’ IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-5, 428–433.

  • Seber, G.A. (1984). Multivariate Observations. Wiley: New York.

  • Späth, H. (1975). Cluster-Analyse-Algorithmen. Oldenbourg: München.

  • Späth, H. (1983). Cluster-Formation und -Analyse. Oldenbourg: München.

  • Symons, M.J. (1981). ‘Clustering Criteria and Multivariate Normal Mixtures,’ Biometrics 37, 35–43.

  • Titterington, D.M., Smith, A.F.M., and Makov, U.E. (1985). Statistical Analysis of Finite Mixture Distributions. Wiley: New York.

  • Wilks, S.S. (1932). ‘Certain Generalizations in the Analysis of Variance,’ Biometrika 24, 471–494.

Copyright information

© 1987 D. Reidel Publishing Company, Dordrecht, Holland

About this chapter

Cite this chapter

Eisenblätter, D., Bozdogan, H. (1987). Two-Stage Multi-Sample Cluster Analysis as a General Approach to Discriminant Analysis. In: Bozdogan, H., Gupta, A.K. (eds) Multivariate Statistical Modeling and Data Analysis. Theory and Decision Library, vol 8. Springer, Dordrecht. https://doi.org/10.1007/978-94-009-3977-6_6

  • DOI: https://doi.org/10.1007/978-94-009-3977-6_6

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-010-8264-8

  • Online ISBN: 978-94-009-3977-6

  • eBook Packages: Springer Book Archive
