Exploring Multivariate Modality by Unsupervised Mixture of Cubic B-Splines in 1-D Using Model Selection Criteria

  • Chapter
Data Analysis

Abstract

This paper proposes a new distribution-free technique, based on the work of Atilgan and Bozdogan (1990, 1992), for exploring the multivariate modality of a given multivariate data set by fitting an unsupervised mixture of cubic B-splines for density estimation in one-dimensional space. In the multivariate case, this is achieved by utilizing the Mahalanobis (1936) distance of each point from the multivariate mean (centroid), Mahalanobis distance data depth (MDD), jackknife Mahalanobis distance data depth (JMDD), principal components (PC) transformation, and common PC transformation. The analysis is carried out under the assumption that the classification or grouping of the data is not known a priori. These dimension reduction techniques result in a better description of the shape of the underlying distribution, which has many attractive properties. The EM algorithm is developed for the mixture of cubic B-splines in an interactive symbolic computational environment to obtain the maximum likelihood estimators of the parameters and to evaluate model selection criteria such as AIC (Akaike, 1973), CAIC (Bozdogan, 1987), and ICOMP (Bozdogan, 1988, 1990, 1993, 1994), which are used to detect the modality of the data objectively and, in density estimation, to determine the most parsimonious fit. A real numerical example, using a multivariate data set with a known number of mixture clusters and configurations, illustrates the versatility and efficiency of the proposed approach in detecting the multimodality and the density concentration points (clusters) of multivariate data with a high degree of accuracy in univariate space.
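As a rough illustration of two ingredients named in the abstract, the sketch below reduces multivariate observations to one dimension through their Mahalanobis distance from the centroid, and scores a fitted model with AIC and CAIC in their standard forms, AIC = -2 log L + 2k and CAIC = -2 log L + k(ln n + 1). It is not the chapter's implementation: the function names, the use of the sample covariance with divisor n - 1, and the pure-Python linear solver are all assumptions made for this example, and the B-spline mixture, the EM steps, MDD, JMDD, and ICOMP are not shown.

```python
import math


def mean_vector(rows):
    """Column-wise mean (the centroid) of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance(rows, mean):
    """Sample covariance matrix with divisor n - 1 (an assumption here)."""
    n, p = len(rows), len(mean)
    cov = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [x - m for x, m in zip(r, mean)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def mahalanobis_distances(rows):
    """Distance of each row from the centroid: sqrt(d' S^{-1} d)."""
    mu = mean_vector(rows)
    cov = covariance(rows, mu)
    out = []
    for r in rows:
        d = [x - m for x, m in zip(r, mu)]
        s_inv_d = solve(cov, d)  # S^{-1} d without forming the inverse
        out.append(math.sqrt(sum(di * si for di, si in zip(d, s_inv_d))))
    return out


def aic(loglik, k):
    """Akaike's information criterion for k free parameters."""
    return -2.0 * loglik + 2.0 * k


def caic(loglik, k, n):
    """Consistent AIC for k free parameters and n observations."""
    return -2.0 * loglik + k * (math.log(n) + 1.0)
```

Calling `mahalanobis_distances` on an n-by-p list of rows returns n non-negative scalars, which could then be passed to any univariate density estimator; candidate fits with different numbers of components would be compared by choosing the smallest criterion value.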

References

  • AKAIKE, H. (1973): Information theory and an extension of the maximum likelihood principle, In: Second International Symposium on Information Theory, Petrov, B.N. and Csaki, F. (eds.), 267–281, Akademiai Kiado, Budapest.

  • ANDERSON, E. (1935): The irises of the Gaspe Peninsula, Bulletin of the American Iris Society, 59, 2–5.

  • ATILGAN, T. and BOZDOGAN, H. (1990): Selecting the number of knots in fitting cardinal B-splines for density estimation using AIC, Journal of Japan Statistical Society, 9, 179–190.

  • ATILGAN, T. and BOZDOGAN, H. (1992): Convergence properties of MLE’s and asymptotic simultaneous confidence intervals in fitting cardinal B-splines for density estimation, Statistics and Probability Letters, 13, 89–98.

  • BIRKHOFF, G. and de BOOR, C. (1965): Piecewise polynomial interpolation and approximation, In: Approximation of Functions, Garabedian, H.L. (ed.), 164–190, Elsevier Publishing Company, Amsterdam.

  • BOCK, H.H. (1987): On the interface between cluster analysis, principal component analysis, and multidimensional scaling, In: Multivariate Statistical Modeling and Data Analysis, Bozdogan, H. and Gupta, A.K. (eds.), 17–34, D. Reidel Publishing Company, Dordrecht.

  • BOZDOGAN, H. (1987): Model selection and Akaike’s information criterion (AIC): The general theory and its analytical extensions, Psychometrika, 52, No. 3, 345–370. Special Section (invited paper).

  • BOZDOGAN, H. (1988): ICOMP: A new model selection criterion, In: Classification and related methods of data analysis, Bock, H.H. (ed.), 599–608, North-Holland, Amsterdam.

  • BOZDOGAN, H. (1990): On the information-based measure of covariance complexity and its application on the evaluation of multivariate linear models, Communications in Statistics, Theory and Methods, 19 (1), 221–278.

  • BOZDOGAN, H. (1993): Choosing the number of component clusters in the mixture model using a new informational complexity criterion of the inverse Fisher information matrix, In: Studies in Classification, Data Analysis, and Knowledge Organization, Opitz, O. et al. (eds.), 40–54, Springer-Verlag, Heidelberg.

  • BOZDOGAN, H. (1994): Mixture-model cluster analysis using a new informational complexity and model selection criteria, In: Multivariate Statistical Modeling, Vol. 2, Proceedings of the First US/Japan Conference on the Frontiers of Statistical Modeling: An Informational Approach, Bozdogan, H. (ed.), 69–113, Kluwer Academic Publishers, Dordrecht.

  • BOZDOGAN, H. and HAUGHTON, D. (1996): Informational complexity criteria for regression models, Submitted to Communications in Statistics, Theory and Methods.

  • CURRY, H.B. and SCHOENBERG, I.J. (1966): On Pólya frequency functions IV: The fundamental spline functions and their limits, J. Analyse Math., 17, 71–107.

  • DEMPSTER, A.P., et al. (1977): Maximum likelihood from incomplete data via the EM algorithm, Journal of Royal Statistical Society, Series B, 39, 1–38.

  • FISHER, R.A. (1936): The use of multiple measurements in taxonomic problems, Annals of Eugenics, 7, 179–188.

  • FLURY, B. (1984): Common principal components in k groups, J. of the American Statistical Association, 79, 892–898.

  • FLURY, B. (1988): Common principal components and related multivariate models, John Wiley and Sons, New York.

  • LIU, R.Y. (1995): Control charts for multivariate processes, J. of the American Statistical Association, 90, 1380–1387.

  • MAHALANOBIS, P.C. (1936): On the generalized distance in statistics, Proc. Nat. Inst. Sci. India, 12, 49–55.

  • POSKITT, D.S. (1987): Precision, complexity and Bayesian model determination, Journal of Royal Statistical Society, Series B, 49, 199–208.

  • SCHUMAKER, L.L. (1969): Approximation by splines, In: Theory and Applications of Spline Functions, Greville, T.N.E. (ed.), 65–85, Academic Press, New York.

  • SCLOVE, S.L. (1987): Metric considerations in clustering: implications for algorithms, In: Multivariate Statistical Modeling and Data Analysis, Bozdogan, H. and Gupta, A.K. (eds.), 163–186, D. Reidel Publishing Company, Dordrecht.

  • SCOTT, D.W. (1992): Multivariate density estimation, John Wiley and Sons, New York.

  • WU, C.F. (1983): On the convergence of the EM algorithm, Annals of Statistics, 11, 95–103.

Copyright information

© 2000 Springer-Verlag Berlin · Heidelberg

About this chapter

Cite this chapter

Bozdogan, H. (2000). Exploring Multivariate Modality by Unsupervised Mixture of Cubic B-Splines in 1-D Using Model Selection Criteria. In: Gaul, W., Opitz, O., Schader, M. (eds) Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-58250-9_9

  • DOI: https://doi.org/10.1007/978-3-642-58250-9_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67731-4

  • Online ISBN: 978-3-642-58250-9

  • eBook Packages: Springer Book Archive
