A new family of multivariate heavy-tailed distributions with variable marginal amounts of tailweight: application to robust clustering

Statistics and Computing

Abstract

We propose a family of multivariate heavy-tailed distributions that allow variable marginal amounts of tailweight. The originality comes from introducing multidimensional instead of univariate scale variables for the mixture of scaled Gaussian family of distributions. In contrast to most existing approaches, the derived distributions can account for a variety of shapes and have a simple tractable form with a closed-form probability density function whatever the dimension. We examine a number of properties of these distributions and illustrate them in the particular case of Pearson type VII and t tails. For these latter cases, we provide maximum likelihood estimation of the parameters and illustrate their modelling flexibility on simulated and real data clustering examples.
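The idea of replacing a single scalar scale variable with one scale variable per dimension can be illustrated with a short sampling sketch. This is not taken from the abstract: it assumes a representation Y = mu + D diag(A_m / W_m)^{1/2} X, where D is an orthogonal matrix of eigenvectors, A holds positive eigenvalues, X is standard Gaussian, and the weights W_m are independent Gamma(nu_m/2, nu_m/2) variables, so that each rotated coordinate has t-like tails with its own degrees of freedom nu_m. The function name and all parameter names are hypothetical.

```python
import numpy as np

def sample_multiple_scaled_t(n, mu, D, A, nu, rng=None):
    """Draw n samples from an assumed multiple-scaled t-like construction.

    mu : (M,) location vector
    D  : (M, M) orthogonal matrix (assumed eigenvectors of the scale matrix)
    A  : (M,) positive eigenvalues
    nu : (M,) degrees of freedom, one per rotated coordinate
    """
    rng = np.random.default_rng(rng)
    mu = np.asarray(mu, dtype=float)
    D = np.asarray(D, dtype=float)
    A = np.asarray(A, dtype=float)
    nu = np.asarray(nu, dtype=float)
    M = mu.shape[0]

    # Standard Gaussian draws, one row per sample.
    X = rng.standard_normal((n, M))
    # One independent weight per sample and per dimension:
    # W[:, m] ~ Gamma(shape=nu_m/2, rate=nu_m/2), i.e. scale = 2/nu_m.
    W = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=(n, M))
    # Scale each coordinate by its own weight, then rotate and shift.
    Z = X * np.sqrt(A / W)
    return mu + Z @ D.T

# Usage: first coordinate heavy-tailed (nu = 3), second nearly Gaussian (nu = 200).
Y = sample_multiple_scaled_t(20000, [0.0, 0.0], np.eye(2), [1.0, 1.0], [3.0, 200.0], rng=0)
```

With a single shared weight in place of the per-dimension weights, the same code would reduce to the usual multivariate t construction, in which every margin has the same tailweight.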

Author information

Corresponding author

Correspondence to Florence Forbes.

About this article

Cite this article

Forbes, F., Wraith, D. A new family of multivariate heavy-tailed distributions with variable marginal amounts of tailweight: application to robust clustering. Stat Comput 24, 971–984 (2014). https://doi.org/10.1007/s11222-013-9414-4

  • DOI: https://doi.org/10.1007/s11222-013-9414-4
