Abstract
We propose new affine invariant tests for multivariate normality, based on independence characterizations of the sample moments of the normal distribution. The test statistics are obtained using canonical correlations between sets of sample moments in a way that resembles the construction of Mardia’s skewness measure and generalizes the Lin–Mudholkar test for univariate normality. The tests are compared to some popular tests based on Mardia’s skewness and kurtosis measures in an extensive simulation power study and are found to offer higher power against many of the alternatives.
References
Bartlett MS (1939) A note on tests of significance in multivariate analysis. Math Proc Camb Philos Soc 35:180–185
Cerioli A, Farcomeni A, Riani M (2013) Robust distances for outlier-free goodness-of-fit testing. Comput Stat Data Anal 65:29–45
Doornik JA, Hansen H (2008) An omnibus test for univariate and multivariate normality. Oxf Bull Econ Stat 70:927–939
Dubkov AA, Malakhov AN (1976) Properties and interdependence of the cumulants of a random variable. Radiophys Quantum Electron 19:833–839
Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann Eugen 7:179–188
Henderson HV, Searle SR (1979) Vec and vech operators for matrices, with some uses in Jacobians and multivariate statistics. Can J Stat 7:65–81
Henze N (2002) Invariant tests for multivariate normality: a critical review. Stat Pap 43:467–506
Kankainen A, Taskinen S, Oja H (2007) Tests of multinormality based on location vectors and scatter matrices. Stat Methods Appl 16:357–359
Kaplan EL (1952) Tensor notation and the sampling cumulants of k-statistics. Biometrika 39:319–323
Kollo T (2002) Multivariate skewness and kurtosis measures with an application in ICA. J Multivar Anal 99:2328–2338
Kollo T, von Rosen D (2005) Advanced multivariate statistics with matrices. Springer, Berlin. ISBN 978-1-4020-3418-3
Kotz S, Kozubowski TJ, Podgórski K (2000) An asymmetric multivariate Laplace distribution, Technical Report No. 367, Department of Statistics and Applied Probability, University of California at Santa Barbara
Kshirsagar AM (1972) Multivariate analysis. Marcel Dekker, ISBN 0-8247-1386-9
Lin C-C, Mudholkar GS (1980) A simple test for normality against asymmetric alternatives. Biometrika 67:455–461
Mardia KV (1970) Measures of multivariate skewness and kurtosis with applications. Biometrika 57:519–530
Mardia KV (1974) Applications of some measures of multivariate skewness and kurtosis in testing normality and robustness studies. Sankhya Indian J Stat 36:115–128
Mardia KV, Kent JT (1991) Rao score tests for goodness of fit and independence. Biometrika 78:355–363
Mardia KV, Kent JT, Bibby JM (1979) Multivariate analysis. Academic Press, ISBN 0-12-471250-9
McCullagh P (1987) Tensor methods in statistics. University Press, ISBN 0-412-27480-9
Mecklin CJ, Mundfrom DJ (2004) An appraisal and bibliography of tests for multivariate normality. Int Stat Rev 72:123–128
Mecklin CJ, Mundfrom DJ (2005) A Monte Carlo comparison of the type I and type II error rates of tests of multivariate normality. J Stat Comput Simul 75:93–107
Mudholkar GS, Marchetti CE, Lin CT (2002) Independence characterizations and testing normality against restricted skewness–kurtosis alternatives. J Stat Plan Inference 104:485–501
Stehlík M, Fabián Z, Střelec L (2012) Small sample robust testing for normality against Pareto tails. Commun Stat Simul Comput 41:1167–1194
Stehlík M, Střelec L, Thulin M (2014) On robust testing for normality in chemometrics. Chemom Intell Lab Syst 130:98–109
Thulin M (2010) On two simple tests for normality with high power. Pre-print, arXiv:1008.5319
Acknowledgments
The author wishes to thank the editor and two anonymous referees for comments that helped improve the paper, and Silvelyn Zwanzig for several helpful suggestions.
Appendix: proofs and tables
For the proofs of Theorems 3 and 4 we need some basic properties of the Kronecker product \(\otimes \) and the \(\mathrm{vech}\) and \(\mathrm{vec}\) operators from Henderson and Searle (1979). See also Kollo and von Rosen (2005) and Kollo (2002) for more on these tools from matrix algebra.
For a \(p\times q\) matrix \({\varvec{{A}}}=\{a_{ij}\}\) and an \(r\times s\) matrix \({\varvec{{B}}}\), the Kronecker product \({\varvec{{A}}}\otimes {\varvec{{B}}}\) is the \(pr\times qs\) matrix \(\{a_{ij}{\varvec{{B}}}\}\), \(i=1,\ldots ,p\), \(j=1,\ldots ,q\). The \(\mathrm{vec}\) operator stacks the columns of a matrix underneath each other, forming a single vector. If the columns of the \(p\times q\) matrix \({\varvec{{A}}}\) are denoted \({\varvec{{a_1}}},\ldots ,{\varvec{{a_q}}}\) then \(\mathrm{vec}({\varvec{{A}}})=({\varvec{{a_1'}}},\ldots ,{\varvec{{a_q'}}})'\) is a vector of length \(pq\).
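The two definitions above can be illustrated with a short NumPy sketch. The helper name `vec` and the example matrices are chosen here for illustration only and do not come from the paper.

```python
import numpy as np

def vec(M):
    """Stack the columns of M underneath each other (column-major order)."""
    return np.asarray(M).reshape(-1, order="F")

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])      # p x q = 2 x 3
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])           # r x s = 2 x 2

K = np.kron(A, B)                    # block matrix {a_ij * B}, size pr x qs
print(K.shape)                       # (4, 6)
print(vec(A))                        # [1. 4. 2. 5. 3. 6.]
```

The upper-left \(2\times 2\) block of `K` equals \(a_{11}{\varvec{{B}}}\), in line with the block definition of the Kronecker product.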
We will use that
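and that if \({\varvec{{A}}}\) is a \(p\times p\) matrix and \({\varvec{{B}}}\) a \(q\times q\) matrix,
\[ \det ({\varvec{{A}}}\otimes {\varvec{{B}}})=\det ({\varvec{{A}}})^{q}\det ({\varvec{{B}}})^{p}. \]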
The \(\mathrm{vech}\) operator works as the \(\mathrm{vec}\) operator, except that it only contains each distinct element of the matrix once. For a symmetric matrix \({\varvec{{A}}}\), \(\mathrm{vech}({\varvec{{A}}})\) thus contains only the diagonal and the elements above the diagonal, whereas \(\mathrm{vec}({\varvec{{A}}})\) contains the diagonal elements and the off-diagonal elements twice.
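We have the following relationship between the \(\mathrm{vec}\) operator and the Kronecker product: for matrices \({\varvec{{A}}}\), \({\varvec{{B}}}\) and \({\varvec{{C}}}\) such that the product \({\varvec{{ABC}}}\) is defined,
\[ \mathrm{vec}({\varvec{{ABC}}})=({\varvec{{C'}}}\otimes {\varvec{{A}}})\,\mathrm{vec}({\varvec{{B}}}). \]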
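Furthermore, for a given symmetric \(p\times p\) matrix \({\varvec{{A}}}\) there exists a \(p(p+1)/2\times p^2\) matrix \({\varvec{{H}}}\) and a \(p^2\times p(p+1)/2\) matrix \({\varvec{{G}}}\) such that
\[ {\varvec{{H}}}\,\mathrm{vec}({\varvec{{A}}})=\mathrm{vech}({\varvec{{A}}})\quad \text{and}\quad {\varvec{{G}}}\,\mathrm{vech}({\varvec{{A}}})=\mathrm{vec}({\varvec{{A}}}). \]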
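As a numerical illustration of the \(\mathrm{vec}\) and \(\mathrm{vech}\) relationships, the following sketch builds one possible pair \({\varvec{{G}}}\), \({\varvec{{H}}}\): a duplication matrix and its Moore–Penrose inverse. This particular construction, the helper names and the random test matrices are assumptions made here, not taken from the paper. For a symmetric matrix, stacking the lower triangle column by column gives the same distinct elements as the upper triangle.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one vector."""
    return np.asarray(M).reshape(-1, order="F")

def vech(M):
    """Distinct elements of a symmetric matrix (lower triangle, column by column)."""
    M = np.asarray(M)
    return np.concatenate([M[j:, j] for j in range(M.shape[0])])

def duplication_matrix(p):
    """G of size p^2 x p(p+1)/2 with G @ vech(S) == vec(S) for symmetric S."""
    G = np.zeros((p * p, p * (p + 1) // 2))
    col = 0
    for j in range(p):
        for i in range(j, p):
            G[j * p + i, col] = 1.0   # position of S[i, j] in vec(S)
            G[i * p + j, col] = 1.0   # position of S[j, i] in vec(S)
            col += 1
    return G

rng = np.random.default_rng(0)
p = 3
M = rng.normal(size=(p, p))
S = M @ M.T                           # a symmetric p x p matrix
G = duplication_matrix(p)
H = np.linalg.pinv(G)                 # one valid choice of H (assumption)

print(np.allclose(H @ vec(S), vech(S)))
print(np.allclose(G @ vech(S), vec(S)))

# Standard identity vec(ABC) = (C' kron A) vec(B)
A, B, C = (rng.normal(size=(p, p)) for _ in range(3))
print(np.allclose(vec(A @ B @ C), np.kron(C.T, A) @ vec(B)))
```

All three checks are expected to print `True` up to floating-point tolerance.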
As a preparation for the proof of Theorem 3, we prove the following auxiliary lemma.
Lemma 1
Assume that \({\varvec{{X}}},{\varvec{{X_1}}}, \ldots , {\varvec{{X_n}}}\) are i.i.d. \(p\)-variate random variables fulfilling the conditions of Theorem 1. Let \(S_{ij}=(n-1)^{-1}\sum _{k=1}^n(X_{k,i}-\bar{X}_i)(X_{k,j}-\bar{X}_j)\) be the elements of the sample covariance matrix \({\varvec{{S}}}\).
Then \({\varvec{{u_X}}}=\mathrm{vech}({\varvec{{S}}})\) is a vector with \(q=p(p+1)/2\) distinct elements. Denote its covariance matrix by \(\mathrm{Cov}({\varvec{{u_X}}})={\varvec{{\Lambda _{22}}}}\).
Let \({\varvec{{A}}}\) be a nonsingular \(p\times p\) matrix and let \({\varvec{{b}}}\) be a \(p\)-dimensional vector. Then there exists a nonsingular \(q\times q\) matrix \({\varvec{{D}}}\) such that
(i) the sample variances and covariances of \({\varvec{{Y}}}={\varvec{{AX}}}+{\varvec{{b}}}\) are given by \({\varvec{{u_Y}}}={\varvec{{Du_X}}}\),
(ii) \(\mathrm{Cov}({\varvec{{u_Y}}})={\varvec{{D\Lambda _{22}D'}}}\) and
(iii) \(\det ({\varvec{{D}}})=\det ({\varvec{{A}}})^{p+1}\).
Proof
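The transformed sample \({\varvec{{AX}}}+{\varvec{{b}}}\) has sample covariance matrix \({\varvec{{ASA'}}}\), so we wish to study \(\mathrm{vech}({\varvec{{ASA'}}})\). We have
\[ \mathrm{vec}({\varvec{{ASA'}}})=({\varvec{{A}}}\otimes {\varvec{{A}}})\,\mathrm{vec}({\varvec{{S}}}). \]
Moreover, since \({\varvec{{S}}}\) and \({\varvec{{ASA'}}}\) are symmetric there exist matrices \({\varvec{{G}}}\) and \({\varvec{{H}}}\) such that
\[ \mathrm{vech}({\varvec{{ASA'}}})={\varvec{{H}}}\,\mathrm{vec}({\varvec{{ASA'}}})\quad \text{and}\quad \mathrm{vec}({\varvec{{S}}})={\varvec{{G}}}\,\mathrm{vech}({\varvec{{S}}}). \]
Thus
\[ \mathrm{vech}({\varvec{{ASA'}}})={\varvec{{H}}}({\varvec{{A}}}\otimes {\varvec{{A}}}){\varvec{{G}}}\,\mathrm{vech}({\varvec{{S}}})=:{\varvec{{D}}}\,\mathrm{vech}({\varvec{{S}}}), \]
which establishes the existence of \({\varvec{{D}}}\). From Section 4.2 of Henderson and Searle (1979) we have
\[ \det ({\varvec{{D}}})=\det \left( {\varvec{{H}}}({\varvec{{A}}}\otimes {\varvec{{A}}}){\varvec{{G}}}\right) =\det ({\varvec{{A}}})^{p+1}, \]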
which is nonzero, since \({\varvec{{A}}}\) is nonsingular. \({\varvec{{D}}}\) is hence also nonsingular. In conclusion, we have established the existence and nonsingularity of \({\varvec{{D}}}\) as well as (i) and (iii). Finally, (ii) follows immediately from (i). \(\square \)
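Parts (i) and (iii) of Lemma 1 can be checked numerically. The sketch below assumes that \({\varvec{{D}}}\) takes the form \({\varvec{{H}}}({\varvec{{A}}}\otimes {\varvec{{A}}}){\varvec{{G}}}\), with \({\varvec{{G}}}\) a duplication matrix and \({\varvec{{H}}}\) its Moore–Penrose inverse. The helper names, the simulated data and the random choice of \({\varvec{{A}}}\) and \({\varvec{{b}}}\) are illustrative assumptions.

```python
import numpy as np

def vech(M):
    """Distinct elements of a symmetric matrix (lower triangle, column by column)."""
    M = np.asarray(M)
    return np.concatenate([M[j:, j] for j in range(M.shape[0])])

def duplication_matrix(p):
    """G with G @ vech(S) == vec(S) for symmetric p x p matrices S."""
    G = np.zeros((p * p, p * (p + 1) // 2))
    col = 0
    for j in range(p):
        for i in range(j, p):
            G[j * p + i, col] = 1.0
            G[i * p + j, col] = 1.0
            col += 1
    return G

rng = np.random.default_rng(1)
n, p = 50, 3
X = rng.normal(size=(n, p))           # rows are observations X_1, ..., X_n
A = rng.normal(size=(p, p))           # nonsingular with probability one
b = rng.normal(size=p)
Y = X @ A.T + b                       # rows are A X_k + b

u_X = vech(np.cov(X, rowvar=False))   # sample covariances, divisor n - 1
u_Y = vech(np.cov(Y, rowvar=False))

G = duplication_matrix(p)
H = np.linalg.pinv(G)
D = H @ np.kron(A, A) @ G             # assumed form of D, size q x q

print(np.allclose(u_Y, D @ u_X))                                  # part (i)
print(np.isclose(np.linalg.det(D), np.linalg.det(A) ** (p + 1)))  # part (iii)
```

Part (ii) is not checked separately, since it follows from (i) by the usual rule for the covariance matrix of a linear transformation.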
We now have the tools necessary to tackle Theorem 3.
Proof of Theorem 3
(i) From Theorem 10.2.4 in Mardia et al. (1979) we have that the canonical correlations between the random vectors \({\varvec{{Y}}}\) and \({\varvec{{Z}}}\) are invariant under the nonsingular linear transformations \({\varvec{{AY}}}+{\varvec{{b}}}\) and \({\varvec{{CZ}}}+{\varvec{{d}}}\). Clearly all five statistics are invariant under changes in location, since \({\varvec{{S_{11}}}}\), \({\varvec{{S_{22}}}}\), \({\varvec{{S_{12}}}}\) and \({\varvec{{S_{21}}}}\) all share that invariance property. It therefore suffices to show that the nonsingular linear transformation \({\varvec{{AX}}}\) induces nonsingular linear transformations \({\varvec{{C\bar{X}}}}\) and \({\varvec{{Du}}}\). \({\varvec{{C}}}={\varvec{{A}}}\) is immediate and the existence of \({\varvec{{D}}}\) is given by Lemma 1.
(ii) By part (ii) of Theorem 1, \(\mu _{ijk}=0\) for all \(i,j,k\) implies that \({\varvec{{\Lambda }}}_{12}={\varvec{{0}}}\). But then \({\varvec{{\Lambda _{11}}}}^{-1}{\varvec{{\Lambda _{12}}}}{\varvec{{\Lambda _{22}}}}^{-1}{\varvec{{\Lambda _{21}}}}={\varvec{{0}}}\) and all canonical correlations are 0. If \(\mu _{ijk}\ne 0\) for some \(i,j,k\), then \(\rho (\bar{X}_i,S_{jk})\ne 0\). Thus the linear combinations \({\varvec{{a'\bar{X}}}}=\bar{X}_i\) and \({\varvec{{b'u}}}=S_{jk}\) have nonzero correlation. \(\lambda _1\) must therefore be greater than 0.
(iii) Follows from the fact that the statistics are continuous functions of sample moments that converge almost surely. \(\square \)
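The invariance property used in part (i) can be illustrated on simulated data. The sketch below computes sample canonical correlations as the square roots of the eigenvalues of \({\varvec{{S_{11}}}}^{-1}{\varvec{{S_{12}}}}{\varvec{{S_{22}}}}^{-1}{\varvec{{S_{21}}}}\) and compares them before and after nonsingular affine transformations of both sets of variables. The function name, the data-generating model and the transformation matrices are assumptions made for illustration; the sketch does not compute the test statistics of the paper.

```python
import numpy as np

def canonical_correlations(Y, Z):
    """Sample canonical correlations between the columns of Y and those of Z."""
    p = Y.shape[1]
    S = np.cov(np.hstack([Y, Z]), rowvar=False)
    S11, S12 = S[:p, :p], S[:p, p:]
    S21, S22 = S[p:, :p], S[p:, p:]
    # Eigenvalues of S11^{-1} S12 S22^{-1} S21 are the squared canonical correlations
    M = np.linalg.solve(S11, S12) @ np.linalg.solve(S22, S21)
    eig = np.sort(np.linalg.eigvals(M).real)[::-1]
    return np.sqrt(np.clip(eig, 0.0, None))

rng = np.random.default_rng(2)
n = 200
Y = rng.normal(size=(n, 2))
Z = Y @ rng.normal(size=(2, 3)) + rng.normal(size=(n, 3))   # correlated with Y

A = rng.normal(size=(2, 2))           # nonsingular with probability one
C = rng.normal(size=(3, 3))
b = rng.normal(size=2)
d = rng.normal(size=3)

before = canonical_correlations(Y, Z)
after = canonical_correlations(Y @ A.T + b, Z @ C.T + d)
print(np.allclose(before, after))
```

The comparison is expected to hold up to floating-point error for any nonsingular choice of the two transformation matrices.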
The proofs of parts (ii) and (iii) of Theorem 4 are analogous to those in the previous proof. The proof of part (i) is, however, slightly different, since we do not explicitly give a matrix that yields a nonsingular linear transformation of \({\varvec{{v_X}}}\).
Proof of Theorem 4
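(i) Let the third order central moment of a multivariate random variable \({\varvec{{Z}}}\) be
\[ \bar{m}_3({\varvec{{Z}}})={ E }\left( ({\varvec{{Z}}}-{ E }{\varvec{{Z}}})\left( ({\varvec{{Z}}}-{ E }{\varvec{{Z}}})\otimes ({\varvec{{Z}}}-{ E }{\varvec{{Z}}})\right) '\right) . \]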
Given a sample \({\varvec{{X}}}_1,\ldots ,{\varvec{{X}}}_n\), let \(S_{ijk}=\frac{n}{(n-1)(n-2)}\sum _{r=1}^n(X_{r,i}-\bar{X}_i)(X_{r,j}-\bar{X}_j)(X_{r,k}-\bar{X}_k)\). When the distribution of \({\varvec{{Z}}}\) is the empirical distribution of said sample,
Similarly \(\mathrm{vec}\left( \bar{m}_3({\varvec{{Z}}})\right) \) stacks the elements of \(\bar{m}_3({\varvec{{Z}}})\) in a vector that simply is \(\mathrm{vech}\left( \bar{m}_3({\varvec{{Z}}})\right) \) with a few repetitions:
Thus, for each linear combination \({\varvec{{a'}}}{\varvec{{w_X}}}\) there exists a \({\varvec{{b}}}\) so that \({\varvec{{b'}}}{\varvec{{v_X}}}={\varvec{{a'}}}{\varvec{{w_X}}}\) and therefore, by the definition of canonical correlations, the (sample) canonical correlations between \({\varvec{{\bar{X}}}}\) and \({\varvec{{v_X}}}\) are the same as those between \({\varvec{{\bar{X}}}}\) and \({\varvec{{w_X}}}\).
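Writing \({\varvec{{Y}}}={\varvec{{Z}}}-{ E }{\varvec{{Z}}}\), we have \(\bar{m}_3({\varvec{{Z}}})={ E }\left( {\varvec{{Y}}}({\varvec{{Y}}}\otimes {\varvec{{Y}}})'\right) \) and
\[ \bar{m}_3({\varvec{{AZ}}})={ E }\left( {\varvec{{AY}}}({\varvec{{AY}}}\otimes {\varvec{{AY}}})'\right) ={\varvec{{A}}}\,{ E }\left( {\varvec{{Y}}}({\varvec{{Y}}}\otimes {\varvec{{Y}}})'\right) ({\varvec{{A}}}\otimes {\varvec{{A}}})'={\varvec{{A}}}\,\bar{m}_3({\varvec{{Z}}})\,({\varvec{{A}}}\otimes {\varvec{{A}}})'. \]
Hence
\[ \mathrm{vec}\left( \bar{m}_3({\varvec{{AZ}}})\right) =({\varvec{{A}}}\otimes {\varvec{{A}}}\otimes {\varvec{{A}}})\,\mathrm{vec}\left( \bar{m}_3({\varvec{{Z}}})\right) . \]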
Now, \(\det ({\varvec{{A}}}\otimes {\varvec{{A}}}\otimes {\varvec{{A}}})=\det ({\varvec{{A}}}\otimes {\varvec{{A}}})^p\det ({\varvec{{A}}})^{p^2}=\det ({\varvec{{A}}})^{3p^2}\ne 0\), so \({\varvec{{E}}}:=({\varvec{{A}}}\otimes {\varvec{{A}}}\otimes {\varvec{{A}}})\) is a nonsingular matrix such that \(\mathrm{vec}\left( \bar{m}_3({\varvec{{AZ}}})\right) ={\varvec{{E}}}\,\mathrm{vec}\left( \bar{m}_3({\varvec{{Z}}})\right) \). Since canonical correlations are invariant under nonsingular linear transformations of the two sets of variables, this means that the canonical correlations between \({\varvec{{\bar{X}}}}\) and \({\varvec{{w_X}}}\) remain unchanged under the transformation \({\varvec{{AX}}}+{\varvec{{b}}}\). Thus the canonical correlations between \({\varvec{{\bar{X}}}}\) and \({\varvec{{v_Y}}}\) must also necessarily remain unchanged. This proves the affine invariance of the statistics. \(\square \)
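The transformation rule for third order central moments can be checked on the empirical distribution of a simulated sample. The sketch below uses the representation \(\bar{m}_3({\varvec{{Z}}})={ E }\left( {\varvec{{Y}}}({\varvec{{Y}}}\otimes {\varvec{{Y}}})'\right) \) with expectations replaced by sample averages (divisor \(n\)). The function names, the exponential data and the particular matrix \({\varvec{{A}}}\) are illustrative assumptions.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one vector."""
    return np.asarray(M).reshape(-1, order="F")

def third_central_moment(Z):
    """Empirical E[Y (Y kron Y)'] with Y = Z - mean(Z); a p x p^2 matrix."""
    Y = Z - Z.mean(axis=0)
    n, p = Y.shape
    # Row r of YY is (Y_r kron Y_r)'
    YY = np.einsum("ri,rj->rij", Y, Y).reshape(n, p * p)
    return Y.T @ YY / n

rng = np.random.default_rng(3)
n, p = 100, 3
Z = rng.exponential(size=(n, p))      # skewed data, so third moments are nonzero
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.0, 1.0],
              [1.0, 0.0, -1.0]])      # nonsingular, det(A) = -2
b = np.array([1.0, -2.0, 0.5])

m3_Z = third_central_moment(Z)
m3_AZ = third_central_moment(Z @ A.T + b)
E = np.kron(np.kron(A, A), A)         # A kron A kron A, size p^3 x p^3

print(np.allclose(m3_AZ, A @ m3_Z @ np.kron(A, A).T))
print(np.allclose(vec(m3_AZ), E @ vec(m3_Z)))
print(np.isclose(np.linalg.det(E), np.linalg.det(A) ** (3 * p**2), rtol=1e-8))
```

The location shift \({\varvec{{b}}}\) drops out because the moments are centered, so only \({\varvec{{A}}}\) appears in the transformation.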
Cite this article
Thulin, M. Tests for multivariate normality based on canonical correlations. Stat Methods Appl 23, 189–208 (2014). https://doi.org/10.1007/s10260-013-0252-5
DOI: https://doi.org/10.1007/s10260-013-0252-5