Growth Curve Model with Bilinear Random Coefficients

Abstract

In the present paper, we derive a new multivariate model for correlated data, representing a general model class. Our model extends the Growth Curve model (also called the generalized multivariate analysis of variance model) by additionally assuming randomness of the regression coefficients, as in linear mixed models. Each random coefficient has a linear or a bilinear form with respect to the explanatory variables. In our model, the covariance matrices of the random coefficients are allowed to be singular. This yields flexible covariance structures of the response data, but the parameter space includes a boundary, and thus the maximum likelihood estimators (MLEs) of the unknown parameters have more complicated forms than in the ordinary Growth Curve model. We derive the MLEs in the proposed model by solving an optimization problem, and we derive sufficient conditions for consistency of the MLEs. Through simulation studies, we confirm the performance of the MLEs when the sample size and the dimension of the response variable are large.

References

  1. Demidenko, E. (2004). Mixed Models: Theory and Applications. Wiley Series in Probability and Statistics. Wiley-Interscience, New York.

  2. Filipiak, K. and Klein, D. (2017). Estimation of parameters under a generalized growth curve model. Journal of Multivariate Analysis 158, 73–86.

  3. Fujikoshi, Y. and von Rosen, D. (2000). LR tests for random-coefficient covariance structures in an extended growth curve model. Journal of Multivariate Analysis 75, 245–268.

  4. Imori, S. and von Rosen, D. (2015). Covariance components selection in high-dimensional growth curve model with random coefficients. Journal of Multivariate Analysis 136, 86–94.

  5. Ip, W. C., Wu, M. X., Wang, S. G. and Wong, H. (2007). Estimation for parameters of interest in random effects growth curve models. Journal of Multivariate Analysis 98, 317–327.

  6. Kariya, T. (1985). Testing in the Multivariate General Linear Model. Kinokuniya, Tokyo.

  7. Kshirsagar, A. and Smith, W. (1995). Growth Curves, 145. CRC Press, Boca Raton.

  8. Lange, N. and Laird, N. M. (1989). The effect of covariance structure on variance estimation in balanced growth-curve models with random parameters. Journal of the American Statistical Association 84, 241–247.

  9. Lehmann, E. L. (2004). Elements of Large-Sample Theory. Springer Science & Business Media, Berlin.

  10. Li, Z. (2015). Testing for random effects in growth curve models. Communications in Statistics - Theory and Methods 44, 564–572.

  11. Potthoff, R. F. and Roy, S. N. (1964). A generalized multivariate analysis of variance model useful especially for growth curve problems. Biometrika 51, 313–326.

  12. Reinsel, G. (1982). Multivariate repeated-measurement or growth curve models with multivariate random-effects covariance structure. Journal of the American Statistical Association 77, 190–195.

  13. Rutter, C. M. and Elashoff, R. M. (1994). Analysis of longitudinal data: random coefficient regression modelling. Statistics in Medicine 13, 1211–1231.

  14. Schott, J. R. (1985). Multivariate maximum likelihood estimators for the mixed linear model. Sankhyā: The Indian Journal of Statistics, Series B, 179–185.

  15. Siotani, M., Fujikoshi, Y. and Hayakawa, T. (1985). Modern Multivariate Statistical Analysis: A Graduate Course and Handbook. American Sciences Press.

  16. Srivastava, M. S., von Rosen, T. and von Rosen, D. (2009). Estimation and testing in general multivariate linear models with Kronecker product covariance structure. Sankhyā: The Indian Journal of Statistics, Series A, 137–163.

  17. Volaufova, J. and Lamotte, L. R. (2013). A simulation comparison of approximate tests for fixed effects in random coefficients growth curve models. Communications in Statistics - Simulation and Computation 42, 344–359.

  18. von Rosen, D. (1991). The growth curve model: a review. Communications in Statistics - Theory and Methods 20, 2791–2822.

  19. von Rosen, D. (2018). Bilinear Regression Analysis: An Introduction, 220. Springer, Berlin.

  20. Yokoyama, T. (2005). Efficiency of the MLE in a multivariate parallel profile model with random effects. Hiroshima Mathematical Journal 35, 2, 197–203.

Author information

Corresponding author

Correspondence to Shinpei Imori.

Additional information

The authors thank the reviewers for their constructive criticism which has improved the work. Shinpei Imori is supported in part by JSPS KAKENHI Grant Number JP17K12650 and “Funds for the Development of Human Resources in Science and Technology” under MEXT, through the “Home for Innovative Researchers and Academic Knowledge Users (HIRAKU)” consortium. Dietrich von Rosen is supported by the Swedish Research Council (2017-03003). Ryoya Oda is supported by Research Fellowship for Young Scientists from the Japan Society for the Promotion of Science.

Appendices

Appendix A. Transformation From General Form to Canonical Form

We present one way of transforming the Growth Curve model to a canonical form. Let \(P_{A}\) and \(P_{C}\) be defined by

$$ P_{A}=(A(A^{\top} A)^{-1/2}, Q_{A}), ~~ P_{C}=(C^{\top}(CC^{\top})^{-1/2}, Q_{C}), $$

where \(Q_{A}^{\top } A=O_{p-q, q}\), \(Q_{A}^{\top } Q_{A}=I_{p-q}\), \(CQ_{C}=O_{k, n-k}\) and \(Q_{C}^{\top } Q_{C}=I_{n-k}\) are fulfilled. Thus,

$$ P_{A}^{\top} A=((A^{\top} A)^{1/2}, O_{q, p-q})^{\top}, ~~ CP_{C}=((CC^{\top})^{1/2}, O_{k, n-k}). $$

Moreover, it follows that \(P_{A}^{\top } P_{A}=I_{p}\), \(P_{C}^{\top } P_{C}=I_{n}\) and

$$ \begin{array}{@{}rcl@{}} P_{A}^{\top} D^{2}P_{A}&=&(I_{q}, O_{q, p-q})^{\top}(A^{\top} A)^{1/2}{\varSigma}(A^{\top} A)^{1/2}(I_{q}, O_{q, p-q}) +\delta^{2}I_{p}, \\ P_{C}^{\top} F^{2}P_{C}&=&(I_{k}, O_{k, n-k})^{\top} (CC^{\top})^{1/2}{\varPsi}(CC^{\top})^{1/2}(I_{k}, O_{k, n-k}) +I_{n}. \end{array} $$

Using \(P_{A}\) and \(P_{C}\), it follows that

$$ \tilde{X}:=P_{A}^{\top} X P_{C}=P_{A}^{\top}(ABC+DEF)P_{C}= \tilde{A}\tilde{B}\tilde{C} + \tilde{D}E\tilde{F}, $$

where \(\tilde {A}=(I_{q}, O_{q, p-q})^{\top }\), \(\tilde {B}=(A^{\top } A)^{1/2}B(CC^{\top })^{1/2}\), \(\tilde {C}=(I_{k}, O_{k, n-k})\), \(\tilde {D}^{2}=\tilde {A}\tilde {{\varSigma }}\tilde {A}^{\top } + \delta ^{2}I_{p}\), \(\tilde {F}^{2}=\tilde {C}^{\top }\tilde {{\varPsi }}\tilde {C} + I_{n}\), \(\tilde {{\varSigma }}=(A^{\top } A)^{1/2}{\varSigma }(A^{\top } A)^{1/2}\) and \(\tilde {{\varPsi }}=(CC^{\top })^{1/2}{\varPsi }(CC^{\top })^{1/2}\). Hence, by replacing X, A, B, C, Σ and Ψ by \(\tilde {X}\), \(\tilde {A}\), \(\tilde {B}\), \(\tilde {C}\), \(\tilde {{\varSigma }}\) and \(\tilde {{\varPsi }}\), respectively, we obtain the canonical form in Eq. 7.
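As a numerical sanity check of this construction (illustrative only; the design matrices below are randomly generated assumptions, not taken from the paper), \(P_{A}\) and \(P_{C}\) can be built with NumPy and verified to satisfy the displayed identities:

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, k, n = 5, 2, 3, 8
A = rng.standard_normal((p, q))   # within-individual design (full column rank a.s.)
C = rng.standard_normal((k, n))   # between-individual design (full row rank a.s.)

def sym_inv_sqrt(M):
    """Symmetric inverse square root M^{-1/2} via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(w ** -0.5) @ V.T

# Q_A: orthonormal basis of the orthogonal complement of col(A),
# taken from the last p - q columns of a complete QR decomposition.
Q_A = np.linalg.qr(A, mode="complete")[0][:, q:]
P_A = np.hstack([A @ sym_inv_sqrt(A.T @ A), Q_A])

Q_C = np.linalg.qr(C.T, mode="complete")[0][:, k:]
P_C = np.hstack([C.T @ sym_inv_sqrt(C @ C.T), Q_C])

# P_A, P_C are orthogonal; P_A' A = ((A'A)^{1/2}, O)' and C P_C = ((CC')^{1/2}, O).
assert np.allclose(P_A.T @ P_A, np.eye(p))
assert np.allclose(P_C.T @ P_C, np.eye(n))
assert np.allclose((P_A.T @ A)[q:, :], 0.0)
assert np.allclose((C @ P_C)[:, k:], 0.0)
```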

Appendix B. Proof of Theorem 1

Proof.

Since the fourth term on the right-hand side of Eq. 10, \(\delta ^{2}\text {tr}\{{\varPsi }_{*}^{-1}(X_{11}-B){\varSigma }_{*}^{-1}(X_{11}-B)^{\top }\}\), is non-negative, the optimal value of B is given by \(\hat {B}=X_{11}\). On the other hand, it follows from Schott (1985) that the maximum of \(\ell (\hat {B}, {\varSigma }_{*}, {\varPsi }_{*}, \delta ^{2}|X)\) is attained when \({\varSigma }_{*}\) and \({\varPsi }_{*}\) are such that \(Q_{12}{\varSigma }_{*}Q_{12}^{\top }\) and \(Q_{21}{\varPsi }_{*}Q_{21}^{\top }\) are diagonal matrices. Hence, we re-parametrize \(Q_{12}{\varSigma }_{*}Q_{12}^{\top }=\text {diag}({\sigma _{1}^{2}}, \ldots , {\sigma _{q}^{2}})\) and \(Q_{21}{\varPsi }_{*}Q_{21}^{\top }=\text {diag}({\psi _{1}^{2}}, \ldots , {\psi _{k}^{2}})\). Moreover, the assumptions \({\varSigma }_{*}-\delta ^{2}I_{q} \geq 0\) and \({\varPsi }_{*}-\delta ^{2}I_{k} \geq 0\) imply \({\sigma _{s}^{2}} \geq \delta ^{2}\) and \({\psi _{t}^{2}} \geq \delta ^{2}\), respectively. Therefore, instead of minimizing Eq. 10, we solve

$$ \begin{array}{@{}rcl@{}} \min\limits_{\tilde{\sigma}_{s}^{2}, \tilde{\psi}_{t}^{2}, \tilde{\delta}^{2}} &-&(np-nq-kp)\log{\tilde{\delta}^{2}}+\tilde{\delta}^{2}\text{tr}(X_{22}X_{22}^{\top})\\ &&+\sum\limits_{s=1}^{q}\{-n\log\tilde{\sigma}_{s}^{2} + \lambda_{s}^{(12)}\tilde{\sigma}_{s}^{2}\}\\ &&+\sum\limits_{t=1}^{k}\{-p\log\tilde{\psi}_{t}^{2}+\lambda_{t}^{(21)}\tilde{\psi}_{t}^{2}\}. \end{array} $$
(B.1)
$$ \begin{array}{@{}rcl@{}} \text{subject~to} && \tilde{\sigma}_{s}^{2} - \tilde{\delta}^{2}\leq 0, ~~ s=1, \ldots, q, \\ &&\tilde{\psi}_{t}^{2} - \tilde{\delta}^{2}\leq 0, ~~ t=1, \ldots, k, \end{array} $$
(B.2)

where \(\tilde {\delta }^{2}=\delta ^{-2}\), \(\tilde {\sigma }_{s}^{2} = \sigma _{s}^{-2}\) and \(\tilde {\psi }_{t}^{2} = \psi _{t}^{-2}\). We note that the optimization problem consisting of Eqs. B.1 and B.2 is convex because np − nq − kp is assumed to be positive. Hence, the Karush–Kuhn–Tucker (KKT) conditions are necessary and sufficient for optimality in Eq. B.1 subject to Eq. B.2. Using KKT multipliers λσ,s ≥ 0 and λψ,t ≥ 0, along with Eq. 16, we obtain the following KKT conditions:

$$ \begin{array}{@{}rcl@{}} &-&(np-nq-kp)\tilde{\delta}^{-2}+\text{tr}(X_{22}X_{22}^{\top})-\sum\limits_{s=1}^{q}\lambda_{\sigma, s}-\sum\limits_{t=1}^{k}\lambda_{\psi, t}=0, \\ &-&n\tilde{\sigma}_{s}^{-2}+\lambda_{s}^{(12)} + \lambda_{\sigma, s}=0, ~~ s=1, \ldots, q, \\ &-&p\tilde{\psi}_{t}^{-2}+\lambda_{t}^{(21)}+\lambda_{\psi, t}=0, ~~ t=1, \ldots, k, \\ &&\lambda_{\sigma, s}(\tilde{\sigma}_{s}^{2} - \tilde{\delta}^{2})=0, ~~ s=1, \ldots, q, \\ &&\lambda_{\psi, t}(\tilde{\psi}_{t}^{2} - \tilde{\delta}^{2})=0, ~~ t=1, \ldots, k. \end{array} $$

Let two index sets Iσ and Iψ be defined by

$$ I_{\sigma} = \{1 \leq s \leq q | \lambda_{\sigma, s} = 0\}, ~~ I_{\psi} = \{1 \leq t \leq k | \lambda_{\psi, t} = 0\}. $$

The 2nd–5th KKT conditions show that

$$ \begin{array}{@{}rcl@{}} \tilde{\sigma}_{s}^{-2}&=&\left\{\begin{array}{ll} \tilde{\delta}^{-2}, & \lambda_{\sigma, s} \neq 0 \Leftrightarrow s \not\in I_{\sigma}, \\ \lambda_{s}^{(12)}/n, & \lambda_{\sigma, s} = 0 \Leftrightarrow s \in I_{\sigma}, \end{array} \right. \\ \tilde{\psi}_{t}^{-2}&=&\left\{\begin{array}{ll} \tilde{\delta}^{-2}, & \lambda_{\psi, t} \neq 0 \Leftrightarrow t \not\in I_{\psi}, \\ \lambda_{t}^{(21)}/p, & \lambda_{\psi, t} = 0 \Leftrightarrow t \in I_{\psi}. \end{array}\right. \end{array} $$
(B.3)

Moreover, it follows from Eq. B.3 and the 1st KKT condition that

$$ \begin{array}{@{}rcl@{}} 0&=&-(np-nq-kp)\tilde{\delta}^{-2}+\text{tr}(X_{22}X_{22}^{\top})-{\sum}_{s=1}^{q}\lambda_{\sigma, s}-{\sum}_{t=1}^{k}\lambda_{\psi, t}\\ &=&-(np-nq-kp)\tilde{\delta}^{-2}+\text{tr}(X_{22}X_{22}^{\top})-{\sum}_{s \not\in I_{\sigma}}\lambda_{\sigma, s}-{\sum}_{t \not\in I_{\psi}}\lambda_{\psi, t}\\ &=&-(np-nq-kp)\tilde{\delta}^{-2}+\text{tr}(X_{22}X_{22}^{\top})-{\sum}_{s \not\in I_{\sigma}}\{n\tilde{\delta}^{-2}-\lambda_{s}^{(12)}\}-{\sum}_{t \not\in I_{\psi}}\{p\tilde{\delta}^{-2}-\lambda_{t}^{(21)}\}\\ &=&-(pn-n\#I_{\sigma}-p\#I_{\psi})\tilde{\delta}^{-2}+\text{tr}(X_{22}X_{22}^{\top})+{\sum}_{s \not\in I_{\sigma}}\lambda_{s}^{(12)}+{\sum}_{t \not\in I_{\psi}}\lambda_{t}^{(21)}, \end{array} $$

where #Iσ and #Iψ are the number of elements in the sets Iσ and Iψ, respectively. Hence, given Iσ and Iψ, estimators of \({\sigma _{s}^{2}}\), \({\psi _{t}^{2}}\) and δ2 equal

$$ \begin{array}{@{}rcl@{}} \hat{\delta}^{2}(I_{\sigma}, I_{\psi})&=&\frac{\text{tr}(X_{22}X_{22}^{\top})+{\sum}_{s \not\in I_{\sigma}}\lambda_{s}^{(12)}+{\sum}_{t \not\in I_{\psi}}\lambda_{t}^{(21)}}{np-n\#I_{\sigma}-p\#I_{\psi}}, \\ \hat{\sigma}_{s}^{2}(I_{\sigma}, I_{\psi})&=&\left\{\begin{array}{ll} \hat{\delta}^{2}(I_{\sigma}, I_{\psi}), & s \not\in I_{\sigma}, \\ \lambda_{s}^{(12)}/n, & s \in I_{\sigma}, \end{array} \right. \\ \hat{\psi}_{t}^{2}(I_{\sigma}, I_{\psi})&=&\left\{\begin{array}{ll} \hat{\delta}^{2}(I_{\sigma}, I_{\psi}), & t \not\in I_{\psi}, \\ \lambda_{t}^{(21)}/p, & t \in I_{\psi}. \end{array}\right. \end{array} $$
(B.4)

Because of Eq. B.2, they have to satisfy that \(\hat {\delta }^{2}(I_{\sigma }, I_{\psi }) \leq \hat {\sigma }_{s}^{2}(I_{\sigma }, I_{\psi })\) and \(\hat {\delta }^{2}(I_{\sigma }, I_{\psi }) \leq \hat {\psi }_{t}^{2}(I_{\sigma }, I_{\psi })\) for all sIσ and tIψ. Note that if Iσ and Iψ are full sets, then the MLEs of δ2, Σ and Ψ are given by \(\hat {\delta }^{2}(I_{\sigma }, I_{\psi })=\text {tr}(X_{22}X_{22}^{\top })/(np-nq-kp)\), \(\hat {{\varSigma }}_{*}(I_{\sigma }, I_{\psi }) = X_{12}X_{12}^{\top }/n\) and \(\hat {{\varPsi }}_{*}(I_{\sigma }, I_{\psi }) = X_{21}^{\top } X_{21}/p\), respectively.
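For concreteness, the estimators in this display can be sketched as a short Python function; the function and variable names are ours, and the eigenvalues \(\lambda_{s}^{(12)}\), \(\lambda_{t}^{(21)}\) are assumed to be supplied in decreasing order:

```python
def growth_mles(lam12, lam21, tr_x22, n, p, q_sigma, k_psi):
    """Estimators of delta^2, sigma_s^2 and psi_t^2 for the nested index sets
    I_sigma = {1, ..., q_sigma} and I_psi = {1, ..., k_psi}.
    lam12, lam21: eigenvalues lambda_s^(12), lambda_t^(21) in decreasing order;
    tr_x22: tr(X22 X22')."""
    q, k = len(lam12), len(lam21)
    # delta^2-hat: residual trace plus the eigenvalues outside the index sets,
    # divided by np - n #I_sigma - p #I_psi.
    delta2 = (tr_x22 + sum(lam12[q_sigma:]) + sum(lam21[k_psi:])) \
             / (n * p - n * q_sigma - p * k_psi)
    sigma2 = [lam12[s] / n if s < q_sigma else delta2 for s in range(q)]
    psi2 = [lam21[t] / p if t < k_psi else delta2 for t in range(k)]
    return delta2, sigma2, psi2
```

With `q_sigma = q` and `k_psi = k` (full index sets), the function reduces to the closed forms noted above.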

Next, we optimize the index sets Iσ and Iψ. Let \(\ell(I_{\sigma}, I_{\psi})\) be the maximized log-likelihood given Iσ and Iψ, where

$$ -2\ell(I_{\sigma}, I_{\psi}) \propto (pn-n\#I_{\sigma}-p\#I_{\psi})\log\hat{\delta}^{2}(I_{\sigma}, I_{\psi})+n{\sum}_{s \in I_{\sigma}}\log\frac{\lambda_{s}^{(12)}}{n} +p{\sum}_{t \in I_{\psi}}\log\frac{\lambda_{t}^{(21)}}{p}. $$
(B.5)

At first, we fix Iψ and compare the index sets \(I_{\sigma }^{(1)}=I_{\sigma }^{(0)} \cup \{s_{1}\}\) and \(I_{\sigma }^{(2)} =I_{\sigma }^{(0)}\cup \{s_{2}\}\), where \(I_{\sigma }^{(0)} \subsetneq \{1, \ldots , q\}\) and \(s_{1}, s_{2} \not \in I_{\sigma }^{(0)}\) with s1 < s2. Since s1 < s2, it holds that \(\lambda _{s_{1}}^{(12)} \geq \lambda _{s_{2}}^{(12)}\). Thus, it also holds from Eq. B.4 that \(\hat {\delta }^{2}(I_{\sigma }^{(1)}, I_{\psi }) \leq \hat {\delta }^{2}(I_{\sigma }^{(2)}, I_{\psi })\) and \(\hat {\sigma }_{s_{2}}^{2}(I_{\sigma }^{(2)}, I_{\psi }) \leq \hat {\sigma }_{s_{1}}^{2}(I_{\sigma }^{(1)}, I_{\psi })\). Suppose that the estimators based on \(I_{\sigma }^{(2)}\) satisfy (B.2), i.e., \(\hat {\delta }^{2}(I_{\sigma }^{(2)}, I_{\psi }) \leq \hat {\sigma }_{s}^{2}(I_{\sigma }^{(2)}, I_{\psi })\) for all \(s \in I_{\sigma }^{(2)}\). Then, it follows that

$$ \hat{\delta}^{2}(I_{\sigma}^{(1)}, I_{\psi}) \leq \hat{\delta}^{2}(I_{\sigma}^{(2)}, I_{\psi}) \leq \hat{\sigma}_{s_{2}}^{2}(I_{\sigma}^{(2)}, I_{\psi}) \leq \hat{\sigma}_{s_{1}}^{2}(I_{\sigma}^{(1)}, I_{\psi}). $$
(B.6)

Hence, \(\hat {\delta }^{2}(I_{\sigma }^{(1)}, I_{\psi }) \leq \hat {\sigma }_{s}^{2}(I_{\sigma }^{(1)}, I_{\psi })\) is established for all \(s \in I_{\sigma }^{(1)}\), that is, the estimators based on \(I_{\sigma }^{(1)}\) satisfy (B.2).

Now we want to show that \(-2\ell (I_{\sigma }^{(1)}, I_{\psi }) \leq -2\ell (I_{\sigma }^{(2)}, I_{\psi })\), which indicates that \(I_{\sigma }^{(1)}\) is better than \(I_{\sigma }^{(2)}\) in terms of the likelihood function. To simplify notation, we denote

$$ \begin{array}{@{}rcl@{}} K&=&\text{tr}(X_{22}X_{22}^{\top})+\sum\limits_{s \not\in I_{\sigma}^{(0)}}\lambda_{s}^{(12)}+\sum\limits_{t \not\in I_{\psi}}\lambda_{t}^{(21)}, \\ L&=&n\sum\limits_{s \in I_{\sigma}^{(0)}}\log\frac{\lambda_{s}^{(12)}}{n} +p\sum\limits_{t \in I_{\psi}}\log\frac{\lambda_{t}^{(21)}}{p}, \\ M&=&np-n\#I_{\sigma}^{(0)}-p\#I_{\psi}. \end{array} $$

Here, we use the inequality that for a,b,c,d > 0,

$$ \frac{b}{a} \leq \frac{d}{c} \Rightarrow \frac{b}{a} \leq \frac{b+d}{a+c} \leq \frac{d}{c}. $$
(B.7)
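Eq. B.7 is the classical mediant inequality. A minimal check in Python (purely illustrative):

```python
from fractions import Fraction

def mediant_bounds(a, b, c, d):
    """Verifies b/a <= (b+d)/(a+c) <= d/c for positive a, b, c, d
    with b/a <= d/c, i.e., the mediant inequality of Eq. B.7."""
    lo, mid, hi = Fraction(b, a), Fraction(b + d, a + c), Fraction(d, c)
    assert lo <= hi, "precondition b/a <= d/c must hold"
    return lo <= mid <= hi

assert mediant_bounds(a=7, b=3, c=2, d=5)   # 3/7 <= 8/9 <= 5/2
```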

Let a = M − n, \(b=K-\lambda _{s_{2}}^{(12)}\), c = n and \(d=\lambda _{s_{2}}^{(12)}\). Because \(b/a=\hat {\delta }^{2}(I_{\sigma }^{(2)}, I_{\psi }) \leq \hat {\sigma }_{s_{2}}^{2}(I_{\sigma }^{(2)}, I_{\psi })=\lambda _{s_{2}}^{(12)}/n=d/c\) is established from Eq. B.6, it follows from Eq. B.7 that

$$ \frac{K}{M}=\frac{b+d}{a+c} \leq \frac{d}{c} = \frac{\lambda_{s_{2}}^{(12)}}{n} \leq \frac{\lambda_{s_{1}}^{(12)}}{n}, $$
(B.8)

where the last inequality follows from the assumption that s1 < s2. For x ∈ (0,K/n), denote

$$ f(x)=(M-n)\log\frac{K-nx}{M-n}+n\log{x}+L, $$

which satisfies \(f(\lambda _{s_{m}}^{(12)}/n) = -2\ell (I_{\sigma }^{(m)}, I_{\psi })\) for m = 1, 2. It is easy to show that f is concave with critical point x = K/M, which implies that f is monotonically decreasing for x ≥ K/M. This and Eq. B.8 show that \(-2\ell (I_{\sigma }^{(1)}, I_{\psi }) \leq -2\ell (I_{\sigma }^{(2)}, I_{\psi })\) when the estimators based on \(I_{\sigma }^{(2)}\) satisfy (B.2).

Hence, there exists a non-negative integer qσ such that the optimum set for Iσ is obtained by \(\hat {I}_{\sigma }\equiv \{1, \ldots , q_{\sigma }\}\). In a similar way, we can show that an optimal index set of Iψ can be expressed by \(\hat {I}_{\psi }\equiv \{1, \ldots , k_{\psi }\}\), where kψ is a non-negative integer. Note that qσ = 0 and kψ = 0 imply \(\hat {I}_{\sigma }\) and \(\hat {I}_{\psi }\) are empty, respectively. Thus, the proof is completed. □
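By the argument above, only the nested candidate sets \(\{1, \ldots, q_{\sigma}\}\) and \(\{1, \ldots, k_{\psi}\}\) need to be compared. A brute-force sketch of this search in Python, assuming the eigenvalues and \(\text{tr}(X_{22}X_{22}^{\top})\) have already been computed (function and variable names are ours, not the paper's):

```python
from math import log

def neg2_loglik(lam12, lam21, tr_x22, n, p, q_sigma, k_psi):
    """-2 x profile log-likelihood of Eq. B.5 (up to an additive constant)
    for the nested sets I_sigma = {1..q_sigma}, I_psi = {1..k_psi}."""
    M = n * p - n * q_sigma - p * k_psi
    delta2 = (tr_x22 + sum(lam12[q_sigma:]) + sum(lam21[k_psi:])) / M
    val = M * log(delta2)
    val += n * sum(log(l / n) for l in lam12[:q_sigma])
    val += p * sum(log(l / p) for l in lam21[:k_psi])
    return val, delta2

def feasible(lam12, lam21, delta2, n, p, q_sigma, k_psi):
    """Constraint (B.2): lambda_s^(12)/n >= delta2 on I_sigma and
    lambda_t^(21)/p >= delta2 on I_psi."""
    return all(l / n >= delta2 for l in lam12[:q_sigma]) and \
           all(l / p >= delta2 for l in lam21[:k_psi])

def search(lam12, lam21, tr_x22, n, p):
    """Scan all nested candidate pairs and keep the feasible minimizer."""
    best = None
    for qs in range(len(lam12) + 1):
        for kp in range(len(lam21) + 1):
            val, d2 = neg2_loglik(lam12, lam21, tr_x22, n, p, qs, kp)
            if feasible(lam12, lam21, d2, n, p, qs, kp):
                if best is None or val < best[0]:
                    best = (val, qs, kp)
    return best  # (criterion value, q_sigma, k_psi)
```

The quadratic scan is cheap here since q and k are small; the empty sets (qs = kp = 0) are always feasible, so a minimizer always exists.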

Appendix C. Proof of Theorem 2

Proof

Let \(I_{\sigma }^{(0)}=\{1, \ldots , q_{\sigma }\}\), \(I_{\sigma }^{(1)}=\{1, \ldots , q_{\sigma }+1\}\), \(I_{\psi }^{(0)}=\{1, \ldots , k_{\psi }\}\) and \(I_{\psi }^{(1)}=\{1, \ldots , k_{\psi }+1\}\). At first, we show that \(-2\ell (I_{\sigma }^{(1)}, I_{\psi }^{(0)}) \leq -2\ell (I_{\sigma }^{(0)}, I_{\psi }^{(0)})\) when \((q_{\sigma }+1, k_{\psi }) \in {\mathscr{L}}\). Denoting K, L and M as in Appendix B, we can see that

$$ -2\ell(I_{\sigma}^{(0)}, I_{\psi}^{(0)}) +2\ell(I_{\sigma}^{(1)}, I_{\psi}^{(0)}) = M\log\frac{K}{M} - (M -n)\log\frac{K-\lambda_{q_{\sigma}+1}^{(12)}}{M-n} -n\log\frac{\lambda_{q_{\sigma}+1}^{(12)}}{n}. $$

Note that since \((q_{\sigma }+1, k_{\psi })\) belongs to \({\mathscr{L}}\), it holds that \(\{K-\lambda _{q_{\sigma }+1}^{(12)}\}/(M-n) = \hat {\delta }^{2}(I_{\sigma }^{(1)}, I_{\psi }^{(0)}) \leq \hat {\sigma }^{2}_{q_{\sigma }+1}(I_{\sigma }^{(1)}, I_{\psi }^{(0)}) = \lambda _{q_{\sigma }+1}^{(12)}/n\). From Eq. B.7 with a = M − n, \(b=K-\lambda _{q_{\sigma }+1}^{(12)}\), c = n and \(d=\lambda _{q_{\sigma }+1}^{(12)}\), it follows that \(K/M = (b+d)/(a+c) \leq d/c = \lambda _{q_{\sigma }+1}^{(12)}/n < K/n\). Here, for x ∈ (0, K/n), denote

$$ g(x)=M\log\frac{K}{M} - (M-n)\log\frac{K - nx}{M-n}-n\log{x}. $$

Because g is convex and its critical point is given by K/M, we can see that

$$ -2\ell(I_{\sigma}^{(0)}, I_{\psi}^{(0)}) +2\ell(I_{\sigma}^{(1)}, I_{\psi}^{(0)}) = g(\lambda_{q_{\sigma}+1}^{(12)}/n) \geq g(K/M) = 0. $$

Likewise, it can be seen that \(-2\ell (I_{\sigma }^{(0)}, I_{\psi }^{(1)}) \leq -2\ell (I_{\sigma }^{(0)}, I_{\psi }^{(0)})\) when \((q_{\sigma }, k_{\psi }+1) \in {\mathscr{L}}\). Hence, the proof is completed. □

Appendix D. Proofs of Lemmas 1 and 2

Proof

Firstly, we prepare the following lemma to show Lemma 1.

Lemma 3.

For any a ≥ 0 and symmetric matrices \(A_{1}, A_{2} \in \mathbb {R}^{m \times m}\) such that A1 is positive definite and A2 is non-negative definite, it holds that for all \(\ell \in \{1, \ldots , m\}\),

$$ \{a+\lambda_{\ell}(A_{2})\}\lambda_{m}(A_{1}) \leq \lambda_{\ell}(A_{1}(A_{2} + aI_{m})) \leq \{a+\lambda_{\ell}(A_{2})\}\lambda_{1}(A_{1}). $$

It follows from some linear algebra (see, e.g., Siotani et al. 1985) that for all \(\ell \in \{0, \ldots , m-1\}\),

$$ \begin{array}{@{}rcl@{}} \lambda_{\ell+1}(A_{1}(A_{2} + aI_{m})) &=& \lambda_{\ell+1}(A_{1}^{1/2}(A_{2} + aI_{m})A_{1}^{1/2}) \\ &=& \inf_{F_{\ell}}\sup_{F_{\ell}^{\top} x=0_{\ell}} \frac{x^{\top} A_{1}^{1/2}(A_{2} + aI_{m})A_{1}^{1/2} x}{x^{\top} x} \\ &=& \inf_{F_{\ell}}\sup_{F_{\ell}^{\top} y=0_{\ell}} \frac{y^{\top} (A_{2} + aI_{m})y}{y^{\top} A_{1}^{-1} y}, \end{array} $$

where \(F_{\ell } \in \mathbb {R}^{m \times \ell }\). Because the eigenvalues of \(A_{1}^{-1}\) are the inverses of the eigenvalues of A1, which is positive definite, for all \(y \in \mathbb {R}^{m}\),

$$ \begin{array}{@{}rcl@{}} \frac{y^{\top} (A_{2} + aI_{m})y}{y^{\top} A_{1}^{-1} y} &\leq& \lambda_{m}(A_{1}^{-1})^{-1}\frac{y^{\top} (A_{2} + aI_{m})y}{y^{\top} y} = \lambda_{1}(A_{1})\frac{y^{\top} (A_{2} + aI_{m})y}{y^{\top} y}, \\ \frac{y^{\top} (A_{2} + aI_{m})y}{y^{\top} A_{1}^{-1} y} &\geq& \lambda_{1}(A_{1}^{-1})^{-1}\frac{y^{\top} (A_{2} + aI_{m})y}{y^{\top} y} = \lambda_{m}(A_{1})\frac{y^{\top} (A_{2} + aI_{m})y}{y^{\top} y}. \end{array} $$

Note that

$$ \inf_{F_{\ell}}\sup_{F_{\ell}^{\top} y=0_{\ell}} \frac{y^{\top} (A_{2} + aI_{m})y}{y^{\top} y} = \lambda_{\ell+1}(A_{2} + aI_{m})=a+\lambda_{\ell+1}(A_{2}). $$

Hence, the proof is completed. □
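Lemma 3 also lends itself to a quick numerical check (an illustrative sketch with randomly generated matrices; not part of the proof):

```python
import numpy as np

rng = np.random.default_rng(1)
m, a = 4, 0.5
B = rng.standard_normal((m, m))
A1 = B @ B.T + np.eye(m)            # positive definite
G = rng.standard_normal((m, 2))
A2 = G @ G.T                        # non-negative definite (rank 2)

def eig_desc(M):
    """Eigenvalues in decreasing order, matching the lambda_l notation."""
    return np.sort(np.linalg.eigvals(M).real)[::-1]

lam_prod = eig_desc(A1 @ (A2 + a * np.eye(m)))
lam_A1, lam_A2 = eig_desc(A1), eig_desc(A2)

# {a + lambda_l(A2)} lambda_m(A1) <= lambda_l(A1(A2 + a I)) <= {a + lambda_l(A2)} lambda_1(A1)
for l in range(m):
    assert (a + lam_A2[l]) * lam_A1[-1] <= lam_prod[l] + 1e-8
    assert lam_prod[l] <= (a + lam_A2[l]) * lam_A1[0] + 1e-8
```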

Proof of Lemma 1

Proof

At first, we show \(Pr((q_{\sigma }^{*}, k_{\psi }^{*}) \in {\mathscr{L}}) \rightarrow 1\) for all asymptotic frameworks (i)–(iii) of the lemma, that is,

$$ Pr(\lambda_{q_{\sigma}^{*}}^{(12)}/n \geq \hat{\delta}^{2}(q_{\sigma}^{*}, k_{\psi}^{*}), \lambda_{k_{\psi}^{*}}^{(21)}/p \geq \hat{\delta}^{2}(q_{\sigma}^{*}, k_{\psi}^{*})) \rightarrow 1. $$

We first show that for all \(q_{\sigma } \geq q_{\sigma }^{*}\) and \(k_{\psi } \geq k_{\psi }^{*}\), \(\hat {\delta }^{2}(q_{\sigma }, k_{\psi })\) converges to δ2 in probability. By the transformation in Appendix A, we can see that

$$ \tilde{A}\tilde{B}\tilde{C}=\begin{pmatrix} \tilde{B} & O\\ O & O \end{pmatrix}, ~~ \tilde{D}^{2}=\begin{pmatrix} \tilde{{\varSigma}}+\delta^{2}I_{q} & O_{q, p-q}\\ O_{p-q, q} & \delta^{2}I_{p-q} \end{pmatrix}, ~~ \tilde{F}^{2}=\begin{pmatrix} \tilde{{\varPsi}}+I_{k} & O_{k, n-k}\\ O_{n-k, k} & I_{n-k} \end{pmatrix}. $$

Splitting the error matrix E into four blocks conformably with X, i.e.,

$$ E = \begin{pmatrix} E_{11} & E_{12}\\ E_{21} & E_{22} \end{pmatrix}: ~~ \begin{array}{cc} q \times k & q \times (n-k) \\ (p-q) \times k & (p-q) \times (n-k) \end{array} $$

then

$$ \tilde{X} =\begin{pmatrix} \tilde{B} + (\tilde{{\varSigma}}+\delta^{2}I_{q})^{1/2}E_{11}(\tilde{{\varPsi}}+I_{k})^{1/2}& (\tilde{{\varSigma}}+\delta^{2}I_{q})^{1/2}E_{12}\\ \delta E_{21}(\tilde{{\varPsi}}+I_{k})^{1/2} & \delta E_{22} \end{pmatrix}, $$

where \(\tilde {X}=\tilde {A}\tilde {B}\tilde {C}+\tilde {D}E\tilde {F}\). Hence, X12 and X21 can be expressed as

$$ X_{12} = (\tilde{{\varSigma}}+\delta^{2}I_{q})^{1/2}E_{12}, ~~ X_{21} = \delta E_{21}(\tilde{{\varPsi}}+I_{k})^{1/2}, $$

with \(E_{12} \sim N_{q, n-k}(O_{q, n-k}, I_{q}, I_{n-k})\) and \(E_{21} \sim N_{p-q, k}(O_{p-q, k}, I_{p-q}, I_{k})\). Note that \(\tilde {{\varSigma }}=(A^{\top } A)^{1/2}{\varSigma }(A^{\top } A)^{1/2}\) and \(\tilde {{\varPsi }}=(CC^{\top })^{1/2}{\varPsi }(CC^{\top })^{1/2}\). Define

$$ W_{12}=E_{12}E_{12}^{\top} \sim W_{q}(I_{q}, n-k), ~~ W_{21}=E_{21}^{\top} E_{21} \sim W_{k}(I_{k}, p-q). $$
(D.1)

Then, \(\lambda _{q_{\sigma }}^{(12)}\) and \(\lambda _{k_{\psi }}^{(21)}\) can be expressed as follows:

$$ \begin{array}{@{}rcl@{}} \lambda_{q_{\sigma}}^{(12)}&=&\lambda_{q_{\sigma}}(X_{12} X_{12}^{\top})=\lambda_{q_{\sigma}}((\tilde{{\varSigma}}+\delta^{2}I_{q})W_{12}), \\ \lambda_{k_{\psi}}^{(21)}&=&\lambda_{k_{\psi}}(X_{21} X_{21}^{\top})=\delta^{2}\lambda_{k_{\psi}}((\tilde{{\varPsi}}+I_{k})W_{21}). \end{array} $$
(D.2)

Here, let us fix \(q_{\sigma } > q_{\sigma }^{*}\) and \(k_{\psi } > k_{\psi }^{*}\). Because \(\lambda _{q_{\sigma }}({\varSigma })=\lambda _{k_{\psi }}({\varPsi })=0\), Lemma 3 yields that \(\lambda _{q_{\sigma }}(\tilde {{\varSigma }})=\lambda _{k_{\psi }}(\tilde {{\varPsi }})=0\). Combining this result, Eq. D.2 and Lemma 3, we have

$$ \begin{array}{@{}rcl@{}} \delta^{2}\lambda_{q}(W_{12}) &\leq& \lambda_{q_{\sigma}}^{(12)} \leq \delta^{2}\lambda_{1}(W_{12}), \\ \delta^{2}\lambda_{k}(W_{21}) &\leq& \lambda_{k_{\psi}}^{(21)} \leq \delta^{2}\lambda_{1}(W_{21}). \end{array} $$
(D.3)

Because \(W_{12}/n \overset {p}{\rightarrow } I_{q}\) as \(n \rightarrow \infty \) and \(W_{21}/p \overset {p}{\rightarrow } I_{k}\) as \(p \rightarrow \infty \), we can see that λ1(W12)/n, \(\lambda _{q}(W_{12})/n \overset {p}{\rightarrow } 1\) and λ1(W21)/p, \(\lambda _{k}(W_{21})/p \overset {p}{\rightarrow } 1\) when n and p go to infinity, respectively. Therefore, Eq. D.3 indicates that

$$ \begin{array}{@{}rcl@{}} &&\lambda_{q_{\sigma}}^{(12)}/n \overset{p}{\rightarrow} \delta^{2} ~~\text{for}~\text{(i), (ii)}, ~ \lambda_{q_{\sigma}}^{(12)}/n =O_{p}(1) ~~\text{for}~\text{(iii)}, \\ &&\lambda_{k_{\psi}}^{(21)}/p \overset{p}{\rightarrow} \delta^{2} ~~\text{for}~\text{(i), (iii)}, ~ \lambda_{k_{\psi}}^{(21)}/p =O_{p}(1) ~~\text{for}~\text{(ii)}. \end{array} $$

On the other hand, \(\text {tr}(X_{22}X_{22}^{\top })/\delta ^{2} \sim \chi ^{2}_{(n-k)(p-q)}\) and \(\chi ^{2}_{(n-k)(p-q)}/(n-k)(p-q) \overset {p}{\rightarrow } 1\) for (i)–(iii). Hence, it holds that

$$ \hat{\delta}^{2}(q_{\sigma}, k_{\psi})=\frac{\text{tr}(X_{22}X_{22}^{\top})+{\sum}_{s > q_{\sigma}}\lambda_{s}^{(12)}+{\sum}_{t > k_{\psi}}\lambda_{t}^{(21)}}{np-nq_{\sigma}-k_{\psi} p} \overset{p}{\rightarrow} \delta^{2} $$
(D.4)

for (i)–(iii) when \(q_{\sigma } \geq q_{\sigma }^{*}\) and \(k_{\psi } \geq k_{\psi }^{*}\).
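The concentration used here, namely that the extreme eigenvalues of \(W_{12}/n\) tend to 1, can be illustrated by a small Monte Carlo sketch (the sizes below are arbitrary assumptions):

```python
import numpy as np

# W ~ W_q(I_q, n - k): the extreme eigenvalues of W/n concentrate around 1
# as n grows, which is what drives Eq. D.4.
rng = np.random.default_rng(2)
q, k, n = 3, 2, 20000
E = rng.standard_normal((q, n - k))
W = E @ E.T                       # Wishart W_q(I_q, n - k)
lams = np.linalg.eigvalsh(W) / n  # ascending order
assert abs(lams[0] - 1) < 0.1 and abs(lams[-1] - 1) < 0.1
```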

Next, we obtain the lower bounds of \(\lambda _{q_{\sigma }^{*}}^{(12)}\) and \(\lambda _{k_{\psi }^{*}}^{(21)}\). Using Lemma 3 and the condition (13), we have

$$ \begin{array}{@{}rcl@{}} \lambda_{q_{\sigma}^{*}}(\tilde{{\varSigma}}+\delta^{2}I_{q}) &\geq& \lambda_{q}(A^{\top} A)\lambda_{q_{\sigma}^{*}}({\varSigma}) + \delta^{2} \geq p\zeta_{a}\lambda_{q_{\sigma}^{*}}({\varSigma})+\delta^{2}, \\ \lambda_{k_{\psi}^{*}}(\tilde{{\varPsi}}+I_{k})&\geq& \lambda_{k}(CC^{\top})\lambda_{k_{\psi}^{*}}({\varPsi})+1 \geq n\zeta_{c}\lambda_{k_{\psi}^{*}}({\varPsi})+1 . \end{array} $$

By applying this evaluation and Lemma 3 into Eq. D.2, it follows that

$$ \begin{array}{@{}rcl@{}} \lambda_{q_{\sigma}^{*}}^{(12)} &\geq& \lambda_{q_{\sigma}^{*}}(\tilde{{\varSigma}}+\delta^{2}I_{q})\lambda_{q}(W_{12}) \geq \{p\zeta_{a}\lambda_{q_{\sigma}^{*}}({\varSigma})+\delta^{2}\}\lambda_{q}(W_{12}), \\ \lambda_{k_{\psi}^{*}}^{(21)} &\geq& \delta^{2}\lambda_{k_{\psi}^{*}}(\tilde{{\varPsi}}+I_{k})\lambda_{k}(W_{21}) \geq \delta^{2}\{n\zeta_{c}\lambda_{k_{\psi}^{*}}({\varPsi})+1\}\lambda_{k}(W_{21}). \end{array} $$
(D.5)

If \(\lambda _{q}(W_{12})/n \geq \{\zeta _{a}\lambda _{q_{\sigma }^{*}}({\varSigma })/2+\delta ^{2}\}/\{p\zeta _{a}\lambda _{q_{\sigma }^{*}}({\varSigma })+\delta ^{2}\}\), then it follows from Eq. D.5 that \(\lambda _{q_{\sigma }^{*}}^{(12)}/n \geq \zeta _{a}\lambda _{q_{\sigma }^{*}}({\varSigma })/2+\delta ^{2}\). Here, we consider the case when \(n \rightarrow \infty \), i.e., (i) and (ii). Because \(\{\zeta _{a}\lambda _{q_{\sigma }^{*}}({\varSigma })/2+\delta ^{2}\}/\{p\zeta _{a}\lambda _{q_{\sigma }^{*}}({\varSigma })+\delta ^{2}\} \leq \{1 + \zeta _{a}\lambda _{q_{\sigma }^{*}}({\varSigma })/(2\delta ^{2})\}/\{1 + \zeta _{a}\lambda _{q_{\sigma }^{*}}({\varSigma })/\delta ^{2}\} < 1\) and \(\lambda _{q}(W_{12})/n \overset {p}{\rightarrow } 1\), it holds that for (i) and (ii),

$$ Pr(\lambda_{q}(W_{12})/n \geq \{\zeta_{a}\lambda_{q_{\sigma}^{*}}({\varSigma})/2+\delta^{2}\}/\{p\zeta_{a}\lambda_{q_{\sigma}^{*}}({\varSigma})+\delta^{2}\}) \rightarrow 1. $$

Hence, we can see that

$$ Pr(\lambda_{q_{\sigma}^{*}}^{(12)}/n \geq \zeta_{a}\lambda_{q_{\sigma}^{*}}({\varSigma})/2+\delta^{2}) \rightarrow 1, $$
(D.6)

for (i) and (ii). In the case (iii), because \(\lambda _{q_{\sigma }^{*}}({\varSigma }) > 0\),

$$ Pr(\lambda_{q}(W_{12})/n \geq \{\zeta_{a}\lambda_{q_{\sigma}^{*}}({\varSigma})/2+\delta^{2}\}/\{p\zeta_{a}\lambda_{q_{\sigma}^{*}}({\varSigma})+\delta^{2}\}) \rightarrow 1. $$

Hence (D.6) holds for (iii). In a similar way, for (i)–(iii), it follows that

$$ Pr(\lambda_{k_{\psi}^{*}}^{(21)}/p \geq \delta^{2}\{\zeta_{c}\lambda_{k_{\psi}^{*}}({\varPsi})/2 +1\}) \rightarrow 1. $$
(D.7)

Combining (D.4), (D.6) and (D.7), it can be seen that \(Pr((q_{\sigma }^{*}, k_{\psi }^{*}) \in {\mathscr{L}}) \rightarrow 1\) for (i)–(iii). Thus, for all \(q_{\sigma } < q_{\sigma }^{*}\) and \(k_{\psi } < k_{\psi }^{*}\), it holds that \(Pr((q_{\sigma }, k_{\psi }) \not \in {\mathscr{L}}_{R}) \rightarrow 1\) under (i)–(iii).

For the proof of the lemma, it suffices to show that (qσ,kψ), which satisfies \(q_{\sigma } \geq q_{\sigma }^{*}, k_{\psi } < k_{\psi }^{*}\) or \(q_{\sigma } < q_{\sigma }^{*}, k_{\psi } \geq k_{\psi }^{*}\), does not belong to \({\mathscr{L}}_{R}\) with a probability tending to 1. Because of symmetry, we only consider the case \(q_{\sigma } \geq q_{\sigma }^{*}\) and \(k_{\psi } < k_{\psi }^{*}\). Suppose that there exists \((q_{\sigma }, k_{\psi }) \in {\mathscr{L}}\) such that \(q_{\sigma } \geq q_{\sigma }^{*}\) and \(k_{\psi } < k_{\psi }^{*}\). Then, from the definition of \({\mathscr{L}}\), it follows that

$$ \hat{\delta}^{2}(q_{\sigma}, k_{\psi}) \leq \frac{\lambda_{q_{\sigma}}^{(12)}}{n}, ~~\hat{\delta}^{2}(q_{\sigma}, k_{\psi}) \leq \frac{\lambda_{k_{\psi}}^{(21)}}{p}. $$

On the other hand, because \(k_{\psi } < k_{\psi }^{*}\) is assumed, if \((q_{\sigma }, k_{\psi }^{*}) \in {\mathscr{L}}\), that is,

$$ \hat{\delta}^{2}(q_{\sigma}, k_{\psi}^{*}) \leq \frac{\lambda_{q_{\sigma}}^{(12)}}{n}, ~~\hat{\delta}^{2}(q_{\sigma}, k_{\psi}^{*}) \leq \frac{\lambda_{k_{\psi}^{*}}^{(21)}}{p}, $$

then \((q_{\sigma }, k_{\psi }) \not \in {\mathscr{L}}_{R}\). From the result in Eq. D.4, \(\hat {\delta }^{2}(q_{\sigma }, k_{\psi }^{*}) \overset {p}{\rightarrow } \delta ^{2}\). This and Eq. D.7 indicate that

$$ Pr(\hat{\delta}^{2}(q_{\sigma}, k_{\psi}^{*}) \leq \lambda_{k_{\psi}^{*}}^{(21)}/p) \rightarrow 1 $$
(D.8)

for (i)–(iii). Moreover, it follows from Eq. B.7 with \(a=np-nq_{\sigma }-k_{\psi }^{*} p\), \(b=\text {tr}(X_{22}X_{22}^{\top })+{\sum }_{s > q_{\sigma }}\lambda _{s}^{(12)}+{\sum }_{t > k_{\psi }^{*}}\lambda _{t}^{(21)}\), \(c=(k_{\psi }^{*}-k_{\psi })p\) and \(d={\sum }_{k_{\psi } < t \leq k_{\psi }^{*}} \lambda _{t}^{(21)}\) that

$$ \begin{array}{@{}rcl@{}} &&\frac{b}{a} = \hat{\delta}^{2}(q_{\sigma}, k_{\psi}^{*}) \leq \frac{\lambda_{k_{\psi}^{*}}^{(21)}}{p} \leq \frac{{\sum}_{k_{\psi} < t \leq k_{\psi}^{*}} \lambda_{t}^{(21)}}{(k_{\psi}^{*}-k_{\psi})p} = \frac{d}{c}\\ &&\Rightarrow \hat{\delta}^{2}(q_{\sigma}, k_{\psi}^{*}) = \frac{b}{a} \leq \frac{b+d}{a+c} = \hat{\delta}^{2}(q_{\sigma}, k_{\psi}). \end{array} $$

Hence, Eq. D.8 implies that \(Pr(\hat {\delta }^{2}(q_{\sigma }, k_{\psi }^{*}) \leq \hat {\delta }^{2}(q_{\sigma }, k_{\psi })) \rightarrow 1\) for (i)–(iii). Since the assumption \((q_{\sigma }, k_{\psi }) \in {\mathscr{L}}\) indicates that \(\hat {\delta }^{2}(q_{\sigma }, k_{\psi }) \leq \lambda _{q_{\sigma }}^{(12)}/n\), the inequality \(\hat {\delta }^{2}(q_{\sigma }, k_{\psi }^{*}) \leq \hat {\delta }^{2}(q_{\sigma }, k_{\psi })\) yields \(\hat {\delta }^{2}(q_{\sigma }, k_{\psi }^{*}) \leq \lambda _{q_{\sigma }}^{(12)}/n\). From Eq. D.8, it follows that

$$ Pr(\hat{\delta}^{2}(q_{\sigma}, k_{\psi}^{*}) \leq \lambda_{q_{\sigma}}^{(12)}/n) \rightarrow 1 $$
(D.9)

for (i)–(iii). It is established from Eqs. D.8 and D.9 that

$$ Pr(q_{\sigma} \geq q_{\sigma}^{*}, k_{\psi} < k_{\psi}^{*} \Rightarrow (q_{\sigma}, k_{\psi}) \not\in \mathcal{L}_{R}) \rightarrow 1, $$

for (i)–(iii). Thus, the proof is completed. □

Proof of Lemma 2

Proof

Fix \(q_{\sigma } \geq q_{\sigma }^{*}\) and \(k_{\psi } \geq k_{\psi }^{*}\). Recall that

$$ \begin{array}{@{}rcl@{}} \hat{{\varSigma}}(q_{\sigma}, k_{\psi})&=&(A^{\top} A)^{-1/2}\{\hat{{\varSigma}}_{*}(q_{\sigma}, k_{\psi})-\hat{\delta}^{2}(q_{\sigma}, k_{\psi})I_{q}\}(A^{\top} A)^{-1/2}, \\ \hat{{\varSigma}}_{*}(q_{\sigma}, k_{\psi})&=&Q_{12}^{\top}\text{diag}\{\lambda_{1}^{(12)}/n, \ldots, \lambda_{q_{\sigma}}^{(12)}/n, \hat{\delta}^{2}(q_{\sigma}, k_{\psi}), \ldots, \hat{\delta}^{2}(q_{\sigma}, k_{\psi})\}Q_{12}, \end{array} $$

where Q12 is the orthogonal matrix that diagonalizes \(n\hat {{\varSigma }}_{*}(q, k)=(\tilde {{\varSigma }}+\delta ^{2}I_{q})^{1/2}W_{12}(\tilde {{\varSigma }}+\delta ^{2}I_{q})^{1/2} = Q_{12}^{\top }\text {diag}\{\lambda _{1}^{(12)}, \ldots , \lambda _{q}^{(12)}\}Q_{12}\). Note that \(\tilde {{\varSigma }}=(A^{\top } A)^{1/2}{\varSigma } (A^{\top } A)^{1/2}\) and \(W_{12} \sim W_{q}(I_{q}, n-k)\) is defined in Eq. D.1. Here, \(\hat {{\varSigma }}(q_{\sigma }, k_{\psi })-{\varSigma }\) is evaluated by

$$ \begin{array}{@{}rcl@{}} \hat{{\varSigma}}(q_{\sigma}, k_{\psi})-{\varSigma}&=&(A^{\top} A)^{-1/2}\hat{{\varSigma}}_{*}(q_{\sigma}, k_{\psi})(A^{\top} A)^{-1/2} - \hat{\delta}^{2}(q_{\sigma}, k_{\psi})(A^{\top} A)^{-1} - {\varSigma} \\ &=&(A^{\top} A)^{-1/2}\{\hat{{\varSigma}}_{*}(q_{\sigma}, k_{\psi})-\hat{{\varSigma}}_{*}(q, k)\}(A^{\top} A)^{-1/2}\\ &&+ (A^{\top} A)^{-1/2}\hat{{\varSigma}}_{*}(q, k)(A^{\top} A)^{-1/2}-{\varSigma}-\delta^{2}(A^{\top} A)^{-1}\\ &&+ \{\delta^{2}-\hat{\delta}^{2}(q_{\sigma}, k_{\psi})\}(A^{\top} A)^{-1}. \end{array} $$

Therefore, an upper bound of \(\|\hat {{\varSigma }}(q_{\sigma }, k_{\psi })-{\varSigma }\|_{2}\) can be obtained as follows:

$$ \begin{array}{@{}rcl@{}} \|\hat{{\varSigma}}(q_{\sigma}, k_{\psi})-{\varSigma}\|_{2} &\leq& \max_{q_{\sigma} < s \leq q}|\hat{\delta}^{2}(q_{\sigma}, k_{\psi})-\lambda_{s}^{(12)}/n|\|(A^{\top} A)^{-1}\|_{2} \\ && +\|(A^{\top} A)^{-1/2}\hat{{\varSigma}}_{*}(q, k)(A^{\top} A)^{-1/2} - {\varSigma} - \delta^{2}(A^{\top} A)^{-1}\|_{2}\\ &&+ |\delta^{2}-\hat{\delta}^{2}(q_{\sigma}, k_{\psi})|\|(A^{\top} A)^{-1}\|_{2}. \end{array} $$

For (i) and (ii), the limits \(\hat {\delta }^{2}(q_{\sigma }, k_{\psi }) \overset {p}{\rightarrow } \delta ^{2}\) (\(q_{\sigma } \geq q_{\sigma }^{*}\), \(k_{\psi } \geq k_{\psi }^{*}\)) and \(\lambda _{s}^{(12)}/n \overset {p}{\rightarrow } \delta ^{2}\) (\(s > q_{\sigma }^{*}\)) have been shown in the proof of Lemma 1. Moreover, because \(W_{12}/n \overset {p}{\rightarrow } I_{q}\), it holds that

$$ \begin{array}{@{}rcl@{}} &&\|(A^{\top} A)^{-1/2}\hat{{\varSigma}}_{*}(q, k)(A^{\top} A)^{-1/2} - {\varSigma} - \delta^{2}(A^{\top} A)^{-1}\|_{2} \\ &&\leq \|{\varSigma} + \delta^{2}(A^{\top} A)^{-1}\|_{2}\|W_{12}/n-I_{q}\|_{2} \overset{p}{\rightarrow} 0. \end{array} $$

Hence, for (i) and (ii), \(\|\hat {{\varSigma }}(q_{\sigma }, k_{\psi })-{\varSigma }\|_{2} \overset {p}{\rightarrow } 0\).
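The law-of-large-numbers step \(W_{12}/n \overset {p}{\rightarrow } I_{q}\) used above can be checked numerically. The following minimal sketch (illustrative only, not part of the paper) draws \(W \sim W_{q}(I_{q}, n)\) as a Gaussian cross-product and tracks the spectral-norm error as n grows:

```python
import numpy as np

rng = np.random.default_rng(0)
q = 5
errors = []
for n in (100, 1_000, 10_000):
    X = rng.standard_normal((n, q))
    W = X.T @ X                                   # W ~ W_q(I_q, n)
    errors.append(np.linalg.norm(W / n - np.eye(q), 2))
print(errors)  # the spectral-norm error shrinks roughly like sqrt(q/n)
```

The error is of order \(\sqrt{q/n}\), consistent with the operator-norm bound above vanishing in probability.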

In a similar way, we can verify the convergence of \(\hat {{\varPsi }}(q_{\sigma }, k_{\psi })\) to Ψ for (i) and (iii). Hence, the proof is completed. □
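For concreteness, the eigenvalue-truncation construction of \(\hat {{\varSigma }}(q_{\sigma }, k_{\psi })\) recalled at the start of the proof can be sketched numerically. This is an illustrative implementation only: the input M (playing the role of \(n\hat {{\varSigma }}_{*}\) before truncation) and the pooled choice of \(\hat {\delta }^{2}\) as the average of the trailing eigenvalues are assumptions made here for the sketch, not the paper's exact optimization problem.

```python
import numpy as np

def sigma_hat(A, M, n, q_sigma):
    """Illustrative truncated-eigenvalue estimator of Sigma.

    A: p x q design matrix (full column rank assumed).
    M: q x q symmetric PSD matrix playing the role of n * Sigma_hat_*,
       with eigenvalues lambda_1 >= ... >= lambda_q.
    """
    q = M.shape[0]
    w, V = np.linalg.eigh(M)                 # ascending eigenvalues
    w, V = w[::-1], V[:, ::-1]               # reorder to descending
    # pool the trailing q - q_sigma eigenvalues into delta_hat^2
    # (illustrative choice, standing in for the paper's optimization)
    delta2 = w[q_sigma:].mean() / n
    d = np.concatenate([w[:q_sigma] / n, np.full(q - q_sigma, delta2)])
    sigma_star = V @ np.diag(d) @ V.T        # Sigma_hat_*(q_sigma, k_psi)
    # symmetric square root of (A^T A)^{-1}
    wg, Vg = np.linalg.eigh(A.T @ A)
    R = Vg @ np.diag(wg ** -0.5) @ Vg.T
    return R @ (sigma_star - delta2 * np.eye(q)) @ R, delta2

# toy example: the returned estimate is PSD with rank at most q_sigma
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 4))
X = rng.standard_normal((200, 4))
Sigma, delta2 = sigma_hat(A, X.T @ X, n=200, q_sigma=2)
```

Because \(\hat {\delta }^{2}\) never exceeds \(\lambda _{q_{\sigma }}^{(12)}/n\), the inner matrix \(\hat {{\varSigma }}_{*}-\hat {\delta }^{2}I_{q}\) is positive semidefinite of rank at most \(q_{\sigma }\), reflecting the possibly singular covariance of the random coefficients.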

Appendix E. Figures for the Simulation Results

Figure 1. MSE transition of \(\hat {B}\)

Figure 2. MSE transition of \(\hat {{\varSigma }}\)

Figure 3. MSE transition of \(\hat {{\varPsi }}\)

Figure 4. MSE transition of \(\hat {\delta }^{2}\)

In this section, we display the MSEs of \(\hat {B}\), \(\hat {{\varSigma }}\), \(\hat {{\varPsi }}\) and \(\hat {\delta }^{2}\) graphically for clarity; the MSEs are those reported in Tables 1, 2 and 3 of the simulation study in Section 5. Figures 1, 2, 3 and 4 show the MSEs of \(\hat {B}\), \(\hat {{\varSigma }}\), \(\hat {{\varPsi }}\) and \(\hat {\delta }^{2}\), respectively, for cases 1–3 with p = n and p = 20. In all figures, the solid lines correspond to p = n, whereas the dashed lines correspond to p = 20. Note that several lines in Figs. 2 and 4 overlap, which indicates that the value of Ψ does not greatly affect the estimation of Σ and δ2. The figures show that, as n tends to infinity, \(\hat {B}\) converges to B except in case 1, \(\hat {{\varPsi }}\) converges to Ψ except in case 1 with p = 20, and \(\hat {{\varSigma }}\) and \(\hat {\delta }^{2}\) converge to Σ and δ2, respectively, in all cases. These graphs complement the numerical results of the simulation study.
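The MSE curves in these figures are Monte Carlo averages of squared estimation errors over replications. As a rough illustration of how such an MSE transition is computed (using the plain sample covariance as a stand-in for the paper's MLE, purely for this sketch), one could write:

```python
import numpy as np

def mse_of_cov(n, q=4, reps=200, seed=0):
    """Monte Carlo MSE (squared Frobenius error) of the sample covariance."""
    rng = np.random.default_rng(seed)
    sigma = np.eye(q)                        # true covariance (toy choice)
    errs = []
    for _ in range(reps):
        Y = rng.standard_normal((n, q))      # rows ~ N(0, sigma)
        S = np.cov(Y, rowvar=False)
        errs.append(np.sum((S - sigma) ** 2))
    return np.mean(errs)

# the MSE decreases roughly like 1/n as the sample size grows
print([round(mse_of_cov(n), 4) for n in (50, 200, 800)])
```

Replacing the sample covariance by the paper's estimators \(\hat {B}\), \(\hat {{\varSigma }}\), \(\hat {{\varPsi }}\) and \(\hat {\delta }^{2}\) yields the transitions plotted in Figs. 1–4.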

Cite this article

Imori, S., von Rosen, D. & Oda, R. Growth Curve Model with Bilinear Random Coefficients. Sankhya A (2020). https://doi.org/10.1007/s13171-020-00204-5

Keywords and phrases

  • Consistency
  • Growth curve model
  • Maximum likelihood estimators
  • Random coefficients

AMS (2000) subject classification

  • Primary 62H12; Secondary 62J05