Abstract
Regression models for multinomial responses with time-dependent covariates have recently been studied in both longitudinal and time series setups. Given its practical importance, in this paper we focus on a longitudinal multinomial response model with two categorical covariates, and study their main and interaction effects after accommodating a lag 1 dynamic relationship between past and present multinomial responses. The proposed model can easily be generalized to accommodate multiple (more than two) categorical covariates and their interactions. For the estimation of the regression and dynamic dependence parameters, we follow a recent parameter dimension-split based approach suggested by Sutradhar (Sankhya A 80, 301–329, 2018), but unlike the conditional method of moments (CMM) used in that study, we use a more efficient estimation approach, namely the so-called conditional generalized quasi-likelihood (CGQL) method, for the estimation of the dynamic dependence parameters. The regression parameters are also estimated by the same CGQL approach, where responses become independent conditional on the past responses; this is similar in principle to likelihood estimation, where the likelihood function is formed as a product of transitional probabilities conditional on the past responses. The asymptotic properties of the CGQL estimators are provided in detail. The higher efficiency of the CGQL approach over the CMM approach is also demonstrated, in particular for the estimation of the dynamic dependence parameters.
References
Amemiya, T. (1985). Advanced econometrics. Harvard University Press, Cambridge.
Bishop, Y. M. M., Fienberg, S. E. and Holland, P. W. (1975). Discrete multivariate analysis: Theory and practice. The MIT Press, Cambridge.
Fokianos, K. and Kedem, B. (2003). Regression theory for categorical time series. Statistical Science 18, 357–376.
Fokianos, K. and Kedem, B. (2004). Partial likelihood inference for time series following generalized linear models. Journal of Time Series Analysis 25, 173–197.
Kaufmann, H. (1987). Regression models for nonstationary categorical time series: Asymptotic estimation theory. The Annals of Statistics 15, 79–98.
Loredo-Osti, J. C. and Sutradhar, B. C. (2012). Estimation of regression and dynamic dependence parameters for non-stationary multinomial time series. Journal of Time Series Analysis 33, 458–467.
Mallick, T. S. and Sutradhar, B. C. (2008). GQL versus conditional GQL inferences for non-stationary time series of counts with overdispersion. Journal of Time Series Analysis 29, 402–420.
McDonald, D. R. (2005). The local limit theorem: a historical perspective. Journal of Iranian Statistical Society 4, 73–86.
Sutradhar, B. C. (2003). An overview on regression models for discrete longitudinal responses. Statistical Science 18, 377–393.
Sutradhar, B. C. (2011). Dynamic mixed models for familial longitudinal data. Springer, New York.
Sutradhar, B. C. (2014). Longitudinal categorical data analysis. Springer, New York.
Sutradhar, B. C. (2018). A parameter dimension-split based asymptotic regression estimation theory for a multinomial panel data model. Sankhya A 80, 301–329.
Wedderburn, R. W. M. (1974). Quasi-likelihood functions, generalized linear models, and the Gauss-Newton method. Biometrika 61, 439–447.
Acknowledgments
This research was supported partially by an NSERC grant. The authors would like to thank two reviewers for their comments and suggestions that led to the improvement of the paper.
Appendices
Appendix A: Computational Aids for the Estimation of γ and θ
Formula for the derivative matrix \(\frac {\partial \eta ^{\prime }_{[\ell ]t|t-1}(\theta ,\gamma )} {\partial \gamma }\) in Eq. 3.7 under Section 3.1.1:
The (J − 1)² × (J − 1) derivative matrix \(\frac {\partial \eta ^{\prime }_{[\ell ]t|t-1}(\theta ,\gamma )} {\partial \gamma }\) has the computational formula
where
as in Eq. 2.10, and \(\eta ^{*}_{([\ell ]t|t-1),M}(\theta ,\gamma )\) denotes the matrix constructed by using the (J − 1)-dimensional column vectors \(\left [\eta ^{(j)}_{[\ell ]t|t-1}(\delta _{[\ell ](t-1)j}-\eta _{[\ell ]t|t-1})\right ]\) for all j = 1, …, J − 1.
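The column vectors \(\eta ^{(j)}(\delta _{j}-\eta )\) above have the familiar multinomial-logit Jacobian structure: for probabilities p = softmax(z), one has ∂p_j/∂z_k = p_j(δ_{jk} − p_k). A minimal numerical check of this identity (a generic sketch with hypothetical values, not tied to the paper's data or notation):

```python
import math

def softmax(z):
    # multinomial-logit probabilities (stabilized by subtracting the max)
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def analytic_jacobian(p):
    # J[j][k] = dp_j/dz_k = p_j * (delta_jk - p_k)
    n = len(p)
    return [[p[j] * ((1.0 if j == k else 0.0) - p[k]) for k in range(n)]
            for j in range(n)]

def numeric_jacobian(z, h=1e-6):
    # central finite differences in each coordinate of z
    n = len(z)
    out = [[0.0] * n for _ in range(n)]
    for k in range(n):
        zp = list(z); zp[k] += h
        zm = list(z); zm[k] -= h
        pp, pm = softmax(zp), softmax(zm)
        for j in range(n):
            out[j][k] = (pp[j] - pm[j]) / (2 * h)
    return out

z = [0.3, -0.5, 1.1]  # hypothetical linear predictors
p = softmax(z)
A = analytic_jacobian(p)
N = numeric_jacobian(z)
err = max(abs(A[j][k] - N[j][k]) for j in range(3) for k in range(3))
print(err)  # agreement up to finite-difference error
```

The same p_j(δ_{jk} − p_k) pattern is what produces the \(\eta ^{(j)}(\delta _{j}-\eta )\) columns in the formula above.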
Derivation:
Because by Eq. 2.10
it then follows that
Next because \(\gamma =(\gamma ^{\prime }_{1},\ldots ,\gamma ^{\prime }_{j},\ldots ,\gamma ^{\prime }_{J-1})'\), one obtains
Equation 4.11 then follows from Eq. 4.13.
Formula for the derivative matrix \(\frac {\partial \pi ^{\prime }_{[\ell ](1)}(\theta )}{\partial \theta }\) in Eq. 3.22 under Section 3.2:
Recall from Section 2.1 that
where, by Eq. 2.10, we write
with the x[ℓ]j defined through Eqs. 2.11–2.14, and the (J − 1)(p1 + 1)(p2 + 1)-dimensional parameter vector of regression and interaction effects defined by Eq. 2.8. This formulation provides a computationally easy derivative for each element in Eq. 4.14. To be specific, the derivative for the j-th element has the formula
The desired derivative matrix \(\frac {\partial \pi ^{\prime }_{[\ell ](1)}(\theta )}{\partial \theta }: (J-1)(p_{1}+1)(p_{2}+1) \times (J-1)\) follows by using Eqs. 4.15 and 4.14.
Formula for the derivative matrix \(\frac {\partial \eta ^{\prime }_{[\ell ]t|t-1}(\theta ,\hat {\gamma }(\theta ))} {\partial \theta }\) in Eq. 3.22 under Section 3.2:
First, we write
where, by replacing γj in Eq. 2.10 with \(\hat {\gamma }_{j}(\theta )\) for all j = 1, …, J − 1, one writes the formula for the transitional probability as
It then follows that
leading to the desired derivative matrix in Eq.Β 4.16, provided the derivative \(\frac {\partial \hat {\gamma }^{\prime }_{j}(\theta )}{\partial \theta }\) is known. We compute this latter derivative below.
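Since the displayed formulas are not reproduced here, it may help to record the generic implicit-function identity underlying such a derivative. Writing g(θ, γ) for the left-hand side of an estimating equation solved by \(\hat{\gamma}(\theta)\) (a sketch only; the paper's Eq. 4.4 plays the role of g = 0):

```latex
% If g(\theta, \hat{\gamma}(\theta)) = 0 identically in \theta,
% total differentiation with respect to \theta gives
\frac{\partial g}{\partial \theta'}
  + \frac{\partial g}{\partial \gamma'}\,
    \frac{\partial \hat{\gamma}(\theta)}{\partial \theta'} = 0
\quad\Longrightarrow\quad
\frac{\partial \hat{\gamma}(\theta)}{\partial \theta'}
  = -\left[\frac{\partial g}{\partial \gamma'}\right]^{-1}
     \frac{\partial g}{\partial \theta'}.
```

The approximation used in the next subsection corresponds to evaluating these derivatives with θ held at its value from the previous iteration.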
Computation of the derivative matrix \(\frac {\partial \hat {\gamma }^{\prime }_{j}(\theta )}{\partial \theta }\) for j = 1, …, J − 1
Recall from Eq. 4.4 that
Because the estimation is done iteratively, under the assumption that the θ parameter involved in the derivatives and in the covariance matrix is known from the previous iteration, we can obtain an approximation for the desired derivative, as
where the formula for the derivative \(\frac {\partial \eta _{[\ell ]}}{\partial \theta ^{\prime }}\) follows from the derivative vector
with
where
Appendix B: Derivations of the Asymptotic Normality of the CGQL Estimators of γ and θ
Consistency of \(\hat {\gamma }_{CGQL}(\theta )\)
Suppose that the following two assumptions hold.
Assumption B1.
For ℓ = 1, …, (p1 + 1)(p2 + 1), minℓ K[ℓ] → ∞; in certain cases ℓ can be 1, implying that the total number of individuals in the study is K = K1, which should be large. In general, this assumption guarantees that \(K={\sum }_{\ell }K_{[\ell ]} \rightarrow \infty \).
Assumption B2.
maxℓ |Q[ℓ](θ, γ)| < M, M being a finite quantity. That is, the Q[ℓ](·); ℓ = 1, … in Eq. 4.4 are bounded and finite matrices.
Now because
by using the aforementioned two assumptions (Assumptions B1 and B2) and the standard central limit theorem (Bishop et al. 1975, Theorem 14.4-1, p. 476), one obtains the limiting results shown in Eqs. 4.5 and 4.6 in Section 4.1. This proves the consistency of the estimator \(\hat {\gamma }_{CGQL}(\theta )\).
Asymptotic normality of \(\hat {\gamma }_{CGQL}(\theta )\)
To prove the asymptotic normality of this CGQL estimator, we define
which has the expectation vector and covariance matrix as
because \(E[Y^{*}_{i\in \ell }-\eta _{[\ell ]}]=0\), and \(\text {cov}[Y^{*}_{i\in \ell }-\eta _{[\ell ]}]={\Sigma }^{*}_{[\ell ],D}(\theta ,\gamma )\). Consequently, by applying (4.24) and (4.25), one may re-express the estimating equation in Eq. 4.4 as
where \(Z_{K}=\left [\text {cov}\{\bar {f}(\theta ,\gamma )\}\right ]^{-\frac {1}{2}}\bar {f}(\theta ,\gamma )\), with \(\bar {f}(\theta ,\gamma )=\frac {1}{K}{\sum }^{(p_{1}+1)(p_{2}+1)}_{\ell =1} {\sum }^{K_{[\ell ]}}_{i=1}f_{i\in \ell }(\theta ,\gamma )\) as in Eq. 4.24.
Now suppose that the following Assumption B3 holds.
Assumption B3.
The \(f_{i\in \ell }(\cdot )\) in (B1) (see also Eq. 4.2) satisfy the Lindeberg condition, that is,
for all ε > 0, g(·) being the probability distribution of \(f_{i\in \ell }\). [For a proof of the Lindeberg condition in the context of categorical time series data, see Kaufmann (1987, pp. 89, 93).]
Because Assumptions B1 and B3 hold, it follows from the Lindeberg-Feller central limit theorem (Amemiya 1985, Theorem 3.3.6; McDonald 2005, Theorem 2.2) that ZK in Eq. 4.26 has the limiting distribution \((g^{*}_{K}(\cdot ))\):
Applying (4.28) to (4.26), we then obtain the limiting distribution of \(\hat {\gamma }_{CGQL}(\theta )\), say \(g^{*}_{K}(\hat {\gamma }_{CGQL}(\theta ))\), as K → ∞, as
This shows that \(\hat {\gamma }_{CGQL}(\theta )\) has the limiting normal distribution.
Consistency and asymptotic normality of \(\hat {\theta }_{CGQL}\)
Recall from Eq. 3.22 that the estimating function in the left hand side of the CGQL estimating equation for π may be expressed as
where \(\hat {\gamma }(\theta )\) satisfies the CGQL iterative equation for γ given in Eq. 4.4, that is
Notice that the computation of the estimating function in Eq.Β 4.30 requires the formula for the derivative \(\frac {\partial \hat {\gamma }(\theta )}{\partial \theta ^{\prime }}\), which can be obtained from Eq.Β 4.31. This formula is derived in AppendixΒ A.
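The iterative CGQL equation referenced here has the generic Fisher-scoring form, update = (D′Σ⁻¹D)⁻¹ D′Σ⁻¹(y − μ). A toy one-parameter quasi-likelihood illustration of such an iteration (intercept-only count model with log link; the data and model below are hypothetical and not the paper's):

```python
import math

# Hypothetical counts with mean mu = exp(beta) (log link, intercept only).
y = [3, 5, 2, 4, 6, 3, 4]

def solve_beta(y, beta=0.0, tol=1e-10, max_iter=50):
    # Quasi-likelihood Fisher scoring: with mu_i = exp(beta) and var(y_i) = mu_i,
    # the update is beta <- beta + (D' V^-1 D)^-1 D' V^-1 (y - mu),
    # where D_i = dmu_i/dbeta = mu_i.
    for _ in range(max_iter):
        mu = math.exp(beta)
        d_vinv_d = sum((mu * mu) / mu for _ in y)         # = n * mu
        d_vinv_r = sum(mu * (yi - mu) / mu for yi in y)   # = sum(y) - n * mu
        step = d_vinv_r / d_vinv_d
        beta += step
        if abs(step) < tol:
            break
    return beta

beta_hat = solve_beta(y)
# The estimating equation solves at mu = ybar, i.e. beta = log(mean(y)).
print(beta_hat)
```

The paper's CGQL iteration has the same structure, with γ (or θ) in place of the scalar β and the conditional covariance matrices playing the role of V.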
Next, for \(K={\sum }^{(p_{1}+1)(p_{2}+1)}_{\ell =1}K_{[\ell ]}\), let \(\bar {h}_{K}(\theta )=\frac {1}{K}h_{K}(\theta )\), where hK(θ) is given by Eq. 4.30. This (J − 1)(p1 + 1)(p2 + 1)-dimensional mean vector function has the expectation and the (J − 1)(p1 + 1)(p2 + 1) × (J − 1)(p1 + 1)(p2 + 1) covariance matrix given by
respectively. Furthermore, by a first-order Taylor series expansion of Eq. 3.22, it follows that
where R[ℓ](θ) and W[ℓ],t(θ) are defined in Eq. 4.33. The approximation in Eq. 4.34 can further be expressed, by Eq. 4.33, as
In order to derive the asymptotic distribution of \(\hat {\theta }_{CGQL}\) expressed by Eq. 4.35, we first observe that the individual functions \(h_{i\in \ell }(\cdot )\) in Eq. 4.34 are independent for all i = 1, …, K[ℓ]; ℓ = 1, …, (p1 + 1)(p2 + 1), but they have non-identical distributions, implying that their variances are different at different covariate levels (ℓ). Similar to Assumption B3, we now make the following assumption about the elementary estimating function \(h_{i\in \ell }(\cdot )\):
Assumption B4.
Assume that the elementary estimating functions \(h_{i\in \ell }(\theta )\) in Eq. 4.35 satisfy the Lindeberg condition, that is,
for all ε > 0, g(·) being the probability distribution of \(h_{i\in \ell }\).
Now, by using this Assumption B4, it follows from the Lindeberg-Feller central limit theorem (Amemiya 1985, Theorem 3.3.6) that the standardized mean function \(\tilde {Z}_{K}=[\text {cov}(\bar {h}_{K}(\theta ))]^{-\frac {1}{2}}\bar {h}_{K}(\theta )\) in Eq. 4.35 asymptotically (as \(\min _{\ell }K_{[\ell ]} \rightarrow \infty \)) follows a multivariate normal distribution. That is,
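The behaviour asserted here — that a standardized sum of independent but non-identically distributed summands is asymptotically standard normal under the Lindeberg condition — can be illustrated by simulation. The sketch below uses hypothetical bounded uniform summands whose scales vary across "levels" (so the Lindeberg condition holds trivially); it is not the paper's model:

```python
import math
import random

random.seed(12345)

# Independent, non-identically distributed summands: x_i uniform on
# [-s_i, s_i] with level-dependent scales s_i, mimicking different
# variances at different covariate levels.
n = 400
scales = [1.0 + (i % 4) for i in range(n)]        # scales cycle through 1..4
total_var = sum(s * s / 3.0 for s in scales)      # var(U[-s, s]) = s^2 / 3

def standardized_sum():
    t = sum(random.uniform(-s, s) for s in scales)
    return t / math.sqrt(total_var)

reps = 2000
zs = [standardized_sum() for _ in range(reps)]
mean = sum(zs) / reps
var = sum(z * z for z in zs) / reps
# With this seed, mean should be near 0 and var near 1 up to Monte Carlo error.
print(mean, var)
```

Heavier-tailed or unbounded-scale summands that violate the Lindeberg condition would not show this convergence, which is why Assumptions B3 and B4 are imposed.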
leading to the limiting distribution of \(\hat {\theta }_{CGQL}\) as in Eq. ?? under Section ??.
Appendix C: Asymptotic Efficiency Comparison Between the CMM and CGQL Approaches
It follows from Eq. 3.2 that the CMM (conditional method of moments) estimating equation for γ has the form
whereas the CGQL estimating equation is a weighted CMM estimating equation in which the responses are weighted inversely by their conditional variances. More specifically, as given in Eq. 4.1, the CGQL estimating equation has the form
where \({\Sigma }_{[\ell ]t|t-1}(\theta ,\gamma )\) is the conditional variance matrix of the multinomial response vector \(y_{i\in \ell ,t}\).
Next, by a Taylor series approximation similar to that of Eq. 4.4, the CMM estimating equation (4.38) yields
where
with \(y^{*}_{i\in \ell }=\left (y^{\prime }_{i\in \ell ,2}, \ldots, y^{\prime }_{i\in \ell ,t}, \ldots, y^{\prime }_{i \in \ell ,T}\right )^{\prime }: (J-1)(T-1) \times 1\). It then follows from Eq. 4.40 that the CMM estimator \(\hat {\gamma }_{CMM}(\theta )\) has the asymptotic (as \(K={\sum }_{\ell }K_{[\ell ]} \rightarrow \infty \)) covariance matrix estimate given by
whereas by Eq. 4.4, an estimate of the asymptotic covariance matrix of the CGQL estimator \(\hat {\gamma }_{CGQL}(\theta )\) has the formula
We remark that if all the diagonal elements of the \({\Sigma }^{*}_{[\ell ],D}\) matrix were the same, i.e., if the responses under all categories had the same variances, then the CGQL estimates would reduce to the CMM estimates and the covariance matrices given by Eqs. 4.41 and 4.42 would be the same. However, this assumption is impractical, as multinomial responses under different categories in general exhibit different variances.
We further remark that because the formulas for the covariance matrices given by Eqs. 4.41 and 4.42 are, respectively, similar to those of the traditional OLS (ordinary least squares) and GLS (generalized least squares) regression estimators in a linear longitudinal regression setup, one may easily show that
for all components of γ, i.e., for all u = 1, …, (J − 1)². For a detailed proof, one may refer to Sutradhar (2011, Theorem 2.1, Chapter 2), for example. This, as indicated in Section 3.1.1 (see the comments prior to Eq. 3.3), demonstrates that the CGQL estimates are more efficient than the CMM estimates.
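This OLS-versus-GLS efficiency ordering can be verified directly in the simplest heteroskedastic setting: estimating a common mean μ from independent y_i with var(y_i) = σ_i², where OLS weights all observations equally and GLS weights them by inverse variance (the variances below are hypothetical):

```python
# Estimating a common mean from independent observations with variances
# sigma2[i]: OLS is the plain average, GLS weights by 1/sigma2[i].
#   var(OLS) = sum(sigma2) / n^2
#   var(GLS) = 1 / sum(1 / sigma2)
sigma2 = [0.5, 1.0, 2.0, 4.0]  # hypothetical heteroskedastic variances
n = len(sigma2)

var_ols = sum(sigma2) / n**2
var_gls = 1.0 / sum(1.0 / s for s in sigma2)

print(var_ols, var_gls, var_gls <= var_ols)
# -> 0.46875 0.26666... True: GLS is strictly more efficient here
```

By the Cauchy-Schwarz inequality, var(GLS) ≤ var(OLS) always, with equality only when all σ_i² are equal, which mirrors the remark above that CGQL and CMM coincide only under equal category variances.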
Cite this article
Rao, R.P., Sutradhar, B.C. Multiple Categorical Covariates-Based Multinomial Dynamic Response Model. Sankhya A 82, 186β219 (2020). https://doi.org/10.1007/s13171-019-00168-1
Keywords and phrases
- Asymptotic properties
- Covariates with possible interactions
- Dynamic dependence parameters
- Conditional generalized quasi-likelihood estimation
- Lag 1 transitional multinomial probabilities
- Unconditional method of moments