
Multiple Categorical Covariates-Based Multinomial Dynamic Response Model



Regression models for multinomial responses with time dependent covariates have been studied recently in both longitudinal and time series setups. Because of its practical importance, in this paper we focus on a longitudinal multinomial response model with two categorical covariates, in order to study their main and interaction effects after accommodating a lag 1 dynamic relationship between past and present multinomial responses. The proposed model can be generalized easily to accommodate multiple (more than two) categorical covariates and their interactions. For the estimation of the regression and the dynamic dependence parameters, we follow a recent parameter dimension-split based approach suggested by Sutradhar (Sankhya A 80, 301–329, 2018), but unlike the conditional method of moments (CMM) used in that study, we use a more efficient approach, namely the conditional generalized quasi-likelihood (CGQL) method, for the estimation of the dynamic dependence parameters. The regression parameters are also estimated by the same CGQL approach, in which the responses become independent conditional on the past responses; this is similar in principle to likelihood estimation, where the likelihood function is formed as a product of transitional probabilities conditional on the past responses. The asymptotic properties of the CGQL estimators are provided in detail. The higher efficiency of the CGQL approach over the CMM approach is also demonstrated, for example, for the estimation of the dynamic dependence parameters.



  1. Amemiya, T. (1985). Advanced econometrics. Harvard University Press, Cambridge.

  2. Bishop, Y. M. M., Fienberg, S. E. and Holland, P. W. (1975). Discrete multivariate analysis: Theory and practice. The MIT Press, Cambridge.

  3. Fokianos, K. and Kedem, B. (2003). Regression theory for categorical time series. Statistical Science 18, 357–376.

  4. Fokianos, K. and Kedem, B. (2004). Partial likelihood inference for time series following generalized linear models. Journal of Time Series Analysis 25, 173–197.

  5. Kaufmann, H. (1987). Regression models for nonstationary categorical time series: Asymptotic estimation theory. The Annals of Statistics 15, 79–98.

  6. Loredo-Osti, J. C. and Sutradhar, B. C. (2012). Estimation of regression and dynamic dependence parameters for non-stationary multinomial time series. Journal of Time Series Analysis 33, 458–467.

  7. Mallick, T. S. and Sutradhar, B. C. (2008). GQL Versus conditional GQL inferences for non-stationary time series of counts with overdispersion. Journal of Time Series Analysis 29, 402–420.

  8. McDonald, D. R. (2005). The local limit theorem: a historical perspective. Journal of Iranian Statistical Society 4, 73–86.

  9. Sutradhar, B. C. (2003). An overview on regression models for discrete longitudinal responses. Statistical Science 18, 377–393.

  10. Sutradhar, B. C. (2011). Dynamic mixed models for familial longitudinal data. Springer, New York.

  11. Sutradhar, B. C. (2014). Longitudinal categorical data analysis. Springer, New York.

  12. Sutradhar, B. C. (2018). A parameter dimension-split based asymptotic regression estimation theory for a multinomial panel data model. Sankhya A 80, 301–329.

  13. Wedderburn, R. W. M. (1974). Quasi-likelihood functions, generalized linear models, and the Gauss-Newton method. Biometrika 61, 439–447.



This research was supported partially by an NSERC grant. The authors would like to thank two reviewers for their comments and suggestions that led to the improvement of the paper.

Author information

Correspondence to Brajendra C. Sutradhar.



Appendix A: Computational Aids for the Estimation of γ and 𝜃

Formula for the derivative matrix \(\frac {\partial \eta ^{\prime }_{[\ell ]t|t-1}(\theta ,\gamma )} {\partial \gamma }\) in Eq. 3.7 under Section 3.1.1:

The \((J-1)^{2} \times (J-1)\) derivative matrix \(\frac {\partial \eta ^{\prime }_{[\ell ]t|t-1}(\theta ,\gamma )} {\partial \gamma }\) has the computational formula

$$ \begin{array}{@{}rcl@{}} &&\frac{\partial \eta^{\prime}_{[\ell]t|t-1}(\theta,\gamma)} {\partial \gamma}=\left[\frac{\partial \eta^{(1)}_{[\ell]t|t-1}(\theta,\gamma)}{\partial \gamma},\ldots,\frac{\partial \eta^{(j)}_{[\ell]t|t-1}(\theta,\gamma)}{\partial \gamma}, \ldots,\frac{\partial \eta^{(J-1)}_{[\ell]t|t-1}(\theta,\gamma)}{\partial \gamma}\right] \\ &=&\left( \begin{array}{ccccccc}\eta^{(1)}_{[\ell]t|t-1}(\delta_{([\ell](t-1))1}-\eta_{[\ell]t|t-1}) & {\cdots} & \eta^{(J-1)}_{[\ell]t|t-1}(\delta_{([\ell](t-1))(J-1)}-\eta_{[\ell]t|t-1}) \end{array}\right) \otimes \textbf{y}_{i\in \ell, t-1} \\ &=& \eta^{*}_{([\ell]t|t-1),M}(\theta,\gamma) \otimes \textbf{y}_{i\in \ell,t-1}, \end{array} $$


where

$$\delta_{([\ell](t-1))g} = y^{(g)}_{i \in \ell,t-1}=\left\{\begin{array}{ll} [0\,1^{\prime}_{g-1},1,0\,1^{\prime}_{J-1-g}]^{\prime} & \text{for } g=1,\ldots,J-1 \\ 0\,1_{J-1} & \text{for } g=J, \end{array} \right. $$

as in Eq. 2.10, and \(\eta ^{*}_{([\ell ]t|t-1),M}(\theta ,\gamma )\) denotes the \((J-1) \times (J-1)\) matrix whose columns are the (J − 1)-dimensional vectors \(\left [\eta ^{(j)}_{[\ell ]t|t-1}(\delta _{([\ell ](t-1))j}-\eta _{[\ell ]t|t-1})\right ]\) for j = 1,…,J − 1.


Because by Eq. 2.10

$$ \begin{array}{@{}rcl@{}} &&\eta^{(j)}_{[\ell]t|t-1}(y_{i \in \ell,t-1};\theta,\gamma) \equiv \eta^{(j)}_{[\ell]t|t-1}(\theta,\gamma) \equiv \eta^{(j)}_{[\ell]t|t-1}\\ &=&{\exp \left[x^{\prime}_{[\ell]j}\theta +\gamma^{\prime}_{j}y_{i\in \ell,t-1}\right]}/\left[ {1 + {\sum}^{J-1}_{v=1}\exp \left[x^{\prime}_{[\ell]v}\theta +\gamma^{\prime}_{v}y_{i\in \ell,t-1}\right]}\right], \end{array} $$

it then follows that

$$ \begin{array}{@{}rcl@{}} \frac{\partial \eta^{(j)}_{[\ell]t|t-1}}{\partial \gamma_{h}} &=& \left\{\begin{array}{ll} y_{i\in \ell,t-1} \eta^{(j)}_{[\ell]t|t-1}[1-\eta^{(j)}_{[\ell]t|t-1}] & \text{for } h=j;\ h,j=1,\ldots,J-1 \\ -y_{i\in \ell,t-1}\eta^{(j)}_{[\ell]t|t-1}\eta^{(h)}_{[\ell]t|t-1} & \text{for } h \ne j;\ h,j=1,\ldots,J-1. \end{array} \right. \end{array} $$

Next because \(\gamma =(\gamma ^{\prime }_{1},\ldots ,\gamma ^{\prime }_{j},\ldots ,\gamma ^{\prime }_{J-1})'\), one obtains

$$ \begin{array}{@{}rcl@{}} \frac{\partial \eta^{(j)}_{[\ell]t|t-1}}{\partial \gamma} &=&\left( \begin{array}{cc} -\eta^{(1)}_{[\ell]t|t-1}\eta^{(j)}_{[\ell]t|t-1} \\ {\vdots} \\ \eta^{(j)}_{[\ell]t|t-1}[1-\eta^{(j)}_{[\ell]t|t-1}] \\ {\vdots} \\ -\eta^{(J-1)}_{[\ell]t|t-1}\eta^{(j)}_{[\ell]t|t-1} \end{array}\right) \otimes \textbf{y}_{i\in \ell,t-1} : (J-1)(J-1) \times 1 \\ &=&\left[\eta^{(j)}_{[\ell]t|t-1}(\delta_{([\ell](t-1))j}-\eta_{[\ell]t|t-1})\right] \otimes \textbf{y}_{i\in \ell,t-1}. \end{array} $$

Eq. 4.11 then follows from Eq. 4.13.
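The Kronecker-product form of this derivative can be checked numerically. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the array `a` stands in for the linear predictors \(x^{\prime}_{[\ell]j}\theta\), row j of `gamma` holds \(\gamma^{\prime}_{j}\), and the numerical values are arbitrary. It uses the fact that the matrix with columns \(\eta^{(j)}(\delta_{j}-\eta)\) equals \(\text{diag}(\eta)-\eta\eta^{\prime}\), and compares the result with central finite differences.

```python
import numpy as np

def transition_probs(a, gamma, y_prev):
    """Lag 1 transitional probabilities eta^(j), j = 1..J-1.

    a[j] plays the role of x_j' theta (assumed given);
    gamma[j] is the row vector gamma_j'; y_prev is y_{t-1}.
    """
    e = np.exp(a + gamma @ y_prev)
    return e / (1.0 + e.sum())

def d_eta_d_gamma(a, gamma, y_prev):
    """(J-1)^2 x (J-1) derivative matrix via the Kronecker formula."""
    eta = transition_probs(a, gamma, y_prev)
    # Column j of M is eta^(j) * (delta_j - eta).
    M = np.diag(eta) - np.outer(eta, eta)
    return np.kron(M, y_prev.reshape(-1, 1))

def d_eta_d_gamma_numeric(a, gamma, y_prev, eps=1e-6):
    """Central finite differences, one gamma component at a time."""
    J1 = a.size
    out = np.zeros((J1 * J1, J1))
    for k in range(J1 * J1):
        step = np.zeros(J1 * J1)
        step[k] = eps
        step = step.reshape(J1, J1)  # row-major: gamma_1, gamma_2, ...
        up = transition_probs(a, gamma + step, y_prev)
        dn = transition_probs(a, gamma - step, y_prev)
        out[k] = (up - dn) / (2.0 * eps)
    return out

# Illustrative values for J = 3 categories (arbitrary, not from the paper).
a = np.array([0.2, -0.1])
gamma = np.array([[0.5, -0.3], [0.1, 0.4]])
y_prev = np.array([1.0, 0.0])  # previous response in category 1

analytic = d_eta_d_gamma(a, gamma, y_prev)
numeric = d_eta_d_gamma_numeric(a, gamma, y_prev)
print(analytic.shape, np.allclose(analytic, numeric, atol=1e-7))
```

Because `y_prev` is an indicator vector, the rows of the derivative matrix that correspond to zero entries of `y_prev` vanish in both the analytic and the numerical versions.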

Formula for the derivative matrix \(\frac {\partial \pi ^{\prime }_{[\ell ](1)}(\theta )}{\partial \theta }\) in Eq. 3.22 under Section 3.2:

Recall from Section 2.1 that

$$ \pi_{[\ell](1)}(\theta)= [\pi_{[\ell](1)1}(\theta),\ldots,\pi_{[\ell](1)j}(\theta),\ldots, \pi_{[\ell](1)(J-1)}(\theta)]', $$

where, by Eq. 2.10, we write

$$\pi_{[\ell](1)j}(\theta)= \frac{\exp(x^{\prime}_{[\ell]j}\theta)} {1+{\sum}^{J-1}_{h=1}\exp(x^{\prime}_{[\ell]h}\theta)},$$

where the \(x_{[\ell]j}\) are defined through Eqs. 2.11–2.14, and the \((J-1)(p_{1}+1)(p_{2}+1)\)-dimensional parameter vector \(\theta\) of regression and interaction effects is defined by Eq. 2.8. This formulation provides a computationally easy derivative for each element in Eq. 4.14. To be specific, the derivative for the j-th element has the formula

$$ \frac{\partial \pi_{[\ell](1)j}}{\partial \theta} = \pi_{[\ell](1)j}\left[x_{[\ell]j} -{\sum}^{J-1}_{h=1}\pi_{[\ell](1)h}x_{[\ell]h}\right] : (J-1)(p_{1}+1)(p_{2}+1) \times 1. $$

The desired derivative matrix \(\frac {\partial \pi ^{\prime }_{[\ell ](1)}(\theta )}{\partial \theta }: (J-1)(p_{1}+1)(p_{2}+1) \times (J-1)\) follows by using Eqs. 4.14 and 4.15.
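As an illustration, the element-wise derivative can be assembled into the full matrix in one step. The sketch below is a minimal example under stated assumptions: the design matrix `X` (row j standing in for \(x^{\prime}_{[\ell]j}\)) is filled with random numbers because Eqs. 2.11–2.14 are not reproduced here, and the function names are hypothetical. Stacking the columns \(\pi_{j}[x_{j}-\sum_{h}\pi_{h}x_{h}]\) gives \(X^{\prime}[\text{diag}(\pi)-\pi\pi^{\prime}]\).

```python
import numpy as np

def marginal_probs(theta, X):
    """Marginal probabilities pi_j at t = 1 for j = 1..J-1."""
    e = np.exp(X @ theta)
    return e / (1.0 + e.sum())

def d_pi_d_theta(theta, X):
    """q x (J-1) derivative matrix; column j is pi_j [x_j - sum_h pi_h x_h]."""
    pi = marginal_probs(theta, X)
    return X.T @ (np.diag(pi) - np.outer(pi, pi))

# Illustrative dimensions: J = 3, p1 = p2 = 1, so q = (J-1)(p1+1)(p2+1) = 8.
rng = np.random.default_rng(1)
J, q = 3, 8
X = rng.normal(size=(J - 1, q))        # placeholder design, not Eqs. 2.11-2.14
theta = rng.normal(scale=0.5, size=q)  # arbitrary parameter value

D_pi = d_pi_d_theta(theta, X)

# Direct evaluation of the first column from the element-wise formula.
pi = marginal_probs(theta, X)
first_col = pi[0] * (X[0] - pi @ X)
print(D_pi.shape, np.allclose(D_pi[:, 0], first_col))
```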

Formula for the derivative matrix \(\frac {\partial \eta ^{\prime }_{[\ell ]t|t-1}(\theta ,\hat {\gamma }(\theta ))} {\partial \theta }\) in Eq. 3.22 under Section 3.2:

First, we write

$$ \frac{\partial \eta^{\prime}_{[\ell]t|t-1}(\theta,\hat{\gamma}(\theta))} {\partial \theta}=\left[\frac{\partial \eta^{(1)}_{[\ell]t|t-1}(\theta,\hat{\gamma}(\theta))}{\partial \theta},\ldots,\frac{\partial \eta^{(j)}_{[\ell]t|t-1}(\theta,\hat{\gamma}(\theta))}{\partial \theta}, \ldots,\frac{\partial \eta^{(J-1)}_{[\ell]t|t-1}(\theta,\hat{\gamma}(\theta))}{\partial \theta}\right], $$

where, by replacing \(\gamma_{j}\) in Eq. 2.10 with \(\hat {\gamma }_{j}(\theta )\) for all j = 1,…,J − 1, one writes the formula for the transitional probability as

$$ \begin{array}{@{}rcl@{}} &&\eta^{(j)}_{[\ell]t|t-1}(y_{i \in \ell,t-1};\theta,\hat{\gamma}(\theta)) \equiv \eta^{(j)}_{[\ell]t|t-1}(\theta,\hat{\gamma}(\theta)) \equiv \eta^{(j)}_{[\ell]t|t-1} \\ &=&{\exp \left[x^{\prime}_{[\ell]j}\theta +\hat{\gamma}^{\prime}_{j}(\theta)y_{i\in \ell,t-1}\right]}/\left[ {1 + {\sum}^{J-1}_{v=1}\exp \left[x^{\prime}_{[\ell]v}\theta +\hat{\gamma}^{\prime}_{v}(\theta)y_{i\in \ell,t-1}\right]}\right]. \end{array} $$

It then follows that

$$ \begin{array}{@{}rcl@{}} \frac{\partial \eta^{(j)}_{[\ell]t|t-1}(\theta,\hat{\gamma}(\theta))} {\partial \theta} &=& \eta^{(j)}_{[\ell]t|t-1}(\theta,\hat{\gamma}(\theta))\left[\left\{ x_{[\ell]j} +\frac{\partial \hat{\gamma}^{\prime}_{j}(\theta)}{\partial \theta}y_{i\in \ell,t-1}\right\} \right. \\ &-&\left. {\sum}^{J-1}_{v=1}\eta^{(v)}_{[\ell]t|t-1}(\theta,\hat{\gamma}(\theta)) \left\{x_{[\ell]v} +\frac{\partial \hat{\gamma}^{\prime}_{v}(\theta)}{\partial \theta}y_{i\in \ell,t-1}\right\} \right],\\ \end{array} $$

leading to the desired derivative matrix in Eq. 4.16, provided the derivative \(\frac {\partial \hat {\gamma }^{\prime }_{j}(\theta )}{\partial \theta }\) is known. We compute this latter derivative below.
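The chain-rule structure of this derivative can be verified on a toy case. The sketch below is only an illustration under a strong assumption: \(\hat{\gamma}_{j}(\theta)\) is replaced by a hypothetical linear map `B[j] @ theta`, so that \(\partial \hat{\gamma}^{\prime}_{j}(\theta)/\partial \theta\) is simply `B[j].T`. In the paper \(\hat{\gamma}(\theta)\) is defined implicitly through the CGQL equation, so this stand-in serves only to check the algebra. The design `X` is random and all names are hypothetical.

```python
import numpy as np

def eta_profile(theta, X, B, y_prev):
    """Transitional probabilities with gamma replaced by gamma_hat(theta).

    Hypothetical stand-in: gamma_hat_j(theta) = B[j] @ theta.
    """
    gamma_hat = B @ theta  # shape (J-1, J-1); row j is gamma_hat_j'
    e = np.exp(X @ theta + gamma_hat @ y_prev)
    return e / (1.0 + e.sum())

def d_eta_profile_d_theta(theta, X, B, y_prev):
    """q x (J-1) matrix; column j follows the displayed chain-rule formula."""
    eta = eta_profile(theta, X, B, y_prev)
    # Row j of Z is x_j + (d gamma_hat_j' / d theta) y_{t-1}.
    Z = X + np.einsum("jmq,m->jq", B, y_prev)
    return Z.T @ (np.diag(eta) - np.outer(eta, eta))

rng = np.random.default_rng(2)
J, q = 3, 8
X = rng.normal(size=(J - 1, q))                    # placeholder design
B = rng.normal(scale=0.3, size=(J - 1, J - 1, q))  # hypothetical linear map
theta = rng.normal(scale=0.5, size=q)
y_prev = np.array([0.0, 1.0])                      # previous response in category 2

D_profile = d_eta_profile_d_theta(theta, X, B, y_prev)

# Central finite differences in theta for comparison.
eps = 1e-6
D_numeric = np.zeros((q, J - 1))
for k in range(q):
    step = np.zeros(q)
    step[k] = eps
    D_numeric[k] = (eta_profile(theta + step, X, B, y_prev)
                    - eta_profile(theta - step, X, B, y_prev)) / (2.0 * eps)
print(np.allclose(D_profile, D_numeric, atol=1e-7))
```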

Computation of the derivative matrix \(\frac {\partial \hat {\gamma }^{\prime }_{j}(\theta )}{\partial \theta }\) for j = 1,…,J − 1

Recall from Eq. 4.4 that

$$ \begin{array}{@{}rcl@{}} &&\hat{\gamma}_{CGQL}(\theta) \simeq \gamma+\left[{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1} \frac{\partial \eta^{\prime}_{[\ell]}} {\partial \gamma}{{\Sigma}^{*}}^{-1}_{[\ell],D}\frac{\partial \eta_{[\ell]}} {\partial \gamma^{\prime}}\right]^{-1} \\ &\times&{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}\frac{\partial \eta^{\prime}_{[\ell]}} {\partial \gamma}{{\Sigma}^{*}}^{-1}_{[\ell],D}\left( y^{*}_{i\in \ell}-\eta_{[\ell]}\right). \end{array} $$

Because the estimation is done iteratively, under the assumption that the 𝜃 parameter involved in the derivatives and in the covariance matrix is known from the previous iteration, we can obtain an approximation for the desired derivative, as

$$ \begin{array}{@{}rcl@{}} &&\frac{\partial \hat{\gamma}(\theta)}{\partial \theta^{\prime}} \simeq - \left[{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1} \frac{\partial \eta^{\prime}_{[\ell]}} {\partial \gamma}{{\Sigma}^{*}}^{-1}_{[\ell],D}\frac{\partial \eta_{[\ell]}} {\partial \gamma^{\prime}}\right]^{-1} \\ &\times&{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}\frac{\partial \eta^{\prime}_{[\ell]}} {\partial \gamma}{{\Sigma}^{*}}^{-1}_{[\ell],D}\frac{\partial \eta_{[\ell]}}{\partial \theta^{\prime}}, \end{array} $$

where the formula for the derivative \(\frac {\partial \eta _{[\ell ]}}{\partial \theta ^{\prime }}\) follows from the derivative vector

$$ \begin{array}{@{}rcl@{}} \frac{\partial \eta^{\prime}_{[\ell]}}{\partial \theta} &=& \left(\frac{\partial \eta^{\prime}_{[\ell]2|1}}{\partial \theta} {\ldots} \frac{\partial \eta^{\prime}_{[\ell]t|t-1}}{\partial \theta} \ldots \frac{\partial \eta^{\prime}_{[\ell]T|T-1}}{\partial \theta}\right), \end{array} $$

with

$$ \frac{\partial \eta^{\prime}_{[\ell]t|t-1}(\theta,\gamma)} {\partial \theta}=\left[\frac{\partial \eta^{(1)}_{[\ell]t|t-1}(\theta,\gamma)}{\partial \theta},\ldots,\frac{\partial \eta^{(j)}_{[\ell]t|t-1}(\theta,\gamma)}{\partial \theta}, \ldots,\frac{\partial \eta^{(J-1)}_{[\ell]t|t-1}(\theta,\gamma)}{\partial \theta}\right], $$

where, for fixed \(\gamma\),

$$ \begin{array}{@{}rcl@{}} &&\frac{\partial \eta^{(j)}_{[\ell]t|t-1}(\theta,\gamma)}{\partial \theta} =\eta^{(j)}_{[\ell]t|t-1}(\theta,\gamma) \left[x_{[\ell]j} -{\sum}^{J-1}_{v=1}\eta^{(v)}_{[\ell]t|t-1}(\theta,\gamma)x_{[\ell]v}\right]. \end{array} $$
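Once the two derivative matrices are available for each individual, the approximation for \(\partial \hat{\gamma}(\theta)/\partial \theta^{\prime}\) is a single linear solve. The sketch below illustrates only the matrix algebra and the dimensions: the inputs are random placeholders rather than derivatives computed from the model, the covariance matrices are arbitrary positive definite matrices standing in for \({\Sigma}^{*}_{[\ell],D}\), and the function name is hypothetical.

```python
import numpy as np

def dgamma_hat_dtheta(Dg_list, Dt_list, Sigma_list):
    """Approximate d gamma_hat / d theta' as -A^{-1} B.

    Dg: (J-1)^2 x m matrix  (d eta' / d gamma) for one individual
    Dt: q x m matrix        (d eta' / d theta) for one individual
    Sigma: m x m covariance matrix, with m = (J-1)(T-1)
    """
    A = sum(Dg @ np.linalg.solve(S, Dg.T)
            for Dg, S in zip(Dg_list, Sigma_list))
    B = sum(Dg @ np.linalg.solve(S, Dt.T)
            for Dg, Dt, S in zip(Dg_list, Dt_list, Sigma_list))
    return -np.linalg.solve(A, B)

# Placeholder inputs (random, for dimension checking only).
rng = np.random.default_rng(3)
J, T, q, n = 3, 4, 8, 30
p, m = (J - 1) ** 2, (J - 1) * (T - 1)
Dg_list = [rng.normal(size=(p, m)) for _ in range(n)]
Dt_list = [rng.normal(size=(q, m)) for _ in range(n)]
Sigma_list = []
for _ in range(n):
    R = rng.normal(size=(m, m))
    Sigma_list.append(R @ R.T + m * np.eye(m))  # positive definite stand-in

G = dgamma_hat_dtheta(Dg_list, Dt_list, Sigma_list)
print(G.shape)  # (J-1)^2 rows, one column per component of theta
```

Using `np.linalg.solve` instead of forming explicit inverses is a numerical convenience of this sketch, not something prescribed by the derivation.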

Appendix B: Derivations for the Asymptotic Normality for the CGQL Estimators of γ and 𝜃

Consistency of \(\hat {\gamma }_{CGQL}(\theta )\)

Suppose that the following two assumptions hold.

Assumption B1.

For \(\ell = 1,\ldots,(p_{1}+1)(p_{2}+1)\), \(\min_{\ell} K_{[\ell]} \rightarrow \infty\), as in certain cases \(\ell\) can be 1, implying that the total number of individuals in the study is \(K = K_{1}\), which should be large. In general, this assumption guarantees that \(K={\sum }_{\ell =1}K_{[\ell ]} \rightarrow \infty \).

Assumption B2.

\(\max_{\ell}|Q_{[\ell]}(\theta,\gamma)| < M\), M being a finite quantity. That is, \(Q_{[\ell]}(\cdot)\); \(\ell = 1,\ldots\) in Eq. 4.4 are bounded and finite matrices.

Now because

$$ \begin{array}{@{}rcl@{}} &&E\left[{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}f_{i\in \ell}(\theta,\gamma)\right]=0, \text{and} \\ &&\text{var}\left[{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}f_{i\in \ell}(\theta,\gamma)\right] ={\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}Q_{[\ell]}(\theta,\gamma), \end{array} $$

by using the aforementioned 2 assumptions (Assumptions B1 and B2) and the standard central limit theorem (Bishop et al. 1975, Theorem 14.4-1, p. 476), one obtains the limiting results shown in Eqs. 4.5 and 4.6, in Section 4.1. This proves the consistency of the estimator \(\hat {\gamma }_{CGQL}(\theta )\).

Asymptotic normality of \(\hat {\gamma }_{CGQL}(\theta )\)

To prove the asymptotic normality of this CGQL estimator, we define

$$ \begin{array}{@{}rcl@{}} \bar{f}(\theta,\gamma)&=&\frac{1}{K}{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1} {\sum}^{K_{[\ell]}}_{i=1}f_{i \in \ell}(\theta,\gamma) \end{array} $$

which has the expectation vector and covariance matrix as

$$ \begin{array}{@{}rcl@{}} &&E[\bar{f}(\theta,\gamma)]=0: (J-1)^{2} \times 1 \\ &&\text{cov}[\bar{f}(\theta,\gamma)]=\frac{1}{K^{2}}{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1} {\sum}^{K_{[\ell]}}_{i=1}Q_{[\ell]}(\theta,\gamma) \\ &=&\frac{1}{K^{2}} V^{*}_{K}(\theta,\gamma): (J-1)^{2} \times (J-1)^{2}, \end{array} $$

because \(E[Y^{*}_{i\in \ell }-\eta _{[\ell ]}]=0\), and \(\text {cov}[Y^{*}_{i\in \ell }-\eta _{[\ell ]}]={\Sigma }^{*}_{[\ell ],D}(\theta ,\gamma )\). Consequently, by applying (4.24) and (4.25), one may re-express the estimating equation in Eq. 4.4 as

$$ \begin{array}{@{}rcl@{}} \hat{\gamma}_{CGQL}(\theta)-\gamma &\simeq & \left[{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}Q_{[\ell]} (\theta,\gamma)\right]^{-1}K \bar{f}(\theta,\gamma) \\ &=& \left[{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1} Q_{[\ell]}(\theta,\gamma)\right]^{-1}[V^{*}_{K}(\theta,\gamma)]^{\frac{1}{2}} \left\{V^{*}_{K}(\theta,\gamma) \right\}^{-\frac{1}{2}} K\bar{f}(\theta,\gamma) \\ &=& \left[{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1} Q_{[\ell]}(\theta,\gamma) \right]^{-1}[V^{*}_{K}(\theta,\gamma)]^{\frac{1}{2}} \left[\text{cov}\{\bar{f}(\theta,\gamma)\}\right]^{-\frac{1}{2}}\bar{f}(\theta,\gamma) \\ &=&\left\{V^{*}_{K}(\theta,\gamma)\right\}^{-\frac{1}{2}}Z_{K}, \end{array} $$

where \(Z_{K}=\left [\text {cov}\{\bar {f}(\theta ,\gamma )\}\right ]^{-\frac {1}{2}}\bar {f}(\theta ,\gamma )\), with \(\bar {f}(\theta ,\gamma )=\frac {1}{K}{\sum }^{(p_{1}+1)(p_{2}+1)}_{\ell =1} {\sum }^{K_{[\ell ]}}_{i=1}f_{i\in \ell}(\theta ,\gamma )\) as in Eq. 4.24.
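The last two equalities in Eq. 4.26 rest on a short cancellation, which may be spelled out as follows. This is a worked restatement that uses only the definitions in Eqs. 4.24 and 4.25 and introduces no new quantities:

$$ \begin{array}{@{}rcl@{}} {\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}Q_{[\ell]}(\theta,\gamma) &=& V^{*}_{K}(\theta,\gamma), \\ \left[\text{cov}\{\bar{f}(\theta,\gamma)\}\right]^{-\frac{1}{2}} &=& \left[\frac{1}{K^{2}}V^{*}_{K}(\theta,\gamma)\right]^{-\frac{1}{2}} = K\left\{V^{*}_{K}(\theta,\gamma)\right\}^{-\frac{1}{2}}, \\ \left[V^{*}_{K}(\theta,\gamma)\right]^{-1}\left[V^{*}_{K}(\theta,\gamma)\right]^{\frac{1}{2}} &=& \left\{V^{*}_{K}(\theta,\gamma)\right\}^{-\frac{1}{2}}. \end{array} $$

The second identity converts \(\{V^{*}_{K}\}^{-\frac{1}{2}}K\bar{f}\) into \(Z_{K}\), and the first and third identities reduce the leading factor to \(\{V^{*}_{K}\}^{-\frac{1}{2}}\).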

Now suppose that the following Assumption B3 holds.

Assumption B3.

\(f_{i\in \ell}(\cdot)\) in (B1) (see also Eq. 4.2) satisfy the Lindeberg condition, that is,

$$ \lim_{K \rightarrow \infty}{V^{*}}^{-1}_{K}{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1} {\sum}_{(f^{\prime}_{i\in \ell} {V^{*}}^{-1}_{K} f_{i\in \ell} ) >\epsilon}f_{i\in \ell}f^{\prime}_{i\in \ell}g(f_{i\in \ell})=0 $$

for all 𝜖 > 0, g(⋅) being the probability distribution of \(f_{i\in \ell}\) [for a proof of the Lindeberg condition in the context of categorical time series data, see Kaufmann (1987, pp. 89, 93)].

Because Assumptions B1 and B3 hold, it follows from the Lindeberg-Feller central limit theorem [Amemiya (1985, Theorem 3.3.6), McDonald (2005, Theorem 2.2)] that \(Z_{K}\) in Eq. 4.26 has the limiting distribution \((g^{*}_{K}(\cdot ))\):

$$ \min_{\ell}\lim_{K_{[\ell]} \rightarrow \infty} g^{*}_{K}(Z_{K}) \rightarrow N(0, I_{(J-1)^{2}}). $$

Applying (4.28) to (4.26), we then obtain the limiting distribution of \(\hat {\gamma }_{CGQL}(\theta )\), say \(g^{*}_{K}(\hat {\gamma }_{CGQL}(\theta ))\), as \(K \rightarrow \infty\), as

$$ \begin{array}{@{}rcl@{}} \min_{\ell}\lim_{K_{[\ell]} \rightarrow \infty} g^{*}_{K}(\hat{\gamma}_{CGQL}(\theta)) &\rightarrow & N\left( \gamma,\left\{V^{*}_{K}(\theta,\gamma)\right\}^{-\frac{1}{2}} I_{(J-1)^{2}}\left\{V^{*}_{K}(\theta,\gamma)\right\}^{-\frac{1}{2}} \right) \\ &=& N\left( \gamma,\left\{V^{*}_{K}(\theta,\gamma)\right\}^{-1} \right). \end{array} $$

This shows that \(\hat {\gamma }_{CGQL}(\theta )\) has the limiting normal distribution.

Consistency and Asymptotic normality of \(\hat {\theta }_{CGQL}\)

Recall from Eq. 3.22 that the estimating function in the left hand side of the CGQL estimating equation for 𝜃 may be expressed as

$$ \begin{array}{@{}rcl@{}} &&h_{K}(\theta,\hat{\gamma}(\theta)) \\ & = &{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}\left[\frac{\partial \pi^{\prime}_{[\ell](1)}(\theta)}{\partial \theta}[{\Sigma}_{[\ell]1}(\theta)]^{-1} [y_{i\in \ell,1}-\pi_{[\ell](1)}(\theta)] \right. \\ & + &\left. {\sum}^{T}_{t=2} \left\{\frac{\partial \eta^{\prime}_{[\ell]t|t-1}(\theta,\hat{\gamma}(\theta))}{\partial \theta} [{\Sigma}_{[\ell]t|t-1}(\theta,\hat{\gamma}(\theta))]^{-1} (y_{i\in \ell,t}-\eta_{[\ell]t|t-1}(\theta,\hat{\gamma}(\theta)))\right\}\right] \\ & = &{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}\left[h_{i\in \ell,1}(\theta) +h_{i\in \ell,2}(\theta,\hat{\gamma}(\theta))\right] \\ & = &{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}\left[h_{i \in \ell}(\theta, \hat{\gamma}(\theta))\right], \end{array} $$

where \(\hat {\gamma }(\theta )\) satisfies the CGQL iterative equation for γ given in Eq. 4.4, that is

$$ \begin{array}{@{}rcl@{}} &&\hat{\gamma}_{CGQL}(\theta) \simeq \gamma+\left[{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1} \frac{\partial \eta^{\prime}_{[\ell]}} {\partial \gamma}{{\Sigma}^{*}}^{-1}_{[\ell],D}\frac{\partial \eta_{[\ell]}} {\partial \gamma^{\prime}}\right]^{-1} \\ &\times&{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}\frac{\partial \eta^{\prime}_{[\ell]}} {\partial \gamma}{{\Sigma}^{*}}^{-1}_{[\ell],D}\left( y^{*}_{i\in \ell}-\eta_{[\ell]}\right). \end{array} $$

Notice that the computation of the estimating function in Eq. 4.30 requires the formula for the derivative \(\frac {\partial \hat {\gamma }(\theta )}{\partial \theta ^{\prime }}\), which can be obtained from Eq. 4.31. This formula is derived in Appendix A.

Next, for \(K={\sum }^{(p_{1}+1)(p_{2}+1)}_{\ell =1}K_{[\ell ]}\), let \(\bar {h}_{K}(\theta )=\frac {1}{K}h_{K}(\theta )\), where \(h_{K}(\theta)\) is given by Eq. 4.30. This \((J-1)(p_{1}+1)(p_{2}+1)\)-dimensional mean vector function has the expectation and the \((J-1)(p_{1}+1)(p_{2}+1) \times (J-1)(p_{1}+1)(p_{2}+1)\) covariance matrix given by

$$ \begin{array}{@{}rcl@{}} &&E[\bar{h}_{K}(\theta)]=\frac{1}{K}E[h_{K}(\theta)]=0, \text{ and} \\ &&\text{cov}(\bar{h}_{K}(\theta)) =\frac{1}{K^{2}}P_{K}(\theta): (J-1)(p_{1}+1)(p_{2}+1) \times (J-1)(p_{1}+1)(p_{2}+1) \\ &=&\frac{1}{K^{2}}{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}\left[R_{[\ell]}(\theta) +{\sum}^{T}_{t=2}W_{[\ell],t}(\theta)\right], \text{ (say)}, \end{array} $$

respectively. Furthermore, by a first order Taylor series expansion, it follows from Eq. 3.22 that

$$ \begin{array}{@{}rcl@{}} \hat{\theta}_{CGQL}-\theta &\simeq & \left[{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1} \{R_{[\ell]}(\theta)+{\sum}^{T}_{t=2}W_{[\ell],t}(\theta)\}\right]^{-1} {\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}h_{i\in \ell}(\theta) \\ &+&o_{p}(1/\sqrt{{\sum}_{\ell=1}K_{[\ell]}}), \end{array} $$

where \(R_{[\ell]}(\theta)\) and \(W_{[\ell],t}(\theta)\) are defined in Eq. 4.33. By Eq. 4.33, the approximation in Eq. 4.34 can further be expressed as

$$ \begin{array}{@{}rcl@{}} \hat{\theta}_{CGQL}-\theta &\simeq & P^{-1}_{K}(\theta) P^{\frac{1}{2}}_{K}(\theta) [\text{cov}(\bar{h}_{K}(\theta))]^{-\frac{1}{2}}\bar{h}_{K}(\theta) \\ &+&o_{p}(1/\sqrt{{\sum}_{\ell=1}K_{[\ell]}}). \end{array} $$

In order to derive the asymptotic distribution of \(\hat {\theta }_{CGQL}\) expressed by Eq. 4.35, we first observe that the individual functions \(h_{i\in \ell}(\cdot)\) in Eq. 4.34 are independent for all \(i = 1,\ldots,K_{[\ell]}\); \(\ell = 1,\ldots,(p_{1}+1)(p_{2}+1)\), but they have non-identical distributions, implying that their variances are different at different covariate levels (\(\ell\)). Similar to Assumption B3, we now make the following assumption about the elementary estimating function \(h_{i\in \ell}(\cdot)\):

Assumption B4.

Assume that the elementary estimating functions \(h_{i\in \ell}(\theta)\) in Eq. 4.35 satisfy the Lindeberg condition, that is,

$$ \lim_{K\rightarrow \infty}{P}^{-1}_{K}(\theta){\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1} {\sum}_{\{h^{\prime}_{i\in \ell}{P}^{-1}_{K} h_{i \in \ell}\} >\epsilon}\{h_{i\in \ell}h^{\prime}_{i\in \ell}g(h_{i\in \ell})\}=0 $$

for all 𝜖 > 0, g(⋅) being the probability distribution of \(h_{i\in \ell}\).

Now, by using this Assumption B4, it follows from the Lindeberg-Feller central limit theorem (Amemiya 1985, Theorem 3.3.6) that the standardized mean function \(\tilde {Z}_{K}=[\text {cov}(\bar {h}_{K}(\theta ))]^{-\frac {1}{2}}\bar {h}_{K}(\theta )\) in Eq. 4.35 asymptotically (\(\min_{\ell} K_{[\ell]} \rightarrow \infty\)) follows a multivariate normal distribution. That is,

$$ \min_{\ell}\lim_{K_{[\ell]} \rightarrow \infty} g^{*}_{K}(\tilde{Z}_{K}) \rightarrow N(0, I_{(J-1)(p_{1}+1)(p_{2}+1)}). $$

leading to the limiting distribution of \(\hat {\theta }_{CGQL}\) as in Eq. ?? under Section ??.

Appendix C: Asymptotic Efficiency Comparison Between the CMM and CGQL Approaches

It follows from Eq. 3.2 that the CMM (conditional method of moments) estimating equation for γ has the form

$$ \begin{array}{@{}rcl@{}} &&{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}{\sum}^{T}_{t=2}\frac{\partial \eta^{\prime}_{[\ell]t|t-1}(y_{i \in \ell,t-1};\theta,\gamma)} {\partial \gamma}[y_{i\in \ell,t}-\eta_{[\ell]t|t-1}(\theta,\gamma)]=0,\\ \end{array} $$

whereas the CGQL estimating equation is a weighted CMM estimating equation in which the responses are weighted by the inverses of their conditional variance matrices. More specifically, as given in Eq. 4.1, the CGQL estimating equation has the form

$$ \begin{array}{@{}rcl@{}} &&{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}{\sum}^{T}_{t=2}\frac{\partial \eta^{\prime}_{[\ell]t|t-1}}{\partial \gamma} {\Sigma}^{-1}_{[\ell]t|t-1}(\theta,\gamma) [y_{i\in \ell,t}-\eta_{[\ell]t|t-1}(\theta,\gamma)]=0,\\ \end{array} $$

where \({\Sigma}_{[\ell]t|t-1}(\theta,\gamma)\) is the conditional variance matrix of the multinomial response vector \(y_{i\in \ell,t}\).

Next, by a Taylor series approximation similar to that of Eq. 4.4, the CMM estimating equation (4.38) yields

$$ \begin{array}{@{}rcl@{}} \hat{\gamma}_{CMM}(\theta)-\gamma &\simeq & \left[{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}Q^{*}_{[\ell]}(\theta,\gamma) \right]^{-1} {\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}f^{*}_{i\in \ell}(\theta,\gamma) \\ &+&o_{p}(1/\sqrt{{\sum}_{\ell=1}K_{[\ell]}}), \end{array} $$


where

$$ \begin{array}{@{}rcl@{}} f^{*}_{i\in \ell}(\theta,\gamma)=\frac{\partial \eta^{\prime}_{[\ell]}} {\partial \gamma}\left( y^{*}_{i\in \ell}-\eta_{[\ell]}\right), \text{ and } Q^{*}_{[\ell]}(\theta,\gamma)=\frac{\partial \eta^{\prime}_{[\ell]}} {\partial \gamma}\frac{\partial \eta_{[\ell]}} {\partial \gamma^{\prime}}, \end{array} $$

with \(y^{*}_{i\in \ell }=\left (\begin {array}{cccccc}y^{\prime }_{i\in \ell ,2} & {\ldots } & y^{\prime }_{i\in \ell ,t} & {\ldots } & y^{\prime }_{i \in \ell ,T} \end {array}\right )^{\prime }: (J-1)(T-1) \times 1\). It then follows from Eq. 4.40 that the CMM estimator \(\hat {\gamma }_{CMM}(\theta )\) has the asymptotic (as \(K={\sum }_{\ell }K_{[\ell ]} \rightarrow \infty \)) covariance matrix estimate given by

$$ \begin{array}{@{}rcl@{}} &&\hat{\text{cov}}[\hat{\gamma}_{CMM}(\theta)]= \lim_{K \rightarrow \infty} \left[{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}Q^{*}_{[\ell]}(\theta,\gamma) \right]^{-1} \\ &\times & \left[{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}\frac{\partial \eta^{\prime}_{[\ell]}}{\partial \gamma}{{\Sigma}^{*}}_{[\ell],D}\frac{\partial \eta_{[\ell]}} {\partial \gamma^{\prime}}\right]\left[{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}Q^{*}_{[\ell]}(\theta,\gamma) \right]^{-1} , \end{array} $$

whereas by Eq. 4.4, an estimate of the asymptotic covariance matrix of the CGQL estimator \(\hat {\gamma }_{CGQL}(\theta )\) has the formula

$$ \begin{array}{@{}rcl@{}} \hat{\text{cov}}[\hat{\gamma}_{CGQL}(\theta)]&=& \lim_{K \rightarrow \infty} \left[{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1}Q_{[\ell]}(\theta,\gamma) \right]^{-1} \\ &=& \lim_{K \rightarrow \infty} \left[{\sum}^{(p_{1}+1)(p_{2}+1)}_{\ell=1}{\sum}^{K_{[\ell]}}_{i=1} \frac{\partial \eta^{\prime}_{[\ell]}} {\partial \gamma}{{\Sigma}^{*}}^{-1}_{[\ell],D}\frac{\partial \eta_{[\ell]}} {\partial \gamma^{\prime}}\right]^{-1}.\\ \end{array} $$

We remark that if all the diagonal elements of the \({\Sigma}^{*}_{[\ell],D}\) matrix were the same, i.e., the responses under all categories have the same variances, then the CGQL estimates reduce to the CMM estimates and the covariance matrices given by Eqs. 4.41 and 4.42 would be the same. However, this assumption is impractical as the multinomial responses under different categories in general would exhibit different variances.

We further remark that because the formulas for the covariance matrices given by Eqs. 4.41 and 4.42 are, respectively, similar to those of the traditional OLS (ordinary least squares) and GLS (generalized least squares) regression estimators in a linear longitudinal regression setup, one may easily show that

$$ \begin{array}{@{}rcl@{}} &&\text{var}[\hat{\gamma}_{u, CGQL}] \leq \text{var}[\hat{\gamma}_{u, CMM}], \end{array} $$

for all the components of γ, i.e., for all \(u = 1,\ldots,(J-1)^{2}\). For a detailed proof, one may refer to Sutradhar (2011, Theorem 2.1, Chapter 2), for example. This, as indicated in Section 3.1.1 (see the comments prior to Eq. 3.3), demonstrates that the CGQL estimates are more efficient than the CMM estimates.
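The inequality can be illustrated numerically with the two covariance formulas. The following is a rough sketch under stated assumptions rather than a reproduction of the paper's efficiency study: the derivative matrices are random placeholders for \(\partial \eta^{\prime}_{[\ell]}/\partial \gamma\), each covariance matrix is block diagonal with multinomial-type blocks \(\text{diag}(\eta)-\eta\eta^{\prime}\) built from randomly generated probabilities, and all variable names are hypothetical. It checks that the CMM sandwich covariance minus the CGQL covariance is non-negative definite, which implies the component-wise variance inequality.

```python
import numpy as np

rng = np.random.default_rng(0)
J, T, n_units = 3, 4, 40
p = (J - 1) ** 2       # dimension of gamma
m = (J - 1) * (T - 1)  # length of the stacked response vector y*

def random_sigma():
    """Block-diagonal covariance with multinomial-type blocks (placeholder)."""
    S = np.zeros((m, m))
    for t in range(T - 1):
        e = np.exp(rng.normal(size=J - 1))
        eta = e / (1.0 + e.sum())  # probabilities summing to less than 1
        s = slice(t * (J - 1), (t + 1) * (J - 1))
        S[s, s] = np.diag(eta) - np.outer(eta, eta)
    return S

D = [rng.normal(size=(p, m)) for _ in range(n_units)]  # placeholder derivatives
S = [random_sigma() for _ in range(n_units)]

Q_star = sum(d @ d.T for d in D)                           # CMM "bread"
meat = sum(d @ s @ d.T for d, s in zip(D, S))              # CMM "meat"
Q = sum(d @ np.linalg.solve(s, d.T) for d, s in zip(D, S))  # CGQL information

cov_cmm = np.linalg.solve(Q_star, meat) @ np.linalg.inv(Q_star)
cov_cgql = np.linalg.inv(Q)

diff = cov_cmm - cov_cgql
min_eig = np.linalg.eigvalsh((diff + diff.T) / 2.0).min()
print(min_eig >= -1e-10)
print(np.all(np.diag(cov_cgql) <= np.diag(cov_cmm) + 1e-12))
```

The finite-sample matrices are used here in place of their limits; the ordering already holds for any fixed collection of derivative and covariance matrices, by the same argument that ranks GLS ahead of OLS.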


Cite this article

Rao, R.P., Sutradhar, B.C. Multiple Categorical Covariates-Based Multinomial Dynamic Response Model. Sankhya A 82, 186–219 (2020). https://doi.org/10.1007/s13171-019-00168-1


Keywords and phrases

  • Asymptotic properties
  • Covariates with possible interactions
  • Dynamic dependence parameters
  • Conditional generalized quasi-likelihood estimation
  • Lag 1 transitional multinomial probabilities
  • Unconditional method of moments

AMS (2000) subject classification

  • Primary 62H12, 62F12
  • Secondary 62F10