Abstract
The nested error regression (NER) model is a standard tool for analyzing unit-level data in small area estimation. In the standard NER model, both the random effects and the error terms are assumed to be normally distributed. However, when the data exhibit distributional asymmetry, the normality assumption is not appropriate. In this paper, we propose an NER model whose error terms follow skew-normal distributions. The Bayes estimator and the posterior variance are derived in simple forms. We also construct moment-based estimators of the model parameters. The resulting empirical Bayes (EB) estimator is assessed in terms of the conditional mean squared error, which can be estimated with second-order unbiasedness by parametric bootstrap methods. Through simulation and empirical studies, we compare the skew-normal model with the usual NER model and show that the proposed model yields a much more stable EB estimator when skewness is present.
References
Arellano-Valle, R. B., Bolfarine, H., & Lachos, V. H. (2005). Skew-normal linear mixed models. Journal of Data Science, 3, 415–438.
Arellano-Valle, R. B., Bolfarine, H., & Lachos, V. H. (2007). Bayesian inference for skew-normal linear mixed models. Journal of Applied Statistics, 34, 663–682.
Azzalini, A. (1985). A class of distributions which includes the normal ones. Scandinavian Journal of Statistics, 12, 171–178.
Azzalini, A. (1986). Further results on a class of distributions which includes the normal ones. Statistica, XLVI, 199–208.
Azzalini, A. (2013). The skew-normal and related families. Cambridge: Cambridge University Press.
Azzalini, A., & Capitanio, A. (1999). Statistical applications of the multivariate skew normal distribution. Journal of the Royal Statistical Society: Series B, 61, 579–602.
Battese, G. E., Harter, R. M., & Fuller, W. A. (1988). An error-components model for prediction of county crop areas using survey and satellite data. Journal of the American Statistical Association, 83, 28–36.
Booth, J. G., & Hobert, J. P. (1998). Standard errors of prediction in generalized linear mixed models. Journal of the American Statistical Association, 93, 262–272.
Butar, F. B., & Lahiri, P. (2003). On measures of uncertainty of empirical Bayes small-area estimators. Journal of Statistical Planning and Inference, 112, 63–76.
Diallo, M., & Rao, J. N. K. (2018). Small area estimation of complex parameters under unit-level models with skew-normal errors. Scandinavian Journal of Statistics, 45, 1092–1116.
Dunnett, C. W., & Sobel, M. (1955). Approximations to the probability integral and certain percentage points of a multivariate analogue of Student’s t-distribution. Biometrika, 42, 258–260.
Ferraz, V. R. S., & Moura, F. A. S. (2012). Small area estimation using skew normal models. Computational Statistics and Data Analysis, 56, 2864–2874.
Fuller, W. A., & Battese, G. E. (1973). Transformations for estimation of linear models with nested-error structure. Journal of the American Statistical Association, 68, 626–632.
Ghosh, M., & Rao, J. N. K. (1994). Small area estimation: an appraisal. Statistical Science, 9, 55–76.
Henze, N. (1986). A probabilistic representation of the “skew-normal” distribution. Scandinavian Journal of Statistics, 13, 271–275.
Pewsey, A. (2000). Problems of inference for Azzalini’s skew-normal distribution. Journal of Applied Statistics, 27, 859–870.
Pfeffermann, D. (2013). New important developments in small area estimation. Statistical Science, 28, 40–68.
Prasad, N. G. N., & Rao, J. N. K. (1990). The estimation of the mean squared error of small-area estimators. Journal of the American Statistical Association, 85, 163–171.
Rao, J. N. K., & Molina, I. (2015). Small area estimation (2nd ed.). Hoboken: Wiley.
Tallis, G. M. (1961). The moment generating function of the truncated multi-normal distribution. Journal of the Royal Statistical Society: Series B, 23, 223–229.
Acknowledgements
We would like to thank the Associate Editor and the two reviewers for many valuable comments and helpful suggestions, which led to an improved version of this paper. The research of the second author was supported in part by Grants-in-Aid for Scientific Research from the Japan Society for the Promotion of Science (#18K11188, #15H01943, and #26330036).
Appendix: Proofs
All the proofs of lemmas and theorems given in the paper are provided here.
Proof of expression (6)
We briefly explain the derivation of expression (6). The exponent in \(f(v_i,{{\varvec{u}}}_{1i}\mid {{\varvec{y}}}_i)\) is proportional to
which is rewritten as
The first part corresponds to the density \(\phi (v_i;\mu _{v_i}, {\sigma }_{v_i}^2)\). For simplicity, let \({{\varvec{J}}}_{n_i}=\mathbf{1 }_{n_i}\mathbf{1 }_{n_i}^\top \),
Then the second term can be expressed as \({\sigma }^2{{\varvec{u}}}_{1i}^\top {{\varvec{u}}}_{1i} - 2{{\varvec{c}}}_i^\top {{\varvec{u}}}_{1i}-d{{\varvec{u}}}_{1i}^\top {{\varvec{J}}}_{n_i}{{\varvec{u}}}_{1i}\). After completing the square, one gets \(\{{{\varvec{u}}}_{1i}-({\sigma }^2{{\varvec{I}}}_{n_i}-d{{\varvec{J}}}_{n_i})^{-1}{{\varvec{c}}}_i\}^\top ({\sigma }^2{{\varvec{I}}}_{n_i}-d{{\varvec{J}}}_{n_i})\{{{\varvec{u}}}_{1i}-({\sigma }^2{{\varvec{I}}}_{n_i}-d{{\varvec{J}}}_{n_i})^{-1}{{\varvec{c}}}_i\}\), which corresponds to the density \(\phi _{n_i}({{\varvec{u}}}_{1i};{\varvec{\mu }}_i,{\sigma }_{u_i}^2{{\varvec{R}}}_i)\). Thus, we have the expression given in (6). \(\square \)
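Since \({{\varvec{J}}}_{n_i}=\mathbf{1 }_{n_i}\mathbf{1 }_{n_i}^\top \) has rank one, the inverse \(({\sigma }^2{{\varvec{I}}}_{n_i}-d{{\varvec{J}}}_{n_i})^{-1}\) appearing in the completed square has a closed form via the Sherman–Morrison formula. The following sketch checks this identity numerically; the function name and parameter values are ours, chosen only for illustration (assuming \({\sigma }^2-n_id>0\) so that the matrix is positive definite).

```python
import numpy as np

def inv_sigma2I_minus_dJ(n, sigma2, d):
    # Sherman-Morrison: (sigma^2 I_n - d J_n)^{-1}
    #   = sigma^{-2} I_n + d / (sigma^2 * (sigma^2 - n*d)) J_n
    return np.eye(n) / sigma2 + d / (sigma2 * (sigma2 - n * d)) * np.ones((n, n))

n, sigma2, d = 5, 2.0, 0.3          # illustrative values with sigma2 - n*d > 0
A = sigma2 * np.eye(n) - d * np.ones((n, n))
closed_form = inv_sigma2I_minus_dJ(n, sigma2, d)
print(np.allclose(closed_form, np.linalg.inv(A)))   # True
```

This avoids forming and inverting the \(n_i\times n_i\) matrix explicitly, which is convenient when the within-area sample sizes differ across areas.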
Proof of Lemma 3.1
Following Tallis (1961), we have
where \({{\varvec{a}}}_{i(-k)}\) and \({{\varvec{w}}}_{i(-k)}\) are the \((n_i-1)\)-dimensional vectors obtained by dropping the kth elements of \({{\varvec{a}}}_i\) and \({{\varvec{w}}}_i\), respectively, \({{\varvec{w}}}_i^k=({{\varvec{w}}}_{i(-k)}+\rho _ia_{ik}\mathbf{1 }_{n_i-1})(1 - \rho _i^2)^{-1/2}\), and \({{\varvec{R}}}_i^k\) is the matrix of the partial correlation coefficients for \({{\varvec{w}}}_i\). Using results from Dunnett and Sobel (1955), we reduce the two multiple integrals in (19) to one-dimensional integrals. The denominator of the fraction in (19) is written as
where \({{\varvec{W}}}=(W_1,\ldots ,W_{n_i})^\top \sim \mathcal{N}_{n_i}(\mathbf{0 },{{\varvec{R}}}_i)\) with \(({{\varvec{R}}}_i)_{qr}=\rho _i\in [0,1)\) for \(q\ne r\). Thus \(W_j\) can be represented as
where for \(j=0,1,\ldots ,n_i\), \(\xi _j\)’s are mutually independently distributed as \(\mathcal{N}(0,1)\). This transformation gives
which corresponds to (8).
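This reduction to a one-dimensional integral can be illustrated numerically. The sketch below is our own illustration, not code from the paper: assuming the standard one-factor form \(W_j=\sqrt{\rho }\,\xi _0+\sqrt{1-\rho }\,\xi _j\) as in (20), it evaluates the equicorrelated multivariate normal CDF through a single integral and checks the result against the known trivariate orthant probability \(1/8+(3/4\pi )\arcsin \rho \).

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def equicorr_cdf(a, rho):
    """P(W_1 <= a_1, ..., W_n <= a_n) for W ~ N_n(0, R), R_qr = rho (q != r).

    Uses W_j = sqrt(rho)*xi_0 + sqrt(1-rho)*xi_j: conditionally on xi_0 = x
    the components are independent, so the n-fold integral collapses to a
    single integral over x.
    """
    a = np.asarray(a, dtype=float)
    s, t = np.sqrt(rho), np.sqrt(1.0 - rho)
    integrand = lambda x: norm.pdf(x) * np.prod(norm.cdf((a - s * x) / t))
    value, _ = quad(integrand, -8.0, 8.0)
    return value

rho = 0.5
exact = 0.125 + 3.0 / (4.0 * np.pi) * np.arcsin(rho)   # trivariate orthant probability
print(equicorr_cdf([0.0, 0.0, 0.0], rho), exact)
```

The same device, with the modified correlations \(\rho _i/(1+\rho _i)\) and \(\rho _i/(1+2\rho _i)\), handles the numerators in Lemmas 3.1 and 3.2.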
Similarly, we see that the numerator in (19) is
where \({{\varvec{W}}}^k=(W_1^k,\ldots ,W_{k-1}^k,W_{k+1}^k,\ldots ,W_{n_i}^k)^\top \sim \mathcal{N}_{n_i-1}(\mathbf{0 },{{\varvec{R}}}_i^k)\) with \(({{\varvec{R}}}_i^k)_{qr}=\rho _i/(1+\rho _i)\) for \(q\ne r\). Thus, analogously to (20), \(W_l^k\) can be expressed as
where \(\xi _l\)’s are mutually independent standard normal variables again. Then it follows that
which gives (9). \(\square \)
Proof of Theorem 3.1
It suffices to transform the second term of (7) into the desired form. Using Lemma 3.1 and the fact that \(({{\varvec{R}}}_i)_{jk}=\rho _i\) for \(j\ne k\), we have
Since
the formula (21) reduces to
which gives the desired expression. \(\square \)
Proof of Lemma 3.2
The outline is the same as in the proof of Lemma 3.1. Using the results of Tallis (1961) and Theorem 3.1, we have
where \({{\varvec{a}}}_{i(-q,r)}\) and \({{\varvec{w}}}_{i(-q,r)}\) are the \((n_i-2)\)-dimensional vectors obtained by dropping the qth and rth elements of \({{\varvec{a}}}_i\) and \({{\varvec{w}}}_i\), respectively,
and \({{\varvec{R}}}_i^{qr}\) is the matrix of the second-order partial correlation coefficients for \({{\varvec{w}}}_i\). The \((n_i-2)\)-fold integral in (23) is written as
where \({{\varvec{W}}}^{qr}\sim \mathcal{N}_{n_i-2}(\mathbf{0 },{{\varvec{R}}}_i^{qr})\) with \(({{\varvec{R}}}_i^{qr})_{st}=\rho _i/(1+2\rho _i)\) for \(s\ne t\). Here the definition of \({{\varvec{W}}}^{qr}\) is analogous to \({{\varvec{W}}}^q\) in the proof of Lemma 3.1. Then using the similar method to (20) with \(\rho _i/(1+2\rho _i)\) instead of \(\rho _i\), we have
where \(\xi _s\)’s are mutually independent standard normal random variables. Then we have
which corresponds to (12). \(\square \)
Proof of Theorem 3.2
It follows from Lemmas 3.1 and 3.2 that
The conditional variance of \(\sum _{j=1}^{n_i}w_{ij}\) given \({{\varvec{y}}}_i\) is
where \(\nu _i({\varvec{\omega }},{{\varvec{y}}}_i)\) is defined as (14). Then it follows from (11), (22) and (24) that
which proves Theorem 3.2. \(\square \)
Proof of Theorem 4.1
First, we derive the desired properties for the parameters \(\big ({\varvec{\beta }}_{{\varepsilon }}^\top ,{\sigma }^2,\tau ^2,{\lambda }\big )\). In general, consider the case that two estimators \({{\hat{{\theta }}}}_1\) of \({\theta }_1\) and \({{\hat{{\theta }}}}_2\) of \({\theta }_2\) have the forms
where \(h_{1i}({{\varvec{y}}}_i)\) and \(h_{2i}({{\varvec{y}}}_i)\) (written as \(h_{1i}\) and \(h_{2i}\) for simplicity) are functions of \({{\varvec{y}}}_i\) such that \(h_{ki} = O_p(1),\,E[h_{ki}] = O(1)\) for \(k = 1, 2\). Since \({{\varvec{y}}}_i\)’s are mutually independent, it is shown that
which means that it is enough to obtain the required results for the unconditional expectations. Note that all the moments of a skew-normal distribution exist. Since the methods of estimating \({\varvec{\beta }}_{\varepsilon }\) and \(m_2\) are the same as those of the usual NER model, it follows from the results of Fuller and Battese (1973) that
The last formula comes from the unbiasedness of \({{\widehat{m}}}_2\).
Next we treat \({{\widehat{m}}}_3\). Let \({{\widehat{{\varvec{\beta }}}}}_2^{FE}\) be the OLS estimator obtained by regressing \(y_{ij}\) on \({\tilde{{{\varvec{z}}}}}_{ij}\). Then, \({{\widehat{{\varvec{\beta }}}}}_2^{FE}\) is written as \({{\widehat{{\varvec{\beta }}}}}_2^{FE} = {\varvec{\beta }}_2 + (\sum _{i=1}^m\sum _{j=1}^{n_i}{\tilde{{{\varvec{z}}}}}_{ij}{\tilde{{{\varvec{z}}}}}_{ij}^\top )^{-1} \sum _{i=1}^m \sum _{j=1}^{n_i} {\tilde{{{\varvec{z}}}}}_{ij}{\tilde{{\varepsilon }}}_{ij}\). It follows from (RC) and \({{\widehat{{\varvec{\beta }}}}}_2^{FE} - {\varvec{\beta }}_2 = O_p(m^{-1/2})\) that
Since \({\tilde{{\varepsilon }}}_{ij}\)’s are independent for different i and \(E[{\tilde{{\varepsilon }}}_{ij}] = 0\), the bias of \({{\widehat{m}}}_3\) is
which is of order \(O(m^{-1})\). From the fact that \({{\widehat{m}}}_3 = \eta _1^{-1}\sum _{i=1}^m\sum _{j=1}^{n_i}{\tilde{{\varepsilon }}}_{ij}^3 + O_p(m^{-1})\) and \(\eta _1^{-1}\sum _{i=1}^m\sum _{j=1}^{n_i}{\tilde{{\varepsilon }}}_{ij}^3 = O_p(m^{-1/2})\), it follows that
The expectation in (25) is bounded under the condition (RC) and the existence of moments of \({\varepsilon }_{ij}\) up to the sixth order, which leads to \(E[({{\widehat{m}}}_3 - m_3)^2] = O(m^{-1})\). Then we have \({{\widehat{m}}}_2 - m_2 = O_p(m^{-1/2})\) and \({{\widehat{m}}}_3 - m_3 = O_p(m^{-1/2})\). Also, \(E[({{\widehat{m}}}_2 - m_2)({{\widehat{m}}}_3 - m_3)]\) can be bounded by Schwarz's inequality as
The inverse transformation of \((m_2({\sigma }^2,{\lambda }), m_3({\sigma }^2,{\lambda }))\) is derived from (1) and (2) as
Since \({\lambda }\ne 0\), we have \({\delta }\ne 0\) and \(m_3 \ne 0\), and it is easy to check that these functions are three times continuously differentiable. Thus, using the Taylor series expansion we have
which, together with the results obtained up to this point, gives
Concerning the truncated estimator \({{\widehat{{\delta }}}}=\max (-1+1/m, \min ({{\widetilde{{\delta }}}}, 1-1/m))\), we consider the case of \(0<{\delta }<1\). For large m, we have \(1-1/m-{\delta }>0\). Then, \(\Pr ({{\widetilde{{\delta }}}}>1-1/m)=\Pr ({{\widetilde{{\delta }}}}-{\delta }> 1-1/m-{\delta })\le E[({{\widetilde{{\delta }}}}-{\delta })^2]/(1-1/m-{\delta })^2\), so that \(\Pr ({{\widetilde{{\delta }}}}>1-1/m)=O(m^{-1})\). This shows the consistency of \({{\widehat{{\delta }}}}\). Using the same arguments as above, we can show that \(E[{{\widehat{{\delta }}}}-{\delta }]=O(m^{-1})\) and \(E[({{\widehat{{\delta }}}}-{\delta })^2]=O(m^{-1})\). These results lead to the asymptotic properties of \({{\widehat{{\lambda }}}}\).
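As a concrete illustration of this truncation, the following sketch (our own, with hypothetical inputs) maps a raw moment estimate \({{\widetilde{{\delta }}}}\), which may fall outside \((-1,1)\), to \({{\widehat{{\delta }}}}\), and then to \({{\widehat{{\lambda }}}}={{\widehat{{\delta }}}}/\sqrt{1-{{\widehat{{\delta }}}}^2}\), the inverse of \({\delta }={\lambda }/\sqrt{1+{\lambda }^2}\).

```python
import math

def truncate_delta(delta_tilde, m):
    # hat-delta = max(-1 + 1/m, min(tilde-delta, 1 - 1/m)):
    # keeps the estimate strictly inside (-1, 1), so lambda-hat
    # below is always well defined
    return max(-1.0 + 1.0 / m, min(delta_tilde, 1.0 - 1.0 / m))

def lambda_from_delta(delta):
    # inverse of delta = lambda / sqrt(1 + lambda^2)
    return delta / math.sqrt(1.0 - delta ** 2)

m = 50
for raw in (0.3, 1.2, -2.0):   # interior, and out of range on each side
    d = truncate_delta(raw, m)
    print(raw, d, lambda_from_delta(d))
```

The truncation boundary \(1-1/m\) shrinks toward 1 as m grows, so the truncation becomes inactive with probability tending to one whenever \(|{\delta }|<1\).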
Concerning \({{\hat{\tau }}}^2\), we have \(E[({\tilde{\tau }}^2 - \tau ^2)^2] = O(m^{-1})\), because \((v_i,\, {\varepsilon }_{ij})\)’s are independent for different i and all the moments of \(v_i\) and \({\varepsilon }_{ij}\) exist. Following Prasad and Rao (1990),
which is of order \(O(m^{-1})\). Then we have
which is of order \(O(m^{-1})\). Also, it follows from \(E[{\tilde{\tau }}^2 - \tau ^2] = 0\) that
As for the first term,
which is of order \( O(m^{-1})\). Thus, together with (27), we obtain \(E[{{\hat{\tau }}}^2 - \tau ^2] = O(m^{-1})\).
We have derived the desired properties in the unconditional case, so that, by the argument at the beginning of the proof, the statement of the theorem holds for \(\big ({\varvec{\beta }}_{\varepsilon }^\top ,{\sigma }^2,\tau ^2,{\lambda }\big )\). Lastly, we need to consider \({\beta }_0\) instead of \({\beta }_{0{\varepsilon }}\). Since \(\mu _{\varepsilon }= {\sigma }\sqrt{2/\pi }{\lambda }/\sqrt{1 + {\lambda }^2}\) is a three times continuously differentiable function of \(({\sigma }^2,{\lambda })\), it follows that
Using this expansion, the desired results can be easily obtained.
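The correction term \(\mu _{\varepsilon }\) is simple to compute in practice. The sketch below (the helper name is ours) evaluates \(\mu _{\varepsilon }={\sigma }\sqrt{2/\pi }\,{\lambda }/\sqrt{1+{\lambda }^2}\) for illustrative values and checks it against the skew-normal mean implemented in scipy.

```python
import numpy as np
from scipy.stats import skewnorm

def mu_eps(sigma2, lam):
    # mean of SN(0, sigma^2, lambda): sigma * sqrt(2/pi) * delta,
    # with delta = lambda / sqrt(1 + lambda^2)
    delta = lam / np.sqrt(1.0 + lam ** 2)
    return np.sqrt(sigma2) * np.sqrt(2.0 / np.pi) * delta

sigma2, lam = 2.0, 3.0   # illustrative values
print(mu_eps(sigma2, lam))
print(skewnorm.mean(a=lam, scale=np.sqrt(sigma2)))   # agrees
```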
It remains to show the expectations of cross terms are \(O(m^{-1})\). Analogously to (26), this can be achieved by Schwarz’s inequality, and the proof is complete. \(\square \)
Proof of Proposition 4.1
From Theorem 4.1, \(g_{1i}({{\widehat{{\varvec{\omega }}}}}, {{\varvec{y}}}_i)\) can be expanded as \(g_{1i}({{\widehat{{\varvec{\omega }}}}},{{\varvec{y}}}_i) = g_{1i}({\varvec{\omega }},{{\varvec{y}}}_i) + G_{1i}({{\widehat{{\varvec{\omega }}}}},{\varvec{\omega }},{{\varvec{y}}}_i) + O_p(m^{-3/2})\) where
Thus we have \(E[g_{1i}({{\widehat{{\varvec{\omega }}}}},{{\varvec{y}}}_i) {\,|\,} {{\varvec{y}}}_i] = g_{1i}({\varvec{\omega }},{{\varvec{y}}}_i) + E[G_{1i}({{\widehat{{\varvec{\omega }}}}},{\varvec{\omega }},{{\varvec{y}}}_i) {\,|\,} {{\varvec{y}}}_i] + o_p(m^{-1})\). It follows from Theorem 4.1 that \(E[G_{1i}({{\widehat{{\varvec{\omega }}}}},{\varvec{\omega }},{{\varvec{y}}}_i){\,|\,}{{\varvec{y}}}_i] = O_p(m^{-1})\), so that applying the same arguments as in Butar and Lahiri (2003) shows \(E[{{\hat{g}}}_{1i} {\,|\,} {{\varvec{y}}}_i] = g_{1i}({\varvec{\omega }},{{\varvec{y}}}_i) + o_p(m^{-1})\). Also, using Theorem 4.1 again, it can be seen that \(E[{{\hat{g}}}_{2i} {\,|\,} {{\varvec{y}}}_i] = g_{2i}({\varvec{\omega }},{{\varvec{y}}}_i) + o_p(m^{-1})\). Then the proposition can be immediately obtained. \(\square \)
Tsujino, T., Kubokawa, T. Empirical Bayes methods in nested error regression models with skew-normal errors. Jpn J Stat Data Sci 2, 375–403 (2019). https://doi.org/10.1007/s42081-019-00038-y