Abstract
We investigate the errors-in-covariates problem in a generalized partially linear model. Unlike the existing literature (Ma and Carroll in J Am Stat Assoc 101:1465–1474, 2006), we consider the case where measurement error occurs in the covariate that enters the model nonparametrically, while the precisely observed covariates enter the model parametrically. To avoid deconvolution-type operations, which can suffer from very slow convergence rates, we use a B-spline representation to approximate the nonparametric function and convert the problem into a parametric form for operational purposes. We then use a parametric working model in place of the distribution of the unobservable variable and devise an estimating equation to estimate both the model parameters and the functional dependence of the response on the latent variable. The estimation procedure is devised under the functional model framework, without assuming any distributional structure for the latent variable. We further derive the large-sample properties of our estimator. Numerical simulation studies are carried out to evaluate the finite-sample performance, and the practical performance of the method is illustrated through a data example.
References
Apanasovich TV, Carroll RJ, Maity A (2009) SIMEX and standard error estimation in semiparametric measurement error models. Electron J Stat 3:318–348
Buonaccorsi JP (2010) Measurement error: models, methods, and applications. Chapman & Hall/CRC, New York
Carroll RJ, Fan J, Gijbels I, Wand MP (1997) Generalized partially linear single-index models. J Am Stat Assoc 92:477–489
Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu C (2006) Measurement error in nonlinear models: a modern perspective, 2nd edn. CRC Press, London
Huang Y, Wang CY (2001) Consistent functional methods for logistic regression with errors in covariates. J Am Stat Assoc 96:1469–1482
Jiang F, Ma Y (2018) A spline-assisted semiparametric approach to nonparametric measurement error models. arXiv:1804.00793
Liang H, Ren H (2005) Generalized partially linear measurement error models. J Comput Graph Stat 14:237–250
Liang H, Thurston SW (2008) Additive partial linear models with measurement errors. Biometrika 95:667–678
Liang H, Qin Y, Zhang X, Ruppert D (2009) Empirical likelihood-based inferences for generalized partially linear models. Scand J Stat 36:433–443
Liu L (2007) Estimation of generalized partially linear models with measurement error using sufficiency scores. Stat Probab Lett 77:1580–1588
Liu J, Ma Y, Zhu L, Carroll RJ (2017) Estimation and inference of error-prone covariate effect in the presence of confounding variables. Electron J Stat 11:480–501
Ma Y, Carroll RJ (2006) Locally efficient estimators for semiparametric models with measurement error. J Am Stat Assoc 101:1465–1474
Ma Y, Tsiatis AA (2006) Closed form semiparametric estimators for measurement error models. Stat Sin 16:183–193
Stefanski LA, Carroll RJ (1985) Covariate measurement error in logistic regression. Ann Stat 13:1335–1351
Stefanski LA, Carroll RJ (1987) Conditional scores and optimal scores in generalized linear measurement error models. Biometrika 74:703–716
Tsiatis AA, Ma Y (2004) Locally efficient semiparametric estimators for functional measurement error models. Biometrika 91:835–848
Xu K, Ma Y (2015) Instrument assisted regression for errors in variables models with binary response. Scand J Stat 42:104–117
Yi G, Ma Y, Spiegelman D, Carroll RJ (2015) Functional and structural methods with mixed measurement error and misclassification in covariates. J Am Stat Assoc 110:681–696
Yu Y, Ruppert D (2002) Penalized spline estimation for partially linear single-index models. J Am Stat Assoc 97:1042–1054
Additional information
Yang’s research was supported by the National Natural Science Foundation of China Grants 11471086 and 11871173, the National Social Science Foundation of China Grant 16BTJ032, the National Statistical Scientific Research Center Projects 2015LD02, and the Fundamental Research Funds for the Central Universities 19JNYH08. Ma’s work is partially supported by NSF and NIH.
Appendix
A.1 Proof of Theorem 1
From the definitions of \(\mathbf{S}_{\mathrm{eff}}^*(Y_i, W_i,\mathbf{Z}_i, {\varvec{\delta }}, g)\) and \({\mathbf{S}_{\mathrm{res}}}_2^*(Y_i, W_i, \mathbf{Z}_i, {\varvec{\delta }},{\varvec{\gamma }})\), we have
where the subscript \(a\), here and throughout the text, stands for “approximate,” and \(E_a\) indicates the expectation calculated with \(g(\cdot )\) replaced by the approximate model \(\mathbf{B}(\cdot )^{\mathrm{T}}{\varvec{\gamma }}_0\). Taking another expectation, we get
Using Condition \((\mathrm {C}6)\), we further get
component-wise. Condition \((\mathrm {C}7)\) ensures that \([E\{\mathbf{S}_{\mathrm{eff}}^*(Y_i, W_i, \mathbf{Z}_i, {\varvec{\delta }}, {\varvec{\gamma }})\}^{\mathrm{T}}, E\{{\mathbf{S}_{\mathrm{res}}}_2^*(Y_i, W_i, \mathbf{Z}_i, {\varvec{\delta }}, {\varvec{\gamma }})\}^{\mathrm{T}}]^{\mathrm{T}}\) is invertible near its zero \(\varvec{\theta }^*\) as a vector function of \(\varvec{\theta }\), and the first derivative of the inverse function is bounded in the neighborhood of \(\varvec{\theta }^*\). Therefore, \(\Vert \varvec{\theta }^* - \varvec{\theta }_0\Vert _2 = o_p(1)\). On the other hand, since
we have
element-wise. Using exactly the same argument as above, we can also obtain \(\Vert \widehat{\varvec{\theta }}_n -\varvec{\theta }^*\Vert _2=o_p(1)\). Hence, combining the two results, we get \(\Vert \widehat{\varvec{\theta }}_n - \varvec{\theta }_0\Vert _2=o_p(1)\). \(\square \)
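The proof above relies on replacing \(g(\cdot )\) by the spline model \(\mathbf{B}(\cdot )^{\mathrm{T}}{\varvec{\gamma }}_0\) with an approximation error that vanishes as the knot spacing \(h_b\) shrinks. The following is a minimal numerical sketch of that idea only, not the paper's estimator. It assumes linear B-splines on equally spaced knots over \([0,1]\), coefficients set to the values of \(g\) at the knots, and a test function \(g(x)=\sin (2\pi x)\); the function names, the spline order, and the test function are all hypothetical choices made for illustration. For this linear case, the uniform error of a twice-differentiable function is known to be of order \(h_b^2\).

```python
import math


def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions of the given degree at x
    using the Cox-de Boor recursion. Returns the vector B(x) as a list."""
    # Index of the last non-empty knot interval, closed on the right so that
    # the basis is defined at the right boundary.
    last = max(j for j in range(len(knots) - 1) if knots[j] < knots[j + 1])
    b = [
        1.0 if (knots[j] <= x < knots[j + 1]) or (x == knots[-1] and j == last) else 0.0
        for j in range(len(knots) - 1)
    ]
    for d in range(1, degree + 1):
        nxt = []
        for j in range(len(knots) - d - 1):
            left = right = 0.0
            if knots[j + d] > knots[j]:
                left = (x - knots[j]) / (knots[j + d] - knots[j]) * b[j]
            if knots[j + d + 1] > knots[j + 1]:
                right = (knots[j + d + 1] - x) / (knots[j + d + 1] - knots[j + 1]) * b[j + 1]
            nxt.append(left + right)
        b = nxt
    return b


def g(x):
    """Hypothetical smooth test function standing in for the unknown g."""
    return math.sin(2.0 * math.pi * x)


def linear_spline_sup_error(func, n_intervals, n_grid=1001):
    """Sup-norm error on a grid of B(x)^T gamma, with linear B-splines and
    gamma_j = func(knot_j), i.e. the piecewise-linear interpolant of func."""
    interior = [j / n_intervals for j in range(n_intervals + 1)]
    knots = [0.0] + interior + [1.0]      # clamped knot vector for degree 1
    gamma = [func(t) for t in interior]   # illustrative coefficient choice
    worst = 0.0
    for i in range(n_grid):
        x = i / (n_grid - 1)
        basis = bspline_basis(x, knots, 1)
        approx = sum(bj * gj for bj, gj in zip(basis, gamma))
        worst = max(worst, abs(func(x) - approx))
    return worst


if __name__ == "__main__":
    for m in (10, 20, 40):
        print(m, linear_spline_sup_error(g, m))
```

Halving the knot spacing should reduce the reported error by roughly a factor of four in this linear case, which is the pattern the \(O(h_b^q)\) terms in the proofs describe for a general spline order.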
A.2 Proof of Theorem 2
We first write
where
where
and \(\widetilde{{\varvec{\delta }}}_n\) is on the line connecting \({\varvec{\delta }}_0\) and \(\widehat{{\varvec{\delta }}}_n\).
We further expand \(\mathbf{T}_1\) as a function of \(\widehat{{\varvec{\gamma }}}_n({\varvec{\delta }}_0)\) about \({\varvec{\gamma }}_0({\varvec{\delta }}_0)\) to obtain
where
and \(\widetilde{{\varvec{\gamma }}}_n({\varvec{\delta }}_0)\) is on the line connecting \(\widehat{{\varvec{\gamma }}}_n({\varvec{\delta }}_0)\) and \({\varvec{\gamma }}_0({\varvec{\delta }}_0)\).
By the consistency of \(\mathbf{B}(x)^{\mathrm{T}}\widetilde{{\varvec{\gamma }}}_n\) for \(g(x)\), which follows from Condition \((\mathrm {C}6)\) and Theorem 1, and by the weak law of large numbers, for an arbitrary \(d_{{\varvec{\gamma }}}\times p\) matrix \(\mathbf{G}\) with \(\Vert \mathbf{G}\Vert _2 = 1\), we have
where
Here, like before, \(f(y_i, w_i, \mathbf{z}_i, {\varvec{\delta }}_0, {\varvec{\gamma }}, f_X)\) stands for \(f(y_i, w_i, \mathbf{z}_i, {\varvec{\delta }}_0, g, f_X)\) with \(g(\cdot )\) replaced by \(\mathbf{B}(\cdot )^{\mathrm{T}}{\varvec{\gamma }}\), and \(\mathbf{S}_{a, {\varvec{\gamma }}}(y_i, w_i, \mathbf{z}_i, {\varvec{\delta }}_0, {\varvec{\gamma }}_0)\equiv \partial \log f(y_i, w_i, \mathbf{z}_i, {\varvec{\delta }}_0, {\varvec{\gamma }}, f_X)/\partial {\varvec{\gamma }}\). The second equality holds by condition \((\mathrm {C}6)\).
The third equality holds because \(\Vert \partial \mathbf{S}_{\mathrm{eff}}^*(y_i, w_i,\mathbf{z}_i, {\varvec{\delta }}_0, {\varvec{\gamma }}_0)/\partial {\varvec{\gamma }}_0^{\mathrm{T}}\Vert _{\infty }\) is integrable by condition \((\mathrm {C}8)\) and \(f(y_i, w_i, \mathbf{z}_i, {\varvec{\delta }}_0,{\varvec{\gamma }}_0,f_X)\) is absolutely integrable. The fourth equality holds also by condition \((\mathrm {C}6)\). The fifth equality holds because \(E\{\mathbf{S}_{\mathrm{eff}}^*(y_i, w_i, \mathbf{z}_i, {\varvec{\delta }}, g)\} = \mathbf{0}\). For the last equality, we note that
By Condition \((\mathrm {C}6)\) and definitions of \(\varLambda _{g}\) and \(\varLambda _{a, {\varvec{\gamma }}}\), for any \(d_{{\varvec{\gamma }}} \times p\) matrix \(\mathbf{G}\), there exists a function \(\mathbf{h}(y_i, w_i, \mathbf{z}_i, {\varvec{\delta }}_0, g) \equiv E[s\{y_i, \mathbf{z}_i^{\mathrm{T}}{\varvec{\beta }}_0+g(X),{\varvec{\alpha }}_0\} \mathbf{G}^{\mathrm{T}}\mathbf{B}(X) \mid y_i,w_i,\mathbf{z}_i] \in \varLambda _{g}\) such that
Further, \(\mathbf{S}_{\mathrm{eff}}^*(y_i, w_i, \mathbf{z}_i, {\varvec{\delta }}_0, g)\) is orthogonal to every function in \(\varLambda _{g}\), so the last equality holds. Hence, we obtain \(\Vert \mathbf{T}_{12}\{\widetilde{{\varvec{\gamma }}}_n({\varvec{\delta }}_0)\} \Vert _2= O_p(h_b^q) \).
Based on the asymptotic results of Proposition 4 in Jiang and Ma (2018), we have \(\Vert \widehat{{\varvec{\gamma }}}_n({\varvec{\delta }}_0) - {\varvec{\gamma }}_0({\varvec{\delta }}_0)\Vert _2 = O_p\{(nh_b)^{-1/2}\}\). Then, we have
Further, by \((\mathrm {C}6)\) we have \(\mathbf{T}_{11} = n^{-1/2}\sum _{i=1}^{n}\mathbf{S}_{\mathrm{eff}}^*(Y_i, W_i, \mathbf{Z}_i, {\varvec{\delta }}_0, g)+O_p(n^{1/2}h_b^q)\). Since \(h_b^{q-1/2} = o_p(n^{1/2}h_b^q)\), and \(n^{1/2}h_b^q= o_p(1)\) by conditions \((\mathrm {C}4)\) and \((\mathrm {C}5)\), then
We next consider each term in \(\mathbf{T}_2(\widetilde{{\varvec{\delta }}}_n)\). Since \(\widehat{{\varvec{\gamma }}}_n(\cdot )\) satisfies
for any \({\varvec{\delta }}\),
Then,
where
Hence,
By the consistency of \(\widetilde{{\varvec{\delta }}}_n\) to \({\varvec{\delta }}_0\) and \(\mathbf{B}(x)^{\mathrm{T}}\widehat{{\varvec{\gamma }}}_n\) to g(x), we have
and
From (A.1), we also have
Based on the proof of Proposition 4 in Jiang and Ma (2018), we have \(\Vert \mathbf{T}_{23}(\widetilde{{\varvec{\delta }}}_n)^{-1}\Vert _2 = O_p(h_b^{-1})\). Therefore, we have \(\mathbf{T}_{22}(\widetilde{{\varvec{\delta }}}_n) \{\mathbf{T}_{23}(\widetilde{{\varvec{\delta }}}_n)\}^{-1} \mathbf{T}_{24}(\widetilde{{\varvec{\delta }}}_n)=O_p(h_b^{q-1})\), where \(q>1\) by condition \((\mathrm {C}2)\). Thus,
Therefore,
Since \(n^{-1/2}\sum _{i=1}^{n}\mathbf{S}_{\mathrm{eff}}^*(Y_i, W_i, \mathbf{Z}_i, {\varvec{\delta }}_0, g)\) is a normalized sum of zero-mean random vectors, the central limit theorem implies that it converges in distribution to a multivariate normal distribution with mean \(\mathbf{0}\) and covariance matrix \(\mathbf{V}\) given in Theorem 2. \(\square \)
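As a check on the rate bookkeeping used in the proof of Theorem 2, the orders of the remainder terms can be compared directly. Conditions \((\mathrm {C}4)\) and \((\mathrm {C}5)\) are not reproduced in this appendix, so the choice \(h_b = n^{-c}\) below is a hypothetical illustration rather than the paper's stated condition.

\[
\frac{h_b^{q-1/2}}{n^{1/2}h_b^{q}} = (nh_b)^{-1/2}, \qquad n^{1/2}h_b^{q}\Big |_{h_b=n^{-c}} = n^{1/2-cq}, \qquad nh_b\Big |_{h_b=n^{-c}} = n^{1-c}.
\]

Hence \(h_b^{q-1/2}\) is of smaller order than \(n^{1/2}h_b^q\) whenever \(nh_b\rightarrow \infty \), and under the illustrative choice both \(n^{1/2}h_b^q\rightarrow 0\) and \(nh_b\rightarrow \infty \) hold exactly when \(1/(2q)< c < 1\). For example, \(q=4\) and \(c=1/5\) give \(n^{1/2}h_b^q = n^{-3/10}\) and \(nh_b = n^{4/5}\).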
Cite this article
Wang, Q., Ma, Y. & Yang, G. Locally efficient estimation in generalized partially linear model with measurement error in nonlinear function. TEST 29, 553–572 (2020). https://doi.org/10.1007/s11749-019-00668-0
DOI: https://doi.org/10.1007/s11749-019-00668-0
Keywords
- B-splines
- Efficient score
- Errors in variables
- Generalized linear models
- Instrumental variables
- Measurement errors
- Partially linear models
- Semiparametrics