Estimation of the volume under the ROC surface in presence of nonignorable verification bias

To Duc, Khanh; Chiogna, Monica; Adimari, Gianfranco

doi:10.1007/s10260-019-00451-3

Original Paper
Published: 28 January 2019

Volume 28, pages 695–722, (2019)
Statistical Methods & Applications

Abstract

The volume under the receiver operating characteristic surface (VUS) is useful for measuring the overall accuracy of a diagnostic test when the possible disease status belongs to one of three ordered categories. In medical studies, the VUS of a new test is typically estimated through a sample of measurements obtained by some suitable sample of patients. However, in many cases, only a subset of such patients has the true disease status assessed by a gold standard test. In this paper, for a continuous-scale diagnostic test, we propose four estimators of the VUS which accommodate for nonignorable missingness of the disease status. The estimators are based on a parametric model which jointly describes both the disease and the verification process. Identifiability of the model is discussed. Consistency and asymptotic normality of the proposed estimators are shown, and variance estimation is discussed. The finite-sample behavior is investigated by means of simulation experiments. An illustration is provided.

References

Baker SG (1995) Evaluating multiple diagnostic tests with partial verification. Biometrics 51(1):330–337
Chi YY, Zhou XH (2008) Receiver operating characteristic surfaces in the presence of verification bias. J R Stat Soc Ser C (Appl Stat) 57(1):1–23
Fluss R, Reiser B, Faraggi D, Rotnitzky A (2009) Estimation of the ROC curve under verification bias. Biom J 51(3):475–490
Fluss R, Reiser B, Faraggi D (2012) Adjusting ROC curve for covariates in the presence of verification bias. J Stat Plan Inference 142(1):1–11
Kang L, Tian L (2013) Estimation of the volume under the ROC surface with three ordinal diagnostic categories. Comput Stat Data Anal 62:39–51
Li J, Zhou XH (2009) Nonparametric and semiparametric estimation of the three way receiver operating characteristic surface. J Stat Plan Inference 139(12):4133–4142
Little RJ, Rubin DB (2002) Statistical analysis with missing data. Wiley, New York
Liu D, Zhou XH (2010) A model for adjusting for nonignorable verification bias in estimation of the ROC curve and its area with likelihood-based approach. Biometrics 66(4):1119–1128
Nakas CT, Yiannoutsos CT (2004) Ordered multiple-class ROC analysis with continuous measurements. Stat Med 23(22):3437–3449
Rotnitzky A, Faraggi D, Schisterman E (2006) Doubly robust estimation of the area under the receiver-operating characteristic curve in the presence of verification bias. J Am Stat Assoc 101(475):1276–1288
Scurfield BK (1996) Multiple-event forced-choice tasks in the theory of signal detectability. J Math Psychol 40(3):253–269
To Duc K (2017) bcROCsurface: an R package for correcting verification bias in estimation of the ROC surface and its volume for continuous diagnostic tests. BMC Bioinform 18(1):503
To Duc K, Chiogna M, Adimari G (2016) Bias-corrected methods for estimating the receiver operating characteristic surface of continuous diagnostic tests. Electron J Stat 10(2):3063–3113
van der Vaart AW (2000) Asymptotic statistics. Cambridge University Press, Cambridge
Xiong C, van Belle G, Miller JP, Morris JC (2006) Measuring and estimating diagnostic accuracy when there are three ordinal diagnostic groups. Stat Med 25(7):1251–1273
Zhang Y, Alonzo TA (2018) Estimation of the volume under the receiver-operating characteristic surface adjusting for non-ignorable verification bias. Stat Methods Med Res 27(3):715–739
Zhou XH, Castelluccio P (2003) Nonparametric analysis for the ROC areas of two diagnostic tests in the presence of nonignorable verification bias. J Stat Plan Inference 115(1):193–213
Zhou XH, Castelluccio P (2004) Adjusting for non-ignorable verification bias in clinical studies for Alzheimer’s disease. Stat Med 23(2):221–230
Zhou XH, Rodenberg CA (1998) Estimating an ROC curve in the presence of nonignorable verification bias. Commun Stat 27(3):273–285

Acknowledgements

The authors thank the Alzheimer’s Disease Neuroimaging Initiative research group for kindly permitting access to the data analyzed in this paper. Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense Award Number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

Author information

Authors and Affiliations

Department of Statistical Sciences, University of Padova, Via Cesare Battisti, 241, 35121, Padova, Italy
Khanh To Duc & Gianfranco Adimari
Department of Statistical Sciences “Paolo Fortunati”, University of Bologna, Via Belle Arti, 41, 40126, Bologna, Italy
Monica Chiogna

Authors

Khanh To Duc
Monica Chiogna
Gianfranco Adimari

Consortia

for the Alzheimer’s Disease Neuroimaging Initiative

Corresponding author

Correspondence to Monica Chiogna.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

Appendices

Appendix 1

Proofs

Proof of Theorem 1

It can be shown that ${\mathbb {E}}\{G_{i\ell r,*}(\mu _0,{\varvec{\xi }}_0)\} = 0$ (see Appendix 2). Then $e_*(\mu _0,{\varvec{\xi }}_0) = 0$ and, by condition (C2) and the implicit function theorem, there exists a neighborhood of ${\varvec{\xi }}_0$ in which a continuously differentiable function $m({\varvec{\xi }})$ is uniquely defined such that $m({\varvec{\xi }}_0) = \mu _0$ and $e_*(m({\varvec{\xi }}),{\varvec{\xi }}) = 0$. Since the maximum likelihood estimator $\hat{{\varvec{\xi }}}$ is consistent, i.e., $\hat{{\varvec{\xi }}} {\mathop {\rightarrow }\limits ^{p}} {\varvec{\xi }}_0$, we have that ${{\tilde{\mu }}}_* = m(\hat{{\varvec{\xi }}}){\mathop {\rightarrow }\limits ^{p}} \mu _0$. On the other hand, $G_*({{\hat{\mu }}}_*, \hat{{\varvec{\xi }}}) = 0$ and condition (C3) imply that $e_*({{\hat{\mu }}}_*,\hat{{\varvec{\xi }}}){\mathop {\rightarrow }\limits ^{p}} 0$. Thus, ${\hat{\mu }}_* {\mathop {\rightarrow }\limits ^{p}} {{\tilde{\mu }}}_*$, and therefore ${\hat{\mu }}_* {\mathop {\rightarrow }\limits ^{p}} \mu _0$. $\square $

Proof of Theorem 2

We have

$$\begin{aligned} 0= & {} \sqrt{n}G_{*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) \\ 0= & {} \sqrt{n}G_{*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) + \sqrt{n}e_*({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) - \sqrt{n}e_*({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}). \end{aligned}$$

Since $e_*(\mu _0,{\varvec{\xi }}_0) = 0$, we get

$$\begin{aligned} 0= & {} \sqrt{n}G_{*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) + \sqrt{n}e_*({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) - \sqrt{n}e_*({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) + \sqrt{n}e_*(\mu _0,{\varvec{\xi }}_0) - \sqrt{n}e_*(\mu _0,{\varvec{\xi }}_0) \\= & {} \sqrt{n}\left\{ G_{*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) - e_*({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) \right\} + \sqrt{n}\left\{ e_*({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) - e_*(\mu _0,{\varvec{\xi }}_0)\right\} + \sqrt{n}e_*(\mu _0,{\varvec{\xi }}_0) \\&- \, \sqrt{n}G_{*}(\mu _0,{\varvec{\xi }}_0) + \sqrt{n}G_{*}(\mu _0,{\varvec{\xi }}_0) \\= & {} \left[ \sqrt{n}\left\{ G_{*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) - e_*({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) \right\} - \sqrt{n}\left\{ G_{*}(\mu _0,{\varvec{\xi }}_0) - e_*(\mu _0,{\varvec{\xi }}_0)\right\} \right] \\&+ \, \sqrt{n}\left\{ e_*({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) - e_*(\mu _0,{\varvec{\xi }}_0)\right\} + \sqrt{n}G_{*}(\mu _0,{\varvec{\xi }}_0). \end{aligned}$$

Condition (C1) implies that the first term on the right-hand side of the last identity is $o_p(1)$. Using a first-order Taylor expansion of $e_*$ around $(\mu _0,{\varvec{\xi }}_0)$, we have

$$\begin{aligned} 0= & {} o_p(1) + \sqrt{n}\left\{ e_*({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) - e_*(\mu _0,{\varvec{\xi }}_0)\right\} + \sqrt{n}G_{*}(\mu _0,{\varvec{\xi }}_0) \nonumber \\= & {} o_p(1) + \sqrt{n}({\hat{\mu }}_{*} - \mu _0) \frac{\partial e_*(\mu ,{\varvec{\xi }}_0)}{\partial \mu }\Bigg |_{\mu = \mu _0} \nonumber \\&+ \, \sqrt{n}(\hat{{\varvec{\xi }}} - {\varvec{\xi }}_0)\frac{\partial e_*(\mu _0,{\varvec{\xi }})}{\partial {\varvec{\xi }}^\top }\Bigg |_{{\varvec{\xi }} = {\varvec{\xi }}_0} + \sqrt{n}G_{*}(\mu _0,{\varvec{\xi }}_0) . \end{aligned}$$

(18)

It is straightforward to show that

$$\begin{aligned} \frac{\partial e_*(\mu ,{\varvec{\xi }}_0)}{\partial \mu }\Bigg |_{\mu = \mu _0} = - \mathrm {Pr}(D_1 = 1) \mathrm {Pr}(D_2 = 1) \mathrm {Pr}(D_3 = 1) = - \theta _1 \theta _2 \theta _3. \end{aligned}$$
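A short sketch of why this holds, assuming (as the proof of Theorem 1 suggests) that $e_*(\mu ,{\varvec{\xi }})$ denotes ${\mathbb {E}}\{G_{i\ell r,*}(\mu ,{\varvec{\xi }})\}$ for distinct $i, \ell , r$, and that the identity established in Appendix 2 carries over to a generic $\mu $ because $\mu $ enters only through the factor $(I_{i\ell r} - \mu )$:

$$\begin{aligned} e_*(\mu ,{\varvec{\xi }}_0)= & {} {\mathbb {E}}\left\{ \rho _{1i}\rho _{2\ell }\rho _{3r}(I_{i\ell r} - \mu ) \right\} , \\ \frac{\partial e_*(\mu ,{\varvec{\xi }}_0)}{\partial \mu }= & {} -{\mathbb {E}}\left( \rho _{1i}\rho _{2\ell }\rho _{3r}\right) = -{\mathbb {E}}(\rho _{1i})\,{\mathbb {E}}(\rho _{2\ell })\,{\mathbb {E}}(\rho _{3r}) = -\theta _1\theta _2\theta _3. \end{aligned}$$

The factorization relies on the independence of the three distinct units, and ${\mathbb {E}}(\rho _{ki}) = \mathrm {Pr}(D_k = 1) = \theta _k$ follows by iterated expectations.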

By standard results on the limit distribution of U-statistics (van der Vaart 2000, Theorem 12.3, Chap. 12),

$$\begin{aligned} \sqrt{n}U_{n,*}(\mu _0,{\varvec{\xi }}_0)= & {} \sqrt{n}\left\{ G_{*}(\mu _0,{\varvec{\xi }}_0) - e_*(\mu _0,{\varvec{\xi }}_0)\right\} \\= & {} \sqrt{n}G_{*}(\mu _0,{\varvec{\xi }}_0) {\mathop {\rightarrow }\limits ^{p}} \sqrt{n}{\tilde{G}}_{*}(\mu _0,{\varvec{\xi }}_0), \end{aligned}$$

where $\sqrt{n}{\tilde{G}}_{*}(\mu ,{\varvec{\xi }})$ is the projection of $U_{n,*}$ onto the set of all statistics of the form

$$\begin{aligned} \sqrt{n}{\tilde{G}}_{n,*}(\mu ,{\varvec{\xi }})= & {} \frac{1}{2\sqrt{n}}\sum _{i=1}^{n} {\mathbb {E}}\bigg \{ G_{i\ell r,*}(\mu ,{\varvec{\xi }}) + G_{ir \ell ,*}(\mu ,{\varvec{\xi }}) + G_{\ell ir,*}(\mu ,{\varvec{\xi }}) \\&+ \, G_{\ell r i,*}(\mu ,{\varvec{\xi }}) + G_{r i\ell ,*}(\mu ,{\varvec{\xi }}) + G_{r \ell i,*}(\mu ,{\varvec{\xi }}) \big |O_i \bigg \} \end{aligned}$$

for $\ell \ne i$ and $r \ne \ell , r \ne i$. For the maximum likelihood estimator $\hat{{\varvec{\xi }}}$, we can write

$$\begin{aligned} \sqrt{n}\left( \hat{{\varvec{\xi }}} - {\varvec{\xi }}_0\right)= & {} \frac{1}{\sqrt{n}}\left[ -\frac{\partial {\mathbb {E}}\left\{ {\mathcal {S}}_i({\varvec{\xi }})\right\} }{\partial {\varvec{\xi }}^\top }\Bigg |_{{\varvec{\xi }} = {\varvec{\xi }}_0}\right] ^{-1}\sum _{i=1}^{n}{\mathcal {S}}_i({\varvec{\xi }}_0) + o_p(1) \\= & {} \frac{1}{\sqrt{n}}{\mathcal {I}}({\varvec{\xi }})^{-1} \sum _{i=1}^{n}{\mathcal {S}}_i({\varvec{\xi }}_0) + o_p(1). \end{aligned}$$

Hence, from (18),

$$\begin{aligned}&\theta _1 \theta _2 \theta _3 \sqrt{n}({\hat{\mu }}_{*} - \mu _0) \nonumber \\&\quad = o_p(1) + \frac{1}{\sqrt{n}} \frac{\partial e_*(\mu _0,{\varvec{\xi }})}{\partial {\varvec{\xi }}^\top }\Bigg |_{{\varvec{\xi }} = {\varvec{\xi }}_0} {\mathcal {I}}({\varvec{\xi }})^{-1} \sum _{i=1}^{n}{\mathcal {S}}_i({\varvec{\xi }}_0) \nonumber \\&\qquad + \, \frac{1}{2\sqrt{n}}\sum _{i=1}^{n} {\mathbb {E}}\bigg \{ G_{i\ell r,*}(\mu _0,{\varvec{\xi }}_0) + G_{ir \ell ,*}(\mu _0,{\varvec{\xi }}_0) + G_{\ell ir,*}(\mu _0,{\varvec{\xi }}_0) \nonumber \\&\qquad + \, G_{\ell r i,*}(\mu _0,{\varvec{\xi }}_0) + G_{r i\ell ,*}(\mu _0,{\varvec{\xi }}_0) + G_{r \ell i,*}(\mu _0,{\varvec{\xi }}_0) \big |O_i \bigg \}\nonumber \\&\quad = o_p(1) + \frac{1}{\sqrt{n}}\sum _{i=1}^{n} \Bigg [ \frac{\partial e_*(\mu _0,{\varvec{\xi }})}{\partial {\varvec{\xi }}^\top }\Bigg |_{{\varvec{\xi }} = {\varvec{\xi }}_0} {\mathcal {I}}({\varvec{\xi }})^{-1} {\mathcal {S}}_i({\varvec{\xi }}_0) \nonumber \\&\qquad + \, \frac{1}{2} {\mathbb {E}}\bigg \{ G_{i\ell r,*}(\mu _0,{\varvec{\xi }}_0) + G_{ir \ell ,*}(\mu _0,{\varvec{\xi }}_0) + G_{\ell ir,*}(\mu _0,{\varvec{\xi }}_0) \nonumber \\&\qquad + \, G_{\ell r i,*}(\mu _0,{\varvec{\xi }}_0) + G_{r i\ell ,*}(\mu _0,{\varvec{\xi }}_0) + G_{r \ell i,*}(\mu _0,{\varvec{\xi }}_0) \big |O_i \bigg \} \Bigg ] \nonumber \\&\quad = o_p(1) + \frac{1}{\sqrt{n}}\sum _{i=1}^{n}Q_{i,*}(\mu _0,{\varvec{\xi }}_0) = o_p(1) + \frac{1}{\sqrt{n}} Q_*(\mu _0,{\varvec{\xi }}_0). \end{aligned}$$

(19)

Since the observed data $O_i$ are i.i.d., the $Q_{i,*}(\mu _0,{\varvec{\xi }}_0)$ are also i.i.d. In addition, it is easy to show that

$$\begin{aligned} 0= & {} {\mathbb {E}}\Bigg [{\mathbb {E}}\bigg \{ G_{i\ell r,*}(\mu _0,{\varvec{\xi }}_0) + G_{ir \ell ,*}(\mu _0, {\varvec{\xi }}_0) + G_{\ell ir,*}(\mu _0,{\varvec{\xi }}_0) + G_{\ell r i,*}(\mu _0, {\varvec{\xi }}_0) \\&+ \, G_{r i\ell ,*}(\mu _0, {\varvec{\xi }}_0) + G_{r \ell i,*}(\mu _0, {\varvec{\xi }}_0) \big |O_i \bigg \} \Bigg ]. \end{aligned}$$

Therefore, ${\mathbb {E}}\{Q_{i,*} (\mu _0,{\varvec{\xi }}_0)\} = 0$, and $\frac{1}{\sqrt{n}} Q_* (\mu _0,{\varvec{\xi }}_0) {\mathop {\rightarrow }\limits ^{d}} {\mathcal {N}}(0, {\mathbb {V}}\mathrm {ar}\left\{ Q_{i,*} (\mu _0,{\varvec{\xi }}_0)\right\} )$ by the Central Limit Theorem. It follows that

$$\begin{aligned} \sqrt{n}\left( {\hat{\mu }}_{*} - \mu _0 \right) {\mathop {\rightarrow }\limits ^{d}} {\mathcal {N}}\left( 0, \varLambda _*\right) , \end{aligned}$$

where

$$\begin{aligned} \varLambda _* = \frac{{\mathbb {V}}\mathrm {ar}\left\{ Q_{i,*} (\mu _0,{\varvec{\xi }}_0)\right\} }{\theta _1^2\theta _2^2\theta _3^2}. \end{aligned}$$

(20)

$\square $
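The limiting normal distribution in Theorem 2 is what justifies Wald-type intervals for the VUS. The following is a minimal sketch, not taken from the paper: the function name `wald_interval` is hypothetical, and clipping the interval to $[0, 1]$ is an assumption made here because a VUS is a probability. It takes a point estimate ${\hat{\mu }}_*$, a variance estimate ${\hat{\varLambda }}_*$ and the sample size $n$.

```python
import math
from statistics import NormalDist


def wald_interval(mu_hat, lambda_hat, n, level=0.95):
    """Wald interval mu_hat +/- z * sqrt(lambda_hat / n).

    lambda_hat estimates the asymptotic variance of sqrt(n) * (mu_hat - mu_0).
    Clipping to [0, 1] is an assumption of this sketch, not part of the theorem.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # standard normal quantile
    half_width = z * math.sqrt(lambda_hat / n)
    lower = max(0.0, mu_hat - half_width)
    upper = min(1.0, mu_hat + half_width)
    return lower, upper
```

For VUS estimates close to 0 or 1, an interval built on a transformed scale (for example the logit) may behave better; that choice is outside what the theorem itself states.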

Variance estimation

Under condition (C3), a consistent estimator of $\varLambda _*$ can be obtained as

$$\begin{aligned} {\hat{\varLambda }}_* = \frac{{\mathbb {V}}\mathrm {ar}\left\{ {\hat{Q}}_{i,*} ({\hat{\mu }}_{*},\hat{{\varvec{\xi }}})\right\} }{{\hat{\theta }}_{1,*}^2 {\hat{\theta }}_{2,*}^2 {\hat{\theta }}_{3,*}^2} = \frac{\frac{1}{n - 1} \sum \limits _{i=1}^{n}{\hat{Q}}_{i,*}^2({\hat{\mu }}_{*},\hat{{\varvec{\xi }}})}{{\hat{\theta }}_{1,*}^2 {\hat{\theta }}_{2,*}^2 {\hat{\theta }}_{3,*}^2}, \end{aligned}$$

(21)

where ${\hat{\theta }}_{k,*}$ are the estimates of the disease probabilities, $\theta _{k}$ for $k = 1,2,3$. Specifically, ${\hat{\theta }}_{k,\mathrm {FI}} = \frac{1}{n}\sum \nolimits _{i=1}^{n} {\hat{\rho }}_{ki}$, ${\hat{\theta }}_{k,\mathrm {MSI}} = \frac{1}{n}\sum \nolimits _{i=1}^{n} {\tilde{D}}_{ki,\mathrm {MSI}}$, ${\hat{\theta }}_{k,\mathrm {PDR}} = \frac{1}{n}\sum \nolimits _{i=1}^{n} {\tilde{D}}_{ki,\mathrm {PDR}}$ and ${\hat{\theta }}_{k,\mathrm {IPW}} = \sum \nolimits _{i=1}^{n} V_i D_{ki}{\hat{\pi }}_i^{-1} \bigg /\sum \nolimits _{i=1}^{n} V_i{\hat{\pi }}_i^{-1}$. According to (19), we have that

$$\begin{aligned}&{\hat{Q}}_{i,*} ({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) \\&\quad = \left\{ \frac{1}{(n-1)(n-2)} \sum _{i=1}^{n}\sum _{{\mathop {\ell \ne i}\limits ^{\ell =1}}}^{n}\sum _{{\mathop {r \ne \ell , r \ne i}\limits ^{r = 1}}}^{n} \frac{\partial G_{i\ell r,*}({\hat{\mu }}_{*},{\varvec{\xi }})}{\partial {\varvec{\xi }}^\top }\bigg |_{{\varvec{\xi }} = \hat{{\varvec{\xi }}}} \right\} \\&\qquad \times \, \left\{ -\sum _{i=1}^{n}\frac{\partial {\mathcal {S}}_i({\varvec{\xi }})}{\partial {\varvec{\xi }}^\top }\bigg |_{{\varvec{\xi }} = \hat{{\varvec{\xi }}}}\right\} ^{-1} {\mathcal {S}}_i(\hat{{\varvec{\xi }}}) \\&\qquad + \, \frac{1}{2(n-1)(n-2)} \sum _{{\mathop {\ell \ne i}\limits ^{\ell =1}}}^{n} \sum _{{\mathop {r \ne i, r \ne \ell }\limits ^{r = 1}}}^{n}\bigg \{ G_{i\ell r,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) + G_{ir \ell ,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) + G_{\ell ir,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) \\&\qquad + \, G_{\ell r i,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) + G_{r i\ell ,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) + G_{r \ell i,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}})\bigg \}. \end{aligned}$$

In addition, for fixed i, we also have that

$$\begin{aligned} \sum _{{\mathop {\ell \ne i}\limits ^{\ell = 1}}}^{n} \sum _{{\mathop {r \ne i, r \ne \ell }\limits ^{r = 1}}}^{n} \left\{ G_{i\ell r,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) + G_{ir \ell ,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}})\right\}= & {} 2\sum _{{\mathop {\ell \ne i}\limits ^{\ell = 1}}}^{n} \sum _{{\mathop {r \ne i, r \ne \ell }\limits ^{r = 1}}}^{n}G_{i\ell r,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}), \\ \sum _{{\mathop {\ell \ne i}\limits ^{\ell = 1}}}^{n} \sum _{{\mathop {r \ne i, r \ne \ell }\limits ^{r = 1}}}^{n} \left\{ G_{\ell ir,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) + G_{r i\ell ,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}})\right\}= & {} 2\sum _{{\mathop {\ell \ne i}\limits ^{\ell = 1}}}^{n} \sum _{{\mathop {r \ne i, r \ne \ell }\limits ^{r = 1}}}^{n}G_{\ell ir,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}), \\ \sum _{{\mathop {\ell \ne i}\limits ^{\ell = 1}}}^{n} \sum _{{\mathop {r \ne i, r \ne \ell }\limits ^{r = 1}}}^{n} \left\{ G_{\ell r i,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) + G_{r \ell i,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}})\right\}= & {} 2\sum _{{\mathop {\ell \ne i}\limits ^{\ell =1}}}^{n} \sum _{{\mathop {r \ne i, r \ne \ell }\limits ^{r = 1}}}^{n}G_{r \ell i,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}). \end{aligned}$$

Therefore,

$$\begin{aligned}&{\hat{Q}}_{i,*} ({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) \nonumber \\&\quad = \left\{ \frac{1}{(n-1)(n-2)}\sum _{i = 1}^{n}\sum _{{\mathop {\ell \ne i}\limits ^{\ell = 1}}}^{n}\sum _{{\mathop {r \ne \ell , r \ne i}\limits ^{r = 1}}}^{n} \frac{\partial G_{i\ell r,*}({\hat{\mu }}_{*},{\varvec{\xi }})}{\partial {\varvec{\xi }}^\top }\bigg |_{{\varvec{\xi }} = \hat{{\varvec{\xi }}}}\right\} \nonumber \\&\qquad \times \, \left\{ -\sum _{i=1}^{n}\frac{\partial {\mathcal {S}}_i({\varvec{\xi }})}{\partial {\varvec{\xi }}^\top }\bigg |_{{\varvec{\xi }} = \hat{{\varvec{\xi }}}}\right\} ^{-1} {\mathcal {S}}_i(\hat{{\varvec{\xi }}}) \nonumber \\&\qquad + \, \frac{1}{(n-1)(n-2)} \sum _{{\mathop {\ell \ne i}\limits ^{\ell = 1}}}^{n} \sum _{{\mathop {r \ne i, r \ne \ell }\limits ^{r = 1}}}^{n}\bigg \{ G_{i\ell r,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) + G_{\ell ir,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}}) + G_{r \ell i,*}({\hat{\mu }}_{*},\hat{{\varvec{\xi }}})\bigg \}.\nonumber \\ \end{aligned}$$

(22)
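As a rough illustration of how (21) and the prevalence estimators ${\hat{\theta }}_{k,*}$ could be evaluated once the fitted quantities are available, here is a minimal NumPy sketch. The function names and array layouts are hypothetical; the pseudo-values follow the definitions used in Appendix 2, namely $V_i D_{ki} + (1 - V_i)\rho _{k(0)i}$ for MSI and $V_i D_{ki}/\pi _i - \rho _{k(0)i}(V_i/\pi _i - 1)$ for PDR. The values ${\hat{Q}}_{i,*}$ are assumed to have been computed beforehand from (22).

```python
import numpy as np


def prevalence_estimates(V, D, rho, rho0, pi):
    """Disease-probability estimates theta_hat_k for the four approaches.

    V    : (n,)   verification indicators (1 = disease status observed)
    D    : (n, 3) disease indicators; rows with V = 0 are ignored
    rho  : (n, 3) fitted rho_{ki}
    rho0 : (n, 3) fitted rho_{k(0)i}
    pi   : (n,)   fitted verification probabilities; only rows with V = 1
                  matter, any positive value can be supplied elsewhere
    """
    V = np.asarray(V, dtype=float)
    pi = np.asarray(pi, dtype=float)
    rho = np.asarray(rho, dtype=float)
    rho0 = np.asarray(rho0, dtype=float)
    Vc, pic = V[:, None], pi[:, None]
    # Disease status is unknown for unverified units, so it is zeroed out there.
    D = np.where(Vc == 1, np.asarray(D, dtype=float), 0.0)
    d_msi = Vc * D + (1 - Vc) * rho0
    d_pdr = Vc * D / pic - rho0 * (Vc / pic - 1)
    w = V / pi
    return {
        "FI": rho.mean(axis=0),
        "MSI": d_msi.mean(axis=0),
        "PDR": d_pdr.mean(axis=0),
        "IPW": (w[:, None] * D).sum(axis=0) / w.sum(),
    }


def lambda_hat(Q_hat, theta_hat):
    """Variance estimate (21): sum(Q_hat_i^2) / (n - 1), divided by prod(theta_hat)^2."""
    Q_hat = np.asarray(Q_hat, dtype=float)
    n = Q_hat.shape[0]
    return (Q_hat ** 2).sum() / (n - 1) / np.prod(theta_hat) ** 2
```

When every unit is verified with $\pi _i = 1$, the MSI, PDR and IPW estimates all reduce to the sample class proportions, which gives a quick sanity check of an implementation.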

The quantity $\sum \nolimits _{i=1}^{n} \frac{\partial {\mathcal {S}}_i({\varvec{\xi }})}{\partial {\varvec{\xi }}^\top }\bigg |_{{\varvec{\xi }} = \hat{{\varvec{\xi }}}}$ could be obtained as the Hessian matrix of the log-likelihood function at $\hat{{\varvec{\xi }}}$. In order to compute $\frac{\partial G_{i\ell r,*}({\hat{\mu }}_{*},{\varvec{\xi }})}{\partial {\varvec{\xi }}^\top }\bigg |_{{\varvec{\xi }} = \hat{{\varvec{\xi }}}}$, we have to get the derivatives $\frac{\partial }{\partial {\varvec{\xi }}^\top } \rho _{ki}({\varvec{\tau }}_{0\rho _k})$, $\frac{\partial }{\partial {\varvec{\xi }}^\top } \rho _{k(0)i}({\varvec{\xi }})$, $\frac{\partial }{\partial {\varvec{\xi }}^\top } \pi ^{-1}_{i}({\varvec{\lambda }}, {\varvec{\tau }}_\pi )$, $\frac{\partial }{\partial {\varvec{\xi }}^\top } \pi _{10i}({\varvec{\lambda }}, {\varvec{\tau }}_\pi )$, $\frac{\partial }{\partial {\varvec{\xi }}^\top } \pi _{01i}({\varvec{\lambda }}, {\varvec{\tau }}_\pi )$ and $\frac{\partial }{\partial {\varvec{\xi }}^\top } \pi _{00i}({\varvec{\lambda }}, {\varvec{\tau }}_\pi )$.

In Sect. 2.3, we obtain

$$\begin{aligned} \begin{array}{ll} \dfrac{\partial }{\partial \lambda _1} \pi _{10i}({\varvec{\lambda }}, {\varvec{\tau }}_\pi ) = \pi _{10i}(1 - \pi _{10i}); &{} \quad \dfrac{\partial }{\partial \lambda _2} \pi _{10i}({\varvec{\lambda }}, {\varvec{\tau }}_\pi ) = 0; \\ \dfrac{\partial }{\partial \lambda _1} \pi _{01i}({\varvec{\lambda }}, {\varvec{\tau }}_\pi ) = 0; &{} \quad \dfrac{\partial }{\partial \lambda _2} \pi _{01i}({\varvec{\lambda }}, {\varvec{\tau }}_\pi ) = \pi _{01i}(1 - \pi _{01i}); \\ \dfrac{\partial }{\partial \lambda _1} \pi _{00i}({\varvec{\lambda }}, {\varvec{\tau }}_\pi ) = 0; &{} \quad \dfrac{\partial }{\partial \lambda _2} \pi _{00i}({\varvec{\lambda }}, {\varvec{\tau }}_\pi ) = 0. \end{array} \end{aligned}$$

and

$$\begin{aligned} \frac{\partial }{\partial {\varvec{\tau }}_\pi ^\top }\pi _{d_1 d_2 i} = {\varvec{U}}_i (1 - \pi _{d_1 d_2 i})\pi _{d_1 d_2 i}, \end{aligned}$$

where $(d_1, d_2)$ belongs to the set $\{(1,0), (0,1), (0,0)\}$. Also, we have

$$\begin{aligned} \begin{array}{ll} \dfrac{\partial }{\partial {\varvec{\tau }}^\top _{\rho _1}} \rho _{1i}({\varvec{\tau }}_\rho ) = {\varvec{U}}_i\rho _{1i}(1 - \rho _{1i}); &{}\quad \dfrac{\partial }{\partial {\varvec{\tau }}^\top _{\rho _2}} \rho _{1i}({\varvec{\tau }}_\rho ) = - {\varvec{U}}_i\rho _{1i}\rho _{2i}; \\ \dfrac{\partial }{\partial {\varvec{\tau }}^\top _{\rho _2}} \rho _{2i}({\varvec{\tau }}_\rho ) = {\varvec{U}}_i\rho _{2i}(1 - \rho _{2i}); &{}\quad \dfrac{\partial }{\partial {\varvec{\tau }}^\top _{\rho _1}} \rho _{2i}({\varvec{\tau }}_\rho ) = - {\varvec{U}}_i \rho _{1i}\rho _{2i}. \end{array} \end{aligned}$$

Moreover,

$$\begin{aligned} \frac{\partial }{\partial \lambda _s} \pi ^{-1}_i({\varvec{\lambda }}, {\varvec{\tau }}_\pi ) = -D_{si}\frac{1 - \pi _i}{\pi _i}; \qquad \frac{\partial }{\partial {\varvec{\tau }}_\pi ^\top }\pi ^{-1}_{i}({\varvec{\lambda }}, {\varvec{\tau }}_\pi ) = -{\varvec{U}}_{i}\frac{1 - \pi _i}{\pi _i}, \end{aligned}$$

with $s = 1, 2$. Then, recall that

$$\begin{aligned} \rho _{1(0)i}= & {} \frac{(1 - \pi _{10i})\rho _{1i}}{(1 - \pi _{10i})\rho _{1i} + (1 - \pi _{01i})\rho _{2i} + (1 - \pi _{00i})\rho _{3i}}, \\ \rho _{2(0)i}= & {} \frac{(1 - \pi _{01i})\rho _{2i}}{(1 - \pi _{10i})\rho _{1i} + (1 - \pi _{01i})\rho _{2i} + (1 - \pi _{00i})\rho _{3i}}, \\ \rho _{3(0)i}= & {} \frac{(1 - \pi _{00i})\rho _{3i}}{(1 - \pi _{10i})\rho _{1i} + (1 - \pi _{01i})\rho _{2i} + (1 - \pi _{00i})\rho _{3i}}. \end{aligned}$$

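After some algebra, writing $z = (1 - \pi _{10i})\rho _{1i} + (1 - \pi _{01i})\rho _{2i} + (1 - \pi _{00i})\rho _{3i}$ for the common denominator, we get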

$$\begin{aligned} \frac{\partial }{\partial \lambda _1} \rho _{1(0)i}({\varvec{\xi }})= & {} \frac{1}{z^2}\left[ -\pi _{10i}(1 - \pi _{10i})\rho _{1i}\left\{ (1 - \pi _{01i})\rho _{2i} + (1 - \pi _{00i})\rho _{3i} \right\} \right] , \\ \frac{\partial }{\partial \lambda _2} \rho _{1(0)i}({\varvec{\xi }})= & {} \frac{1}{z^2} \rho _{1i}\rho _{2i}\pi _{01i}(1 - \pi _{01i}) (1 - \pi _{10i}), \\ \frac{\partial }{\partial {\varvec{\tau }}_\pi ^\top } \rho _{1(0)i}({\varvec{\xi }})= & {} -\frac{{\varvec{U}}_i}{z^2} \rho _{1i}(1 - \pi _{10i}) \bigg \{ \rho _{2i}(1 - \pi _{01i})(\pi _{10i} - \pi _{01i}) \\&+ \, \rho _{3i}(1 - \pi _{00i})(\pi _{10i} - \pi _{00i})\bigg \}, \\ \frac{\partial }{\partial {\varvec{\tau }}_{\rho _1}^\top } \rho _{1(0)i}({\varvec{\xi }})= & {} \frac{{\varvec{U}}_i}{z^2} \rho _{1i} (1 - \pi _{10i}) \left\{ \rho _{2i}(1 - \pi _{01i}) + \rho _{3i}(1 - \pi _{00i}) \right\} , \\ \frac{\partial }{\partial {\varvec{\tau }}_{\rho _2}^\top } \rho _{1(0)i}({\varvec{\xi }})= & {} -\frac{{\varvec{U}}_i}{z^2} \rho _{1i}\rho _{2i} (1 - \pi _{10i}) (1 - \pi _{01i}). \end{aligned}$$

Finally, with the same $z = (1 - \pi _{10i})\rho _{1i} + (1 - \pi _{01i})\rho _{2i} + (1 - \pi _{00i})\rho _{3i}$, we get

$$\begin{aligned} \frac{\partial }{\partial \lambda _1} \rho _{2(0)i}({\varvec{\xi }})= & {} \frac{1}{z^2} \rho _{1i}\rho _{2i}\pi _{10i}(1 - \pi _{10i}) (1 - \pi _{01i}), \\ \frac{\partial }{\partial \lambda _2} \rho _{2(0)i}({\varvec{\xi }})= & {} \frac{1}{z^2} \left[ -\pi _{01i}(1 - \pi _{01i})\rho _{2i}\left\{ (1 - \pi _{10i})\rho _{1i} + (1 - \pi _{00i})\rho _{3i} \right\} \right] , \\ \frac{\partial }{\partial {\varvec{\tau }}_\pi ^\top } \rho _{2(0)i}({\varvec{\xi }})= & {} -\frac{{\varvec{U}}_i}{z^2} \rho _{2i}(1 - \pi _{01i}) \bigg \{ \rho _{1i}(1 - \pi _{10i})(\pi _{01i} - \pi _{10i}) \\&+ \, \rho _{3i}(1 - \pi _{00i})(\pi _{01i} - \pi _{00i})\bigg \}, \\ \frac{\partial }{\partial {\varvec{\tau }}_{\rho _1}^\top } \rho _{2(0)i}({\varvec{\xi }})= & {} -\frac{{\varvec{U}}_i}{z^2} \rho _{1i}\rho _{2i} (1 - \pi _{10i}) (1 - \pi _{01i}), \\ \frac{\partial }{\partial {\varvec{\tau }}_{\rho _2}^\top } \rho _{2(0)i}({\varvec{\xi }})= & {} \frac{{\varvec{U}}_i}{z^2} \rho _{2i} (1 - \pi _{01i}) \left\{ \rho _{1i}(1 - \pi _{10i}) + \rho _{3i}(1 - \pi _{00i}) \right\} . \end{aligned}$$

The derivative $\dfrac{\partial }{\partial {\varvec{\xi }}^\top } \rho _{3(0)i}({\varvec{\xi }})$ can be computed by using the fact that $\rho _{3(0)i} = 1 - \rho _{1(0)i} - \rho _{2(0)i}$.
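Closed-form derivatives of this kind are easy to mistype, so a finite-difference check can be useful before plugging them into (22). The sketch below assumes model forms that are consistent with the derivative formulas above but are not restated in this appendix: a logistic verification model $\pi _{d_1 d_2 i} = \mathrm {expit}(\lambda _1 d_1 + \lambda _2 d_2 + {\varvec{\tau }}_\pi ^\top {\varvec{U}}_i)$ and a multinomial-logit disease model with class 3 as reference. It works with the scalar linear predictors `eta_pi`, `eta1`, `eta2` (hypothetical names), so derivatives with respect to ${\varvec{\tau }}$ are obtained by multiplying by ${\varvec{U}}_i$.

```python
import numpy as np


def expit(x):
    return 1.0 / (1.0 + np.exp(-x))


def components(lam1, lam2, eta_pi, eta1, eta2):
    """pi's, rho's and z for one unit under the assumed logistic models."""
    pi10, pi01, pi00 = expit(lam1 + eta_pi), expit(lam2 + eta_pi), expit(eta_pi)
    den = 1.0 + np.exp(eta1) + np.exp(eta2)
    rho1, rho2, rho3 = np.exp(eta1) / den, np.exp(eta2) / den, 1.0 / den
    z = (1 - pi10) * rho1 + (1 - pi01) * rho2 + (1 - pi00) * rho3
    return pi10, pi01, pi00, rho1, rho2, rho3, z


def rho10(*args):
    """rho_{1(0)i} as a function of (lam1, lam2, eta_pi, eta1, eta2)."""
    pi10, _, _, rho1, _, _, z = components(*args)
    return (1 - pi10) * rho1 / z


def analytic_gradient(*args):
    """Closed-form derivatives of rho_{1(0)i}, transcribed from the text."""
    pi10, pi01, pi00, rho1, rho2, rho3, z = components(*args)
    w10, w01, w00 = 1 - pi10, 1 - pi01, 1 - pi00
    d_lam1 = -pi10 * w10 * rho1 * (w01 * rho2 + w00 * rho3) / z**2
    d_lam2 = rho1 * rho2 * pi01 * w01 * w10 / z**2
    d_eta_pi = -rho1 * w10 * (rho2 * w01 * (pi10 - pi01)
                              + rho3 * w00 * (pi10 - pi00)) / z**2
    d_eta1 = rho1 * w10 * (rho2 * w01 + rho3 * w00) / z**2
    d_eta2 = -rho1 * rho2 * w10 * w01 / z**2
    return np.array([d_lam1, d_lam2, d_eta_pi, d_eta1, d_eta2])


def numeric_gradient(*args, h=1e-6):
    """Central finite differences of rho_{1(0)i} in each argument."""
    x = np.array(args, dtype=float)
    grad = np.zeros(x.size)
    for j in range(x.size):
        step = np.zeros(x.size)
        step[j] = h
        grad[j] = (rho10(*(x + step)) - rho10(*(x - step))) / (2 * h)
    return grad
```

Comparing `analytic_gradient` and `numeric_gradient` at a few arbitrary parameter values is a cheap way to catch sign or index slips; the same pattern applies to $\rho _{2(0)i}$.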

Appendix 2

Here, we show that the estimating functions $G_{i\ell r,*}$ are unbiased under the working disease and verification models. Recall that ${\varvec{\xi }} = ({\varvec{\lambda }}^\top , {\varvec{\tau }}^\top _\pi , {\varvec{\tau }}^\top _\rho )^\top $.

FI estimator
We have
$$\begin{aligned} {\mathbb {E}}\left\{ G_{i\ell r,\mathrm {FI}}(\mu _0, {\varvec{\xi }}_0)\right\}= & {} {\mathbb {E}}\left\{ \rho _{1i}({\varvec{\tau }}_{0\rho }) \rho _{2\ell }({\varvec{\tau }}_{0\rho }) \rho _{3r}({\varvec{\tau }}_{0\rho }) (I_{i\ell r} - \mu _0) \right\} \\= & {} {\mathbb {E}}\left\{ \rho _{1i}\rho _{2\ell }\rho _{3r}(I_{i\ell r} - \mu _0) \right\} . \end{aligned}$$
Hence, ${\mathbb {E}}\left\{ G_{i\ell r,\mathrm {FI}}(\mu _0, {\varvec{\xi }}_0)\right\} = 0$ from (13).
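To make the role of the estimating function concrete: setting the sum of $w_{1i} w_{2\ell } w_{3r}(I_{i\ell r} - \mu )$ over distinct triples to zero gives $\mu $ as a weighted proportion of correctly ordered triples. The sketch below is illustrative only and rests on two assumptions not stated in this appendix: that $I_{i\ell r} = \mathrm {I}(T_i< T_\ell < T_r)$, ignoring ties, and that the estimator solves the averaged estimating equation. The function name `vus_from_weights` is hypothetical. With $w_k = \rho _k$ it corresponds to the FI case; the MSI, IPW or PDR pseudo-values can be passed in the same way.

```python
import numpy as np


def vus_from_weights(T, w1, w2, w3):
    """Solve sum_{i,l,r distinct} w1[i] w2[l] w3[r] (I_ilr - mu) = 0 for mu.

    I_ilr is taken to be 1 if T[i] < T[l] < T[r] and 0 otherwise (assumption).
    Runs in O(n^2) time and memory.
    """
    T = np.asarray(T, dtype=float)
    w1, w2, w3 = (np.asarray(w, dtype=float) for w in (w1, w2, w3))
    lt = (T[:, None] < T[None, :]).astype(float)  # lt[a, b] = 1 if T[a] < T[b]
    # Numerator: for each middle unit l, weight below it times weight above it.
    # Strict inequalities already exclude triples with repeated indices.
    num = ((w1 @ lt) * w2 * (lt @ w3)).sum()
    # Denominator: sum over distinct triples, by inclusion-exclusion.
    s1, s2, s3 = w1.sum(), w2.sum(), w3.sum()
    den = (s1 * s2 * s3
           - (w1 * w2).sum() * s3
           - (w1 * w3).sum() * s2
           - (w2 * w3).sum() * s1
           + 2.0 * (w1 * w2 * w3).sum())
    return num / den
```

With fully observed one-hot disease indicators as weights, this reduces to the usual nonparametric VUS estimate, i.e. the fraction of class-1, class-2, class-3 triples whose test values are in increasing order.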
MSI estimator
Consider ${\mathbb {E}}\left\{ D_{ki,\mathrm {MSI}}({\varvec{\xi }}_0)|T_i, {\varvec{A}}_i\right\} $. We have
$$\begin{aligned}&{\mathbb {E}}\left\{ D_{ki,\mathrm {MSI}}({\varvec{\xi }}_0)|T_i, {\varvec{A}}_i\right\} \\&\quad = {\mathbb {E}}\left\{ V_i D_{ki} + (1 - V_i)\rho _{k(0)i}({\varvec{\xi }}_0)|T_i, {\varvec{A}}_i\right\} \\&\quad = {\mathbb {E}}\left[ {\mathbb {E}}\left\{ V_i D_{ki} + (1 - V_i)\rho _{k(0)i}({\varvec{\xi }}_0)|T_i, {\varvec{A}}_i, V_i \right\} | T_i, {\varvec{A}}_i \right] \\&\quad = \mathrm {Pr}(V_i = 1|T_i, {\varvec{A}}_i){\mathbb {E}}\left( D_{ki}|V_i = 1, T_i, {\varvec{A}}_i\right) \\&\qquad + \, \mathrm {Pr}(V_i = 0|T_i, {\varvec{A}}_i){\mathbb {E}}\left( \rho _{k(0)i}({\varvec{\xi }}_0)|V_i = 0, T_i, {\varvec{A}}_i \right) \\&\quad = \mathrm {Pr}(V_i = 1|T_i, {\varvec{A}}_i)\mathrm {Pr}(D_{ki} = 1|V_i = 1, T_i, {\varvec{A}}_i) \\&\qquad + \, \mathrm {Pr}(V_i = 0|T_i, {\varvec{A}}_i)\mathrm {Pr}(D_{ki} = 1|V_i = 0, T_i, {\varvec{A}}_i) \\&\quad = \mathrm {Pr}(D_{ki} = 1|T_i, {\varvec{A}}_i) = \rho _{ki}. \end{aligned}$$
Therefore,
$$\begin{aligned}&{\mathbb {E}}\left\{ G_{i\ell r, \mathrm {MSI}}(\mu _0,{\varvec{\xi }}_0) \right\} \\&\quad = {\mathbb {E}}\left\{ D_{1i,\mathrm {MSI}}({\varvec{\xi }}_0) D_{2\ell , \mathrm {MSI}}({\varvec{\xi }}_0) D_{3r, \mathrm {MSI}}({\varvec{\xi }}_0) \left( I_{i\ell r} - \mu _0 \right) \right\} \\&\quad = {\mathbb {E}}\Big [ \left( I_{i\ell r} - \mu _0 \right) {\mathbb {E}}\left\{ D_{1i,\mathrm {MSI}}({\varvec{\xi }}_0) | T_i, {\varvec{A}}_i \right\} {\mathbb {E}}\left\{ D_{2\ell ,\mathrm {MSI}}({\varvec{\xi }}_0) | T_\ell , {\varvec{A}}_\ell \right\} \\&\qquad \times \, {\mathbb {E}}\left\{ D_{3r,\mathrm {MSI}}({\varvec{\xi }}_0) | T_r, {\varvec{A}}_r \right\} \Big ] \\&\quad = {\mathbb {E}}\left\{ \rho _{1i}\rho _{2\ell }\rho _{3r}(I_{i\ell r} - \mu _0) \right\} . \end{aligned}$$
IPW estimator
In this case,
$$\begin{aligned} {\mathbb {E}}\left( \frac{V_i D_{ki}}{\pi _i({\varvec{\xi }}_0)} \bigg |T_i, {\varvec{A}}_i \right)= & {} \frac{{\mathbb {E}}\left( V_i D_{ki}|T_i, {\varvec{A}}_i\right) }{\pi _i({\varvec{\xi }}_0)} \\= & {} \frac{{\mathbb {E}}\left\{ D_{ki} {\mathbb {E}}\left( V_i |D_{1i}, D_{2i}, T_i, {\varvec{A}}_i\right) \big | T_i, {\varvec{A}}_i\right\} }{\pi _i({\varvec{\xi }}_0)} \\= & {} \frac{{\mathbb {E}}\left( \pi _i D_{ki}|T_i, {\varvec{A}}_i\right) }{\pi _i} = \rho _{ki}. \end{aligned}$$
Thus,
$$\begin{aligned}&{\mathbb {E}}\left\{ G_{i\ell r, \mathrm {IPW}}(\mu _0, {\varvec{\xi }}_0)\right\} \\&\quad = {\mathbb {E}}\left\{ \frac{V_i V_\ell V_r D_{1i} D_{2\ell } D_{3r}}{\pi _i({\varvec{\xi }}_0) \pi _\ell ({\varvec{\xi }}_0) \pi _r({\varvec{\xi }}_0)} \left( I_{i\ell r} - \mu _0\right) \right\} \\&\quad = {\mathbb {E}}\Bigg \{ \left( I_{i\ell r} - \mu _0\right) {\mathbb {E}}\left( \frac{V_i D_{1i}}{\pi _i({\varvec{\xi }}_0)} \bigg | T_i, {\varvec{A}}_i\right) {\mathbb {E}}\left( \frac{V_\ell D_{2\ell }}{\pi _\ell ({\varvec{\xi }}_0)} \bigg |T_\ell , {\varvec{A}}_\ell \right) \\&\qquad \times \, {\mathbb {E}}\left( \frac{V_r D_{3r}}{\pi _r({\varvec{\xi }}_0)} \bigg | T_r, {\varvec{A}}_r\right) \Bigg \} \\&\quad = {\mathbb {E}}\left\{ \rho _{1i} \rho _{2\ell } \rho _{3r}(I_{i\ell r} - \mu _0) \right\} . \end{aligned}$$
PDR estimator
$$\begin{aligned}&{\mathbb {E}}\left\{ D_{ki, \mathrm {PDR}}({\varvec{\xi }}_0)|T_i, {\varvec{A}}_i\right\} \\&\quad = {\mathbb {E}}\left[ {\mathbb {E}}\left\{ \frac{V_i D_{ki}}{\pi _i({\varvec{\xi }}_0)} - \rho _{k(0)i}({\varvec{\xi }}_0)\left( \frac{V_i}{\pi _i({\varvec{\xi }}_0)} - 1\right) \bigg | D_{1i}, D_{2i}, T_i, {\varvec{A}}_i\right\} \bigg | T_i, {\varvec{A}}_i\right] \\&\quad = {\mathbb {E}}\Bigg \{D_{ki} {\mathbb {E}}\left( \frac{V_i}{\pi _i({\varvec{\xi }}_0)} \bigg | D_{1i}, D_{2i}, T_i, {\varvec{A}}_i\right) \\&\qquad - \, \rho _{k(0)i}({\varvec{\xi }}_0) {\mathbb {E}}\left( \frac{V_i}{\pi _i({\varvec{\xi }}_0)} - 1 \bigg | D_{1i}, D_{2i}, T_i, {\varvec{A}}_i\right) \bigg | T_i, {\varvec{A}}_i \Bigg \} \\&\quad = {\mathbb {E}}(D_{ki} | T_i, {\varvec{A}}_i) = \rho _{ki}. \end{aligned}$$
Hence,
$$\begin{aligned}&{\mathbb {E}}\left\{ G_{i\ell r, \mathrm {PDR}}(\mu _0,{\varvec{\xi }}_0)\right\} \\&\quad = {\mathbb {E}}\left\{ D_{1i,\mathrm {PDR}}({\varvec{\xi }}_0) D_{2\ell , \mathrm {PDR}}({\varvec{\xi }}_0) D_{3r, \mathrm {PDR}}({\varvec{\xi }}_0) \left( I_{i\ell r} - \mu _0 \right) \right\} \\&\quad = {\mathbb {E}}\Big [ \left( I_{i\ell r} - \mu _0 \right) {\mathbb {E}}\left\{ D_{1i,\mathrm {PDR}}({\varvec{\xi }}_0) | T_i, {\varvec{A}}_i \right\} {\mathbb {E}}\left\{ D_{2\ell ,\mathrm {PDR}}({\varvec{\xi }}_0) | T_\ell , {\varvec{A}}_\ell \right\} \\&\qquad \times \, {\mathbb {E}}\left\{ D_{3r,\mathrm {PDR}}({\varvec{\xi }}_0) | T_r, {\varvec{A}}_r \right\} \Big ] \\&\quad = {\mathbb {E}}\left\{ \rho _{1i}\rho _{2\ell }\rho _{3r}(I_{i\ell r} - \mu _0) \right\} . \end{aligned}$$
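The key step in the PDR derivation is that the correction term $\rho _{k(0)i}\left( V_i/\pi _i - 1\right) $ has conditional mean zero when $\pi _i$ is correct. A small simulation can illustrate this. The sketch below is only a numerical check under invented settings: the disease and verification models, the sample size, and the deliberately crude constant used in place of $\rho _{k(0)i}$ are all assumptions chosen for illustration, and `pdr_pseudo_indicator` is a hypothetical name.

```python
import numpy as np

def pdr_pseudo_indicator(V, D_k, pi, rho0_k):
    """D_{k,PDR} = V D_k / pi - rho0_k (V / pi - 1), elementwise."""
    V = np.asarray(V, dtype=float)
    # D_k is only observed when V == 1; zero it out elsewhere.
    D_obs = np.where(V == 1, D_k, 0.0)
    return V * D_obs / pi - rho0_k * (V / pi - 1.0)

rng = np.random.default_rng(0)
n = 200_000
T = rng.normal(size=n)                                # placeholder test results
D_k = rng.binomial(1, 1.0 / (1.0 + np.exp(-T)))       # placeholder disease model
# Nonignorable verification: the probability depends on D_k itself.
pi = 1.0 / (1.0 + np.exp(-(0.5 + 0.8 * D_k - 0.5 * T)))
V = rng.binomial(1, pi)

rho0_crude = np.full(n, 0.3)                          # deliberately not the true rho_{k(0)}
pseudo = pdr_pseudo_indicator(V, D_k, pi, rho0_crude)
print(pseudo.mean(), D_k.mean())                      # the two averages should be close
```

In this setup the two averages agree up to Monte Carlo error even though the value used for $\rho _{k(0)i}$ is wrong, which mirrors the cancellation in the derivation above. The check says nothing about what happens when $\pi _i$ itself is misspecified.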

Table 4 Monte Carlo means (MCmean) for the maximum likelihood estimators of the elements of nuisance parameters $\varvec{\lambda }$, $\varvec{\tau }_\pi $, $\varvec{\tau }_{\rho _1}$ and $\varvec{\tau }_{\rho _2}$, when the true missing data mechanism is MAR

Table 5 Monte Carlo means (MCmean), relative bias (Bias), Monte Carlo standard deviations (MCsd) and estimated standard deviations (Esd) for the proposed VUS estimators and the SPE estimator, when the true missing data mechanism is MAR

Table 6 Monte Carlo means (MCmean) for the maximum likelihood estimators of the elements of nuisance parameters $\varvec{\lambda }$, $\varvec{\tau }_\pi $, $\varvec{\tau }_{\rho _1}$ and $\varvec{\tau }_{\rho _2}$, when the estimated models are misspecified

Table 7 Monte Carlo means (MCmean), relative bias (Bias), Monte Carlo standard deviations (MCsd) and estimated standard deviations (Esd) for the proposed VUS estimators and the SPE estimator, when the estimated models are misspecified

Appendix 3

Here, we present results of an additional simulation study that covers two cases: (i) a missing at random (MAR) mechanism for the missingness of the disease status; (ii) model misspecification in the estimation process.

In the study, the diagnostic test T, covariate A and the disease status ${\mathcal {D}}$ are generated as in scenario I of Sect. 4 of the paper. Moreover, the verification status V is:

(i) generated as in scenario I with $h(T,A;{\varvec{\tau }}_\pi )=-1 + T - 1.2A$ and $\lambda _1 = \lambda _2 = 0$, i.e., under the MAR assumption (verification rate roughly equal to 0.57);
(ii) generated as in scenario I, but the models for the verification and disease processes used in the fitting procedure are misspecified: the estimated verification model uses $T^{1/3}$ and $\log |A|$ as predictors instead of T and A, respectively, and the estimated disease model uses $A^{1/3}$ instead of A.
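The two settings can be sketched in code. The Python fragment below is a rough illustration only: it assumes a logistic link for the verification model (the link is not stated in this appendix), uses standard normal draws and uniformly random classes as placeholders for the scenario I distributions of T, A and ${\mathcal {D}}$, and introduces the hypothetical names `verification_prob` and `misspecified_predictors`. With these placeholders the verification rate will not reproduce the value 0.57 quoted above.

```python
import numpy as np

def verification_prob(T, A, D1, D2, lam1=0.0, lam2=0.0):
    """Pr(V = 1 | T, A, D1, D2), assuming a logistic link (an assumption).

    h(T, A; tau_pi) = -1 + T - 1.2 A as in case (i);
    lam1 = lam2 = 0 gives the MAR mechanism.
    """
    eta = -1.0 + T - 1.2 * A + lam1 * D1 + lam2 * D2
    return 1.0 / (1.0 + np.exp(-eta))

def misspecified_predictors(T, A):
    """Transformed predictors used by the misspecified fits of case (ii)."""
    return {
        "verification": np.column_stack([np.cbrt(T), np.log(np.abs(A))]),
        "disease_A": np.cbrt(A),  # replaces A in the disease model
    }

rng = np.random.default_rng(1)
n = 1000
T = rng.normal(size=n)            # placeholder, not the scenario I distribution
A = rng.normal(size=n)            # placeholder, not the scenario I distribution
cls = rng.integers(1, 4, size=n)  # placeholder disease classes 1, 2, 3
D1 = (cls == 1).astype(float)
D2 = (cls == 2).astype(float)

pi_mar = verification_prob(T, A, D1, D2)  # does not depend on D1, D2
V = rng.binomial(1, pi_mar)
X_wrong = misspecified_predictors(T, A)
```

`np.cbrt` is used rather than `T ** (1 / 3)` because it returns the real cube root for negative inputs.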

In both (i) and (ii), the true VUS is 0.791. We consider three different values of sample size, i.e., 250, 500 and 1500. The number of replications in each simulation experiment is set to 1000.

Simulation results are given in Tables 4 and 5 for case (i), and in Tables 6 and 7 for case (ii). As expected, in case (i) the proposed VUS estimators show some bias when compared to the SPE estimator, which is appropriately used here. However, the bias decreases as the sample size increases. In case (ii), all estimators appear to be biased, even when the sample size is large. Moreover, although the bias stays at acceptable levels in the case considered, we expect that, given the nature of the estimators, it could be dramatically higher under other kinds of misspecification.
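For readers who want to reproduce summaries of the kind reported in Tables 5 and 7, the quantities can be computed from the replicated estimates as sketched below. The definitions are assumptions, since this appendix does not spell them out: relative bias is taken as (MCmean minus true value) divided by the true value, MCsd as the sample standard deviation of the estimates, and Esd as the average of the estimated standard deviations. The function name `mc_summary` is hypothetical and the replicates in the usage lines are artificial placeholders, not results from the paper.

```python
import numpy as np

def mc_summary(estimates, estimated_sds, true_value):
    """Monte Carlo summaries of a set of replicated estimates (sketch)."""
    estimates = np.asarray(estimates, dtype=float)
    estimated_sds = np.asarray(estimated_sds, dtype=float)
    mc_mean = estimates.mean()
    return {
        "MCmean": mc_mean,
        "Bias": (mc_mean - true_value) / true_value,  # assumed definition
        "MCsd": estimates.std(ddof=1),
        "Esd": estimated_sds.mean(),                  # assumed definition
    }

# Artificial placeholder replicates, for illustration only.
rng = np.random.default_rng(2)
fake_estimates = rng.normal(loc=0.791, scale=0.05, size=1000)
fake_sds = np.full(1000, 0.05)
summary = mc_summary(fake_estimates, fake_sds, true_value=0.791)
```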

Cite this article

To Duc, K., Chiogna, M., Adimari, G. et al. Estimation of the volume under the ROC surface in presence of nonignorable verification bias. Stat Methods Appl 28, 695–722 (2019). https://doi.org/10.1007/s10260-019-00451-3

Accepted: 13 January 2019
Published: 28 January 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s10260-019-00451-3

Consortia

for the Alzheimer’s Disease Neuroimaging Initiative
