Estimation of a Concordance Probability for Doubly Censored Time-to-Event Data

Hayashi, Kenichi; Shimizu, Yasutaka

doi:10.1007/s12561-018-9216-5

Published: 22 March 2018

Statistics in Biosciences, volume 10, pages 546–567 (2018)


Abstract

Evaluating the relationship between a response variable and explanatory variables is important to establish better statistical models. Concordance probability is one measure of this relationship and is often used in biomedical research. Concordance probability can be seen as an extension of the area under the receiver operating characteristic curve. In this study, we propose estimators of concordance probability for time-to-event data subject to double censoring. A doubly censored time-to-event response is observed when either left or right censoring may occur. In the presence of double censoring, existing estimators of concordance probability lack desirable properties such as consistency and asymptotic normality. The proposed estimators consist of estimators of the left-censoring and the right-censoring distributions as a weight for each pair of cases, and reduce to the existing estimators in special cases. We show the statistical properties of the proposed estimators and evaluate their performance via numerical experiments.

References

[1] Chang MN (1990) Weak convergence of a self-consistent estimator of the survival function with doubly censored data. Ann Stat 18:391–404
[2] Chang MN, Yang GL (1987) Strong consistency of a nonparametric estimator of the survival function with doubly censored data. Ann Stat 15:1536–1547
[3] Cook NR (2007) Use and misuse of the receiver operating characteristic curve in risk prediction. Circulation 115:928–935
[4] DeLong ER, DeLong DM, Clarke-Pearson DL (1988) Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44:837–845
[5] Gönen M, Heller G (2005) Concordance probability and discriminatory power in proportional hazards regression. Biometrika 92:965–970
[6] Hara M, Sakata Y, Nakatani D, Suna S, Nishino M, Sato H, Kitamura T, Nanto S, Hamasaki T, Hori M, Komuro I (2016) Subclinical elevation of high-sensitive troponin T levels at the convalescent stage is associated with increased 5-year mortality after ST-elevation myocardial infarction. J Cardiol 67:314–320
[7] Harrell FE, Lee KL, Mark DB (1996) Tutorial in biostatistics: multivariate prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 15:361–387
[8] Hilden J, Gerds TT (2014) A note on the evaluation of novel biomarkers: do not rely on integrated discrimination improvement and net reclassification index. Stat Med 33:3405–3414
[9] Ji S, Peng L, Cheng Y, HuiChuan L (2012) Quantile regression for doubly censored data. Biometrics 68:101–112
[10] Julià O, Gómez G (2011) Simultaneous marginal survival estimators when doubly censored data is present. Lifetime Data Anal 17:347–372
[11] Kyle RT, Rajkumar TV, Offord J, Larson D, Plevak M, Melton LJ III (2002) A long-term study of prognosis in monoclonal gammopathy of undetermined significance. N Engl J Med 346:564–569
[12] Kim Y, Kim B, Jang W (2010) Asymptotic properties of the maximum likelihood estimator for the proportional hazards model with doubly censored data. J Multivar Anal 101:1339–1351
[13] Klein JP, Moeschberger ML (2003) Survival analysis: techniques for censored and truncated data. Springer, New York
[14] Mantel N (1967) Ranking procedures for arbitrarily restricted observation. Biometrics 23:65–78
[15] Nolan D, Pollard D (1987) $U$-processes: rates of convergence. Ann Stat 15:780–799
[16] Nolan D, Pollard D (1988) Functional limit theorems for $U$-processes. Ann Stat 16:1291–1298
[17] Pencina MJ, D’Agostino RB (2004) Overall C as a measure of discrimination in survival analysis: model specific population value and confidence interval estimation. Stat Med 23:2109–2123
[18] Pepe MS (2003) The statistical evaluation of medical tests for classification and prediction. Oxford University Press, New York
[19] Pepe MS, Janes H, Li CI (2014) Net risk reclassification p values: valid or misleading? J Natl Cancer Inst 106:dju041
[20] Pepe MS, Thompson LT (2000) Combining diagnostic test results to increase accuracy. Biostatistics 1:123–140
[21] Peto R (1973) Experimental survival curves for interval-censored data. J R Stat Soc C 22:86–91
[22] R Core Team (2014) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/
[23] Tsai WY, Crowley J (1985) A large sample study of generalized maximum likelihood estimators from incomplete data via self-consistency. Ann Stat 13:1317–1334
[24] Turnbull BW (1974) Nonparametric estimation of a survivorship function with doubly censored data. J Am Stat Assoc 69:169–173
[25] Turnbull BW (1976) The empirical distribution function with arbitrarily grouped censored and truncated data. J R Stat Soc B 38:290–295
[26] Uno H, Cai H, Pencina MJ, D’Agostino RB, Wei LJ (2011) On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat Med 30:1105–1117
[27] Zhang C-H, Li X (1996) Linear regression with doubly censored data. Ann Stat 24:2720–2743

Acknowledgements

The authors are grateful to Drs. Yasuhiko Sakata, Daisaku Nakatani, and Yasushi Sakata for allowing us to use the dataset analyzed in [6]. The authors also thank the associate editor and the anonymous reviewers for their very useful comments and suggestions, which improved the presentation of the paper. K. Hayashi is supported by JSPS KAKENHI (Grant-in-Aid for Scientific Research) Grant Number 15K15950. Y. Shimizu is supported by JSPS KAKENHI (Grant-in-Aid for Scientific Research) Grant Number 70423085.

Author information

Authors and Affiliations

Department of Mathematics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522, Japan
Kenichi Hayashi
Department of Applied Mathematics, Waseda University, 3-4-1 Okubo, Tokyo, 169-8555, Japan
Yasutaka Shimizu

Corresponding author

Correspondence to Kenichi Hayashi.

Appendix

We shall show the consistency and asymptotic normality of $\widehat{C}_{\mathrm{D}}^{(1)}(\hat{h})$ in this section. Since we can also show these properties for $\widehat{C}_{\mathrm{D}}^{(2)}(h)$ in the same way, the corresponding proofs are omitted.

To show the weak consistency, note that

$$\begin{aligned}&\mathrm{E}\left[ \varGamma _i\varDelta _i\varGamma _j\left\{ F(X_i)\bar{G}(X_i)^2F(X_j)\right\} ^{-1}\mathbb {I}_{{\varvec{\tau }}}\left\{ X_i<X_j\right\} \right] \\&\quad = \mathrm{E}\left[ \mathbb {I}_{{\varvec{\tau }}}\left\{ L_i<T_i\right\} \mathbb {I}_{{\varvec{\tau }}}\left\{ T_i<R_i\right\} \mathbb {I}_{{\varvec{\tau }}}\left\{ L_j<T_j\right\} \left\{ F(X_i)\bar{G}(X_i)^2F(X_j)\right\} ^{-1}\mathbb {I}_{{\varvec{\tau }}}\left\{ T_i<\widetilde{X}_j\right\} \right] \\&\quad = \mathrm{E}\left[ \mathbb {I}_{{\varvec{\tau }}}\left\{ L_i<T_i\right\} \mathbb {I}_{{\varvec{\tau }}}\left\{ T_i<R_i\right\} \mathbb {I}_{{\varvec{\tau }}}\left\{ L_j<\widetilde{X}_j\right\} \left\{ F(T_i)\bar{G}(T_i)^2F(\widetilde{X}_j)\right\} ^{-1}\mathbb {I}_{{\varvec{\tau }}}\left\{ T_i<\widetilde{X}_j\right\} \right] \\&\quad = \mathrm{E}\Big [\left\{ F(T_i)\bar{G}(T_i)^2F(\widetilde{X}_j)\right\} ^{-1}\mathbb {I}_{{\varvec{\tau }}}\left\{ T_i<\widetilde{X}_j\right\} \\&\qquad \times \mathrm{E}\left[ \mathbb {I}_{{\varvec{\tau }}}\left\{ L_i<T_i\right\} \mathbb {I}_{{\varvec{\tau }}}\left\{ T_i<R_i\right\} \mathbb {I}_{{\varvec{\tau }}}\left\{ L_j<\widetilde{X}_j\right\} \Big |T_i,\widetilde{X}_j\right] \Big ]\\&\quad = \mathrm{E}\left[ \left\{ F(T_i)\bar{G}(T_i)^2F(\widetilde{X}_j)\right\} ^{-1}\mathbb {I}_{{\varvec{\tau }}}\left\{ T_i<\widetilde{X}_j\right\} F(T_i)\bar{G}(T_i)F(\widetilde{X}_j)\right] \ \ (\text{by independence})\\&\quad = \mathrm{E}\left[ \bar{G}(T_i)^{-1}\bar{G}(T_i)\mathbb {I}_{{\varvec{\tau }}}\left\{ T_i<T_j\right\} \right] = \mathrm{E}\left[ \mathbb {I}_{{\varvec{\tau }}}\left\{ T_i<T_j\right\} \right] , \end{aligned}$$

where $\widetilde{X}_j=\min (T_j,R_j)$. Then, the denominator of $\widehat{C}_{\mathrm{D}}^{(1)}(h)$,

$\displaystyle \frac{1}{n(n-1)}\sum _{i,j}\varGamma _i\varDelta _i\varGamma _j\mathbb {I}_{{\varvec{\tau }}}\left\{ X_i<X_j\right\} \left( \widehat{F}(X_i)\widehat{\bar{G}}(X_i)^2\widehat{F}(X_j)\right) ^{-1}$, converges to $\mathrm{E}\, [ \mathbb {I}_{{\varvec{\tau }}}\{T_1<T_2\}]$ in probability by the uniform consistency of $\widehat{F}(\cdot )$ and $\widehat{\bar{G}}(\cdot )$ [2] and a law of large numbers for U-processes [15]. Similarly, it can also be shown that

$$\begin{aligned}&\frac{1}{n(n-1)}\sum _{i,j}\varGamma _i\varDelta _i\varGamma _j\mathbb {I}\left\{ \hat{h}({\varvec{Z}}_i)>\hat{h}({\varvec{Z}}_j)\right\} \mathbb {I}_{{\varvec{\tau }}}\left\{ X_i<X_j\right\} \left( \widehat{F}(X_i)\widehat{\bar{G}}(X_i)^2\widehat{F}(X_j)\right) ^{-1} \\&\quad \overset{p}{\rightarrow }\mathrm{E}\left[ \mathbb {I}\left\{ h^*({\varvec{Z}}_1)>h^*({\varvec{Z}}_2)\right\} \mathbb {I}_{{\varvec{\tau }}}\left\{ T_1<T_2\right\} \right] . \end{aligned}$$

Then, the consistency of $\widehat{C}_{\mathrm{D}}^{(1)}(\hat{h})$ follows from Slutsky’s lemma.
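The unbiasedness calculation at the start of this argument can be checked numerically. The following Python sketch is an illustration only and is not part of the original proof: the uniform distributions for $T$, $L$ and $R$, the function name `ipw_pair_mean`, and the observation rule $X=\max \{L,\min (T,R)\}$ are assumptions, chosen so that $F$ and $\bar{G}$ are known in closed form and bounded away from zero. The truncation ${\varvec{\tau }}$ is assumed to be inactive, so the weighted pair indicator should average to $\mathrm{P}[T_1<T_2]=1/2$.

```python
import numpy as np


def ipw_pair_mean(n_pairs=400_000, seed=0):
    """Monte Carlo mean of Gamma_i * Delta_i * Gamma_j * W_ij * I{X_i < X_j}.

    Hypothetical design (an assumption made for this illustration):
    T ~ U(1, 2), L ~ U(0, 1.5), R ~ U(1.5, 3), mutually independent,
    so that L <= R always holds and the true F, Gbar are known.
    """
    rng = np.random.default_rng(seed)
    # Row 0 holds case i, row 1 holds case j of each independent pair.
    t = rng.uniform(1.0, 2.0, size=(2, n_pairs))
    l = rng.uniform(0.0, 1.5, size=(2, n_pairs))
    r = rng.uniform(1.5, 3.0, size=(2, n_pairs))

    # Observed time under double censoring (assumed observation rule).
    x = np.maximum(l, np.minimum(t, r))
    gamma = l < np.minimum(t, r)   # not left censored
    delta = t < r                  # not right censored

    def F(s):
        # P(L <= s) for L ~ U(0, 1.5)
        return np.clip(s / 1.5, 0.0, 1.0)

    def Gbar(s):
        # P(R > s) for R ~ U(1.5, 3)
        return np.clip((3.0 - s) / 1.5, 0.0, 1.0)

    xi, xj = x
    keep = gamma[0] & delta[0] & gamma[1] & (xi < xj)

    # Weight {F(X_i) Gbar(X_i)^2 F(X_j)}^{-1} on contributing pairs, 0 elsewhere.
    w = np.zeros(n_pairs)
    w[keep] = 1.0 / (F(xi[keep]) * Gbar(xi[keep]) ** 2 * F(xj[keep]))
    return w.mean()
```

Under these assumptions `ipw_pair_mean()` should return a value close to 0.5. The squared $\bar{G}(X_i)$ in the weight corrects both for $T_i<R_i$ and for $T_i<R_j$; using a single power would leave the second event uncorrected and, in this design, pull the average below 0.5.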

To show the asymptotic normality of $\widehat{C}_{\mathrm{D}}^{(1)}(\hat{h})$, we first consider the asymptotic behavior of the estimator with fixed $h=h(\cdot ;{\varvec{\beta }})$. For notational brevity, the following symbols are used hereafter:

$$\begin{aligned}&\varLambda _{ij}=\varGamma _i\varDelta _i\varGamma _j,\ J_{ij}(h)=\mathbb {I}\left\{ h({\varvec{Z}}_i)>h({\varvec{Z}}_j)\right\} ,\ I_{ij}^{{\varvec{\tau }}}=\mathbb {I}_{{\varvec{\tau }}}\left\{ X_i<X_j\right\} ,\\&W_{ij}=\left\{ F(X_i)\bar{G}(X_i)^2F(X_j)\right\} ^{-1}, \widehat{W}_{ij}=\left\{ \widehat{F}(X_i)\widehat{\bar{G}}(X_i)^2\widehat{F}(X_j)\right\} ^{-1}. \end{aligned}$$
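In this notation, the estimator is the ratio $\sum _{i,j}\varLambda _{ij}I_{ij}^{{\varvec{\tau }}}J_{ij}(h)\widehat{W}_{ij}\big /\sum _{i,j}\varLambda _{ij}I_{ij}^{{\varvec{\tau }}}\widehat{W}_{ij}$. A minimal Python sketch of that ratio follows, for illustration only. The function name `concordance_dc`, its argument names, and the reading of $\mathbb {I}_{{\varvec{\tau }}}\{X_i<X_j\}$ as $\mathbb {I}\{X_i<X_j,\ X_i<\tau _{\mathrm{U}},\ \tau _{\mathrm{L}}<X_j\}$ are assumptions (the last is inferred from the event $T_1<T_2$, $T_1<\tau _{\mathrm{U}}$, $\tau _{\mathrm{L}}<T_2$ that defines the target probability). The estimators $\widehat{F}$ and $\widehat{\bar{G}}$ are not constructed here and must be supplied as vectorized callables.

```python
import numpy as np


def concordance_dc(x, gamma, delta, score, F_hat, Gbar_hat,
                   tau_l=0.0, tau_u=np.inf):
    """Weighted concordance ratio for doubly censored data (sketch).

    x        : observed times X_i
    gamma    : 1 if case i is not left censored, else 0
    delta    : 1 if case i is not right censored, else 0
    score    : predictor values h(Z_i)
    F_hat    : callable, estimate of the left-censoring distribution function
    Gbar_hat : callable, estimate of the right-censoring survival function
    """
    x = np.asarray(x, dtype=float)
    gamma = np.asarray(gamma, dtype=bool)
    delta = np.asarray(delta, dtype=bool)
    score = np.asarray(score, dtype=float)

    # Lambda_ij = Gamma_i * Delta_i * Gamma_j
    lam = (gamma & delta)[:, None] & gamma[None, :]
    # Assumed form of the truncated indicator I_tau{X_i < X_j}
    i_tau = ((x[:, None] < x[None, :])
             & (x[:, None] < tau_u)
             & (x[None, :] > tau_l))
    # J_ij(h) = I{h(Z_i) > h(Z_j)}
    j_ind = score[:, None] > score[None, :]

    f = np.asarray(F_hat(x), dtype=float)
    g = np.asarray(Gbar_hat(x), dtype=float)
    inv_w = (f * g ** 2)[:, None] * f[None, :]

    # W_hat_ij on contributing pairs only; all other entries stay 0.
    use = lam & i_tau
    w = np.zeros_like(inv_w)
    np.divide(1.0, inv_w, out=w, where=use)

    den = w.sum()
    if den == 0.0:
        return float("nan")   # no usable pair
    return float((w * j_ind).sum() / den)
```

With no censoring and both callables returning 1, the ratio reduces to the proportion of time-ordered pairs in which the earlier case has the larger score, which matches the remark in the abstract that the proposed estimators reduce to existing ones in special cases.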

For $C^*(h)=\mathrm{P}\left[ h({\varvec{Z}}_1)>h({\varvec{Z}}_2)|T_1<T_2, T_1<\tau _{\mathrm{U}}, \tau _{\mathrm{L}}<T_2\right] $, note that

$$\begin{aligned} \mathcal {W}(h):= & {} \sqrt{n}\left( \widehat{C}_{\mathrm{D}}^{(1)}(h)-C^*(h)\right) \\= & {} \sqrt{n}\left( \frac{\sum _{i,j} \varLambda _{ij}I_{ij}^{{\varvec{\tau }}}\{J_{ij}(h)-C^*(h)\} W_{ij}}{\sum _{i,j}\varLambda _{ij}I_{ij}^{{\varvec{\tau }}}\widehat{W}_{ij}}\right) \\&+ \,\sqrt{n}\left( \frac{\sum _{i,j} \varLambda _{ij}I_{ij}^{{\varvec{\tau }}}\{J_{ij}(h)-C^*(h)\} (\widehat{W}_{ij}-W_{ij})}{\sum _{i,j}\varLambda _{ij}I_{ij}^{{\varvec{\tau }}}\widehat{W}_{ij}}\right) \\=: & {} \mathcal {W}_1(h) + \mathcal {W}_2(h). \end{aligned}$$
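The second equality is a short rearrangement once $\widehat{C}_{\mathrm{D}}^{(1)}(h)$ is written as the ratio of the two weighted pair sums used in the consistency argument. The intermediate step, spelled out here as a reading aid rather than as part of the original proof, is

$$\begin{aligned} \widehat{C}_{\mathrm{D}}^{(1)}(h)-C^*(h)&= \frac{\sum _{i,j}\varLambda _{ij}I_{ij}^{{\varvec{\tau }}}J_{ij}(h)\widehat{W}_{ij}}{\sum _{i,j}\varLambda _{ij}I_{ij}^{{\varvec{\tau }}}\widehat{W}_{ij}}-C^*(h) = \frac{\sum _{i,j}\varLambda _{ij}I_{ij}^{{\varvec{\tau }}}\{J_{ij}(h)-C^*(h)\}\widehat{W}_{ij}}{\sum _{i,j}\varLambda _{ij}I_{ij}^{{\varvec{\tau }}}\widehat{W}_{ij}},\\ \widehat{W}_{ij}&= W_{ij}+(\widehat{W}_{ij}-W_{ij}). \end{aligned}$$

Substituting the second line into the numerator of the first and multiplying by $\sqrt{n}$ gives the two terms $\mathcal {W}_1(h)$ and $\mathcal {W}_2(h)$.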

Note that $\mathcal {W}_1(h)$ captures the variability of the U-statistic itself, whereas $\mathcal {W}_2(h)$ captures the variability due to the estimated weights $\widehat{W}_{ij}$. It follows from the uniform consistency of $\widehat{F}$ and $\widehat{\bar{G}}$ and a functional limit theorem for U-processes [16] that

$$\begin{aligned} \mathcal {W}_1(h) = n^{-3/2}\frac{\sum _{i,j}\varLambda _{ij}I_{ij}^{{\varvec{\tau }}}(J_{ij}(h)-C^*(h))W_{ij}}{p({\varvec{\tau }})} + o_p(1),\quad n\rightarrow \infty , \end{aligned} \tag{4}$$

where $p({\varvec{\tau }})=\mathrm{P}\left[ T_1<T_2,T_1<\tau _{\mathrm{U}}, \tau _{\mathrm{L}}<T_2\right] $. Moreover, note that

$$\begin{aligned} \mathcal {W}_2(h) = \int _{0}^{\tau _{\mathrm{U}}}\int _{\tau _{\mathrm{L}}}^{M}\sqrt{n}\left( \frac{\widehat{W}(s,t)}{W(s,t)}-1\right) \mathrm{d}{\hat{\gamma }}(s,t,h), \end{aligned} \tag{5}$$

where $W(s,t)=\left( F(s)\bar{G}(s)^2F(t)\right) ^{-1}$, $\widehat{W}(s,t)=\left( \widehat{F}(s)\widehat{\bar{G}}(s)^2\widehat{F}(t)\right) ^{-1}$, and

$$\begin{aligned} {\hat{\gamma }}(s,t,h)=\sum _{i,j}\varLambda _{ij}I_{ij}^{{\varvec{\tau }}}(J_{ij}(h)-C^*(h))W_{ij}\mathbb {I}\left\{ X_i\le s,X_j\le t\right\} \Big /\sum _{i,j}\varLambda _{ij}I_{ij}^{{\varvec{\tau }}}\widehat{W}_{ij}. \end{aligned}$$
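To see that (5) agrees with the definition of $\mathcal {W}_2(h)$, note that ${\hat{\gamma }}(\cdot ,\cdot ,h)$ is a step function that jumps only at the observed points $(X_i,X_j)$. The following short computation is added as a reading aid; it assumes that every pair with $I_{ij}^{{\varvec{\tau }}}=1$ falls inside the integration region $[0,\tau _{\mathrm{U}}]\times [\tau _{\mathrm{L}},M]$.

$$\begin{aligned} \int _{0}^{\tau _{\mathrm{U}}}\int _{\tau _{\mathrm{L}}}^{M}\sqrt{n}\left( \frac{\widehat{W}(s,t)}{W(s,t)}-1\right) \mathrm{d}{\hat{\gamma }}(s,t,h)&= \sqrt{n}\,\frac{\sum _{i,j}\varLambda _{ij}I_{ij}^{{\varvec{\tau }}}(J_{ij}(h)-C^*(h))W_{ij}\left( \widehat{W}_{ij}/W_{ij}-1\right) }{\sum _{i,j}\varLambda _{ij}I_{ij}^{{\varvec{\tau }}}\widehat{W}_{ij}}\\&= \sqrt{n}\,\frac{\sum _{i,j}\varLambda _{ij}I_{ij}^{{\varvec{\tau }}}(J_{ij}(h)-C^*(h))(\widehat{W}_{ij}-W_{ij})}{\sum _{i,j}\varLambda _{ij}I_{ij}^{{\varvec{\tau }}}\widehat{W}_{ij}}, \end{aligned}$$

since $W(X_i,X_j)=W_{ij}$ and $\widehat{W}(X_i,X_j)=\widehat{W}_{ij}$.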

By a uniform law of large numbers for U-processes [15] and the uniform consistency of $\widehat{F}$ and $\widehat{\bar{G}}$, we obtain that

$$\begin{aligned} \sup \{ |{\hat{\gamma }}(s,t,h)-\gamma (s,t,h)|\,:\, s\in [0,\tau _{\mathrm{U}}], t\in [\tau _{\mathrm{L}},M], {\varvec{\beta }}\}\overset{p}{\rightarrow }0, \end{aligned} \tag{6}$$

where $\gamma (s,t,h)=p((s,t)')(\widehat{C}_{\mathrm{D}}^{(1)}(h)-C^*(h))/p({\varvec{\tau }})$.

Therefore, the asymptotic distribution of $\widehat{C}_{\mathrm{D}}^{(1)}(h)$ is obtained if we show that $\sqrt{n}\left( \frac{\widehat{W}(s,t)}{W(s,t)}-1\right) $ converges in distribution to a zero-mean Gaussian process (indexed by $s$ and $t$).

Note that

$$\begin{aligned} \sqrt{n}\left( \frac{\widehat{W}(s,t)}{W(s,t)}-1\right)= & {} -\sqrt{n}\left( \frac{\widehat{F}(s)-F(s)}{F(s)}\right) \frac{F(s)\bar{G}(s)^2F(t)}{\widehat{F}(s)\widehat{\bar{G}}(s)^2\widehat{F}(t)}\\&-\,\sqrt{n}\left( \frac{\widehat{\bar{G}}(s)^2-\bar{G}(s)^2}{\bar{G}(s)^2}\right) \frac{\bar{G}(s)^2F(t)}{\widehat{\bar{G}}(s)^2\widehat{F}(t)}\\&-\,\sqrt{n}\left( \frac{\widehat{F}(t)-F(t)}{F(t)}\right) \frac{F(t)}{\widehat{F}(t)}\\= & {} -\sqrt{n}\left( \frac{\widehat{F}(s)-F(s)}{F(s)} +2\frac{\widehat{\bar{G}}(s)-\bar{G}(s)}{\bar{G}(s)} +\frac{\widehat{F}(t)-F(t)}{F(t)}\right) \\&+\,\, o_p(1). \end{aligned}$$

As noted in the Remark, $\sqrt{n}(\widehat{F}-F,\widehat{\bar{G}}-\bar{G})$ jointly converges in distribution to a two-dimensional zero-mean Gaussian process. Hence every finite-dimensional distribution of $\sqrt{n}(\widehat{F}-F,\widehat{\bar{G}}-\bar{G})$ converges to the corresponding multivariate normal distribution. To complete the proof, we consider the asymptotic expansion of $\mathcal {W}(\hat{h})$ with respect to $\hat{{\varvec{\beta }}}$:

$$\begin{aligned} \mathcal {W}(\hat{h})=\mathcal {W}(h^*)+\nabla C^*(h^*)^\top \sqrt{n}(\hat{{\varvec{\beta }}}-{\varvec{\beta }}^*)+o_p(1), \end{aligned}$$

where $\nabla C^*(h^*)$ is the gradient of $C^*(h)$ with respect to ${\varvec{\beta }}$ evaluated at ${\varvec{\beta }}={\varvec{\beta }}^*$. By the same argument as in the Appendix of [26] (see especially (A1) and (A2) there), we conclude that ${{\mathcal {W}}}(\hat{h})$ is asymptotically normal.

About this article

Cite this article

Hayashi, K., Shimizu, Y. Estimation of a Concordance Probability for Doubly Censored Time-to-Event Data. Stat Biosci 10, 546–567 (2018). https://doi.org/10.1007/s12561-018-9216-5

Received: 26 December 2016
Accepted: 15 March 2018
Published: 22 March 2018
Issue Date: December 2018
DOI: https://doi.org/10.1007/s12561-018-9216-5
