## Abstract

We consider the survival function for univariate right-censored event time data when a cure fraction is present. The population then consists of two parts: the cured or non-susceptible group, who will never experience the event of interest, and the non-cured or susceptible group, who will experience the event of interest when followed up sufficiently long. When modeling the data, a parametric form is often imposed on the survival function of the susceptible group. In this paper, we construct a simple novel test to verify the aptness of the assumed parametric form. To this end, we contrast the parametric fit with the nonparametric fit based on a rescaled Kaplan–Meier estimator. The asymptotic distributions of the two estimators and of the test statistic are established. The latter depends on unknown parameters; hence, a bootstrap procedure is applied to approximate the critical values of the test. An extensive simulation study shows that the developed test performs well in finite samples. To illustrate its practical use, the test is also applied to two real-life data sets.
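As an illustration of the nonparametric side of this comparison, the following is a minimal sketch of a rescaled Kaplan–Meier estimator. It assumes the mixture form \(S(t) = 1 - \phi + \phi S_1(t)\), with \(\phi\) the susceptible proportion estimated by one minus the Kaplan–Meier estimate at the largest uncensored observation; this reading is inferred from the relation between \({{\hat{\phi }}}\) and \({{\hat{S}}}\) used in the appendix and is not quoted from the paper. The function names `kaplan_meier` and `rescaled_km` are hypothetical.

```python
import numpy as np


def kaplan_meier(times, events):
    """Kaplan-Meier estimate evaluated at each distinct uncensored time."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])  # sorted distinct event times
    surv = np.empty(len(event_times))
    s = 1.0
    for k, t in enumerate(event_times):
        at_risk = np.sum(times >= t)                    # still under observation at t
        deaths = np.sum((times == t) & (events == 1))   # events occurring exactly at t
        s *= 1.0 - deaths / at_risk
        surv[k] = s
    return event_times, surv


def rescaled_km(times, events):
    """Rescale the Kaplan-Meier curve so that it drops from 1 to 0.

    Assumption (not taken verbatim from the paper): the susceptible
    proportion is estimated by 1 minus the Kaplan-Meier plateau, and
    S1_hat = (S_hat - (1 - phi_hat)) / phi_hat.
    Requires at least one uncensored observation.
    """
    event_times, surv = kaplan_meier(times, events)
    phi_hat = 1.0 - surv[-1]
    s1_hat = (surv - (1.0 - phi_hat)) / phi_hat
    return event_times, s1_hat, phi_hat


# Toy data: observed times and event indicators (1 = event, 0 = censored).
t, s1, phi = rescaled_km([1, 2, 3, 4, 5], [1, 0, 1, 0, 0])
print(t, s1, phi)
```

The rescaled curve reaches zero at the last uncensored time by construction, so it can be compared directly with a parametric survival function for the susceptible group.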

## References

Amico M, Van Keilegom I (2018) Cure models in survival analysis. Annu Rev Stat Appl 5:311–342

Chen X, Linton O, Van Keilegom I (2003) Estimation of semiparametric models when the criterion function is not smooth. Econometrica 71:1591–1608

Hollander M, Peña E (1992) A chi-squared goodness-of-fit test for randomly censored data. J Am Stat Assoc 87:458–463

Hosmer D, Lemeshow S, May S (2008) Applied survival analysis. Regression modeling of time-to-event data. Wiley, London

Kaplan E, Meier P (1958) Nonparametric estimation from incomplete observations. J Am Stat Assoc 53:457–481

Kim N (2017) Goodness-of-fit tests for randomly censored Weibull distributions with estimated parameters. Commun Stat Appl Methods 24:519–531

Klein J, Moeschberger M (1997) Survival analysis, techniques for censored and truncated data. Springer, Berlin

Koziol J (1980) Goodness-of-fit tests for randomly censored data. Biometrika 67:693–696

Koziol J, Green S (1976) A Cramér-von Mises statistic for randomly censored data. Biometrika 63:465–474

Lo SH, Singh K (1986) The product-limit estimator and the bootstrap: some asymptotic representations. Probab Theory Relat Fields 71:455–465

Maller R, Zhou S (1992) Estimating the proportion of immunes in a censored sample. Biometrika 79:731–739

Maller R, Zhou S (1996) Survival analysis with long term survivors. Wiley, London

Pardo-Fernández J, Van Keilegom I, González-Manteiga W (2007) Goodness-of-fit tests for parametric models in censored regression. Can J Stat 35:249–264

Peng Y, Taylor JMG (2014) Cure models. In: Klein J, van Houwelingen H, Ibrahim JG, Scheike TH (eds) Handbook of survival analysis, handbooks of modern statistical methods series, vol 6. Chapman & Hall, Boca Raton, pp 113–134

Pettitt A, Stephens M (1976) Modified Cramér-von Mises statistics for censored data. Biometrika 63:291–298

Rao C (1965) Linear statistical inference and its applications. Wiley, London

Sánchez-Sellero C, González-Manteiga W, Van Keilegom I (2005) Uniform representation of product-limit integrals with applications. Scand J Stat 32:563–581

Serfling R (1980) Approximation theorems of mathematical statistics. Wiley, London

Van der Vaart A, Wellner J (1996) Weak convergence and empirical processes. Springer, Berlin

## Acknowledgements

This work was supported by the European Research Council [2016-2021, Horizon 2020/ERC Grant No. 694409], by the Interuniversity Attraction Poles Program [IAP-network P7/06] of the Belgian Science Policy Office, and the Research Foundation Flanders (FWO), Scientific Research Community on ‘Asymptotic Theory for Multidimensional Statistics’ [W000817N]. The computational resources and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by the Research Foundation Flanders (FWO) and the Flemish Government—department EWI.

## Appendix: Proofs of the asymptotic results

### Proof

(of Proposition 1) As noted just before the statement of this proposition, the process \(n^{1/2} ({{\hat{S}}}(\cdot )-S(\cdot ))\) is constant from \(\tau _{F_1}\) on. Therefore, it suffices to study this process on \([0,\tau _{F_1}] \cap [0,\infty )\). To this end, we will use the results in Sánchez-Sellero et al. (2005). Although the latter paper considers the case where no cure fraction is present, it is easily seen that the presence of a cure fraction does not alter the results, since inference is based on survival data available on \([0,\tau _H] \cap [0,\infty )\), so the shape of the survival function *S*(*t*) for *t* larger than \(\tau _H\) has no impact on the results. In the absence of covariates and when the data are not subject to truncation, Theorem 1 in the latter paper provides an independent and identically distributed representation for Kaplan–Meier integrals of the form \(\int \varphi (s) \, d({{\hat{S}}}(s)-S(s))\) uniformly over functions \(\varphi \) belonging to VC-subgraph classes of functions. Consider the family \(\{\varphi _t(s) = I(s \le t) : t \in [0,\tau _{F_1}] \cap [0,\infty )\}\). This is a VC-subgraph class, since the collection of all subgraphs of these functions is easily seen to be the set of all rectangles of the form \([0,t] \times [0,1]\) with \(0 \le t \le \tau _{F_1}\), and this is a VC-class of sets, since its VC-index equals 2 and is therefore finite. See Van der Vaart and Wellner (1996), page 141, for more details about VC-subgraph classes. Hence, condition \((\varphi 1)\) in Sánchez-Sellero et al. (2005) is satisfied. Condition \((\varphi 2)\) in the latter paper is easily seen to hold, since the family \(\varphi _t(s)\) is uniformly bounded by the envelope \(I(s \le \tau _{F_1})\), which satisfies the conditions in \((\varphi 2)\) thanks to assumption (A6) and because \(\phi < 1\). Similarly, condition \((\varLambda )\) in Sánchez-Sellero et al. (2005) is satisfied thanks to assumption (A6) and because \(\phi < 1\).
Since \(\int \varphi _t(s) \, d({{\hat{S}}}(s)-S(s)) = {{\hat{S}}}(t) - S(t)\), it follows from Theorem 1 in Sánchez-Sellero et al. (2005) that

\[ {{\hat{S}}}(t) - S(t) = n^{-1} \sum _{i=1}^n \xi ({{\tilde{T}}}_i,\delta _i,t) + r_n(t), \]

where \(\sup _{t \in [0,\tau _{F_1}] \cap [0,\infty )} |r_n(t)| = o_P(n^{-1/2})\), and where the formula of \(\xi ({{\tilde{T}}}_i,\delta _i,t)\) is obtained after some straightforward algebraic calculations. In particular, this shows that the process \(n^{1/2}({{\hat{S}}}(\cdot ) - S(\cdot ))\) converges weakly in \(\ell ^\infty ([0,\tau _{F_1}] \cap [0,\infty ))\), since the class \(\{(y,\delta ) \rightarrow \xi (y,\delta ,t) : t \in [0,\tau _{F_1}] \cap [0,\infty )\}\) is a Donsker class thanks to Lemma 1. \(\square \)
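
The identity \(\int \varphi _t(s) \, d({{\hat{S}}}(s)-S(s)) = {{\hat{S}}}(t) - S(t)\) used in the proof of Proposition 1 can be checked in one line. The sketch below assumes the convention that \({{\hat{S}}}\) and *S* both equal 1 just before time 0, which is not stated explicitly in this appendix:

\[
\int \varphi _t(s) \, d\big ({{\hat{S}}}(s)-S(s)\big )
= \int _{[0,t]} d{{\hat{S}}}(s) - \int _{[0,t]} dS(s)
= \big ({{\hat{S}}}(t) - 1\big ) - \big (S(t) - 1\big )
= {{\hat{S}}}(t) - S(t).
\]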

### Proof

(of Proposition 2) We prove the asymptotic normality of \({{\hat{\theta }}}\) by checking the conditions of Theorems 1 and 2 in Chen et al. (2003). Theorem 1 gives conditions under which \({{\hat{\theta }}}\) is weakly consistent, which is required for the asymptotic normality that is established in Theorem 2. Let us check the conditions of these two theorems, one by one. Condition (1.1) is satisfied by definition of the estimator \({{\hat{\theta }}}\), whereas condition (1.2) holds true thanks to assumption (A2). The continuity of \(M(\theta ,\phi )\) with respect to \(\phi \) at \(\phi =\phi _0\) stated in condition (1.3) is obviously satisfied. For condition (1.4), we know that \({{\hat{\phi }}}-\phi _0 = O_P(n^{-1/2}) = o_P(1)\), because of (3) and Proposition 1. Finally, condition (1.5) holds true thanks to (A7). This shows that \({{\hat{\theta }}}-\theta _0 \rightarrow 0\) in probability.

Next, we verify the conditions of Theorem 2 in Chen et al. (2003). Condition (2.1) is verified by the definition of \({{\hat{\theta }}}\). Condition (2.2) is satisfied thanks to assumption (A3). Next, for condition (2.3) note that

Hence, it is easily seen that (2.3)(i) is satisfied, whereas for (2.3)(ii) we have that

whenever \(\Vert \phi -\phi _0\Vert \le \delta _n\) and \(\Vert \theta -\theta _0\Vert \le \delta _n\) with \(\delta _n \rightarrow 0\). Condition (2.4) follows from the fact that \({{\hat{\phi }}}-\phi _0 = O_P(n^{-1/2}) = o_P(n^{-1/4})\), whereas condition (2.5) follows from (A7). Finally, for (2.6) note that

by Proposition 1, since \({{\hat{\phi }}}-\phi _0 = -({{\hat{S}}}(\tau _{F_1}-) - S(\tau _{F_1}-))\), and this converges to a zero-mean normal distribution with variance-covariance matrix given by *V*. It now follows from the proof of Theorem 2 in Chen et al. (2003) that

which converges in distribution to a normal random variable with mean zero and variance-covariance matrix \(\varDelta ^{-1} V \varDelta ^{-1}\). \(\square \)

### Proof

(of Theorem 1) (a) We decompose our process \(n^{1/2} \big ({{\hat{S}}}_1(\cdot ) - S_{1,{{\hat{\theta }}}}(\cdot )\big )\) under \({{{\mathcal {H}}}}_0\) as follows:

Note that

uniformly in \(t \in B\), provided \({{\hat{\phi }}}-\phi _0 = O_P(n^{-1/2})\) and \(\sup _{t \in B} |{{\hat{S}}}(t) - S(t)| = O_P(n^{-1/2})\), which follows from Proposition 1 and the continuous mapping theorem. This combined with (5) and the linear representation of \({{\hat{\theta }}} - \theta _0\) given in Proposition 2 yields

The result of part (a) now follows by combining (7) and Propositions 1 and 2.

(b) It suffices to show that the class \(\{(y,\delta ) \rightarrow \eta (y,\delta ,t) : t \in B\}\) is Donsker. Since sums of Donsker classes are again Donsker [see Example 2.10.7 in Van der Vaart and Wellner (1996)], we need to show that the classes corresponding to each of the three terms in the definition of \(\eta (y,\delta ,t)\) are Donsker. For the first term, we refer to Lemma 1, and the fact that the function \(\xi (y,\delta ,t)\) is constant for \(t \ge \tau _{F_1}\). The second and third terms are each a product of a bounded function depending on *t* but not on *y* and \(\delta \) (thanks to assumption (A9)), and a function that is independent of *t* and has finite variance (thanks to assumptions (A6) and (A8)). Hence, it is easy to see that these classes are also Donsker. \(\square \)

### Lemma 1

Assume (A4)–(A6). Then, the class \(\{(y,\delta ) \rightarrow \xi (y,\delta ,t) : t \in [0,\tau _{F_1}] \cap [0,\infty )\}\) is a *P*-Donsker class, where *P* is the joint probability measure of \(({{\tilde{T}}},\delta )\).

### Proof

We consider the two terms of the function \(\xi (y,\delta ,t)\) separately, defining two subclasses of functions denoted by \({{{\mathcal {F}}}}_1\) and \({{{\mathcal {F}}}}_2\). For \({{{\mathcal {F}}}}_2\), note that \(\int _0^{y \wedge t} \frac{dH^1(z)}{(1-H(z))^2}\) is an increasing and bounded function of *t* thanks to assumption (A6), and hence by Theorem 2.7.5 in Van der Vaart and Wellner (1996), the class \(\{y \rightarrow \int _0^{y \wedge t} \frac{dH^1(z)}{(1-H(z))^2} : t \in [0,\tau _{F_1}] \cap [0,\infty )\}\) is Donsker. Multiplying these functions by \(1-F(t)\) does not alter the Donsker property, since \(1-F(t)\) is a deterministic and bounded function. Hence, \({{{\mathcal {F}}}}_2\) is Donsker. Next, for the class \({{{\mathcal {F}}}}_1\), note that

thanks to assumption (A6). Hence, for a given \(\varepsilon >0\), we can divide the interval \([0,\tau _{F_1}] \cap [0,\infty )\) into subintervals \([t_j,t_{j+1}]\), \(j=1,\ldots ,K\varepsilon ^{-2}\), such that

for each *j*. This shows that the bracketing number \(N_{[\,]}(\varepsilon ,{{{\mathcal {F}}}}_1,P)\) is bounded by \(K \varepsilon ^{-2}\). Moreover, the envelope function \(I({{\tilde{T}}} \le \tau _{F_1}) \delta (1-H({{\tilde{T}}}))^{-1}\) has a weak second moment thanks to assumption (A6). Hence, the class \({{{\mathcal {F}}}}_1\) is also Donsker (see Theorem 2.5.6 in Van der Vaart and Wellner (1996)). Since sums of Donsker classes are again Donsker [see Example 2.10.7 in Van der Vaart and Wellner (1996)], the result follows. \(\square \)
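
To see why a polynomial bound such as \(K \varepsilon ^{-2}\) on the bracketing number is enough for the entropy-integral condition that bracketing Donsker theorems typically require, the following sketch bounds the integral directly. It assumes \(K \ge 1\) and restricts attention to \(\varepsilon \in (0,1]\); both are simplifications made here for illustration and are not taken from the paper:

\[
\int _0^1 \sqrt{\log \big (K \varepsilon ^{-2}\big )} \, d\varepsilon
\le \sqrt{\log K} + \sqrt{2} \int _0^1 \sqrt{\log (1/\varepsilon )} \, d\varepsilon
= \sqrt{\log K} + \sqrt{2} \, \varGamma (3/2)
= \sqrt{\log K} + \sqrt{\pi /2} < \infty .
\]

The inequality uses \(\sqrt{a+b} \le \sqrt{a} + \sqrt{b}\) for \(a,b \ge 0\), and the remaining integral is evaluated with the substitution \(u = \log (1/\varepsilon )\), which gives \(\int _0^\infty u^{1/2} e^{-u} \, du = \varGamma (3/2) = \sqrt{\pi }/2\).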

### Proof

(of Corollary 1) The proof relies on the Helly–Bray Theorem [see, e.g., p. 97 in Rao (1965)]. For more details, we refer, e.g., to the proof of Corollary 4 in Pardo-Fernández et al. (2007), which establishes the convergence of a Cramér-von Mises statistic with the same structure as ours.

### Proof

(of Corollary 2) Under \({\mathcal {H}}_{1n}\), we consider the following decomposition:

where \({\tilde{\theta }}_{0,n}\) maximizes \(E_{{\mathcal {H}}_{1n}}(\log L(\theta ,\phi _0))\) with respect to \(\theta \), and where \(E_{{\mathcal {H}}_{1n}}\) denotes the expected value under \({\mathcal {H}}_{1n}\). Hence, \(M_{{\mathcal {H}}_{1n}}({\tilde{\theta }}_{0,n},\phi _0) = 0\), where \(M_{{\mathcal {H}}_{1n}}(\theta ,\phi ) = E_{{\mathcal {H}}_{1n}}[m({{\tilde{T}}},\delta ,\theta ,\phi )]\). We will now treat the four terms \(T_1,\ldots ,T_4\) separately.

The term \(T_1\) can be decomposed as follows:

similarly to the proof of Theorem 1. A linear expansion for the expression \({{\hat{S}}}(t) - S(t)\) can be derived in a similar way as in the proof of Proposition 1, except that the data \(({{\tilde{T}}}_i,\delta _i)\), \(i=1,\ldots ,n\), are now a triangular array. This leads to

Next, for \(T_2(t)\) note that \(T_2(t) = - \frac{\partial S_{1,\theta _0}(t)}{\partial \theta } ({\tilde{\theta }}_{0,n}-\theta _0) + o_P(\Vert {\tilde{\theta }}_{0,n}-\theta _0\Vert )\). Note that \(0 = M_{{\mathcal {H}}_{1n}}({\tilde{\theta }}_{0,n},\phi _0) = M_{{\mathcal {H}}_{1n}}(\theta _0,\phi _0) + ({\tilde{\theta }}_{0,n}-\theta _0) \frac{\partial M_{{\mathcal {H}}_{1n}}}{\partial \theta }(\theta _0,\phi _0) + o_P(\Vert {\tilde{\theta }}_{0,n}-\theta _0\Vert )\). Hence, using the notation \(H_{{\mathcal {H}}_{1n}}^j(t) = P_{{\mathcal {H}}_{1n}}({{\tilde{T}}} \le t, \delta =j)\), \(j=0,1\), where \(P_{{\mathcal {H}}_{1n}}\) denotes the probability under \({\mathcal {H}}_{1n}\), we have

where \(E_S[m({{\tilde{T}}},\delta ,\theta _0,\phi _0)]\) denotes the expected value assuming that the survival function of *T* equals *S*, and where

which we abbreviate by \((1-n^{-1/2} c) S_{\theta _0}(t) + n^{-1/2} c {{\tilde{S}}}(t)\). Noting that \(E_{S_{\theta _0}}[m({{\tilde{T}}},\delta ,\theta _0,\phi _0)]=0\), (8) equals

Let us now consider the term \(T_3\). Following similar steps as in the proof of Proposition 2, it follows that

where \(\varDelta _{{\mathcal {H}}_{1n}}\) is the \(p \times p\) matrix of partial derivatives of \(M_{{\mathcal {H}}_{1n}}(\theta ,\phi )\) evaluated at \(({\tilde{\theta }}_{0,n},\phi _0)\), and \(\varGamma _{{\mathcal {H}}_{1n}}({\tilde{\theta }}_{0,n},\phi _0) = \frac{\partial }{\partial \phi } M_{{\mathcal {H}}_{1n}}({\tilde{\theta }}_{0,n},\phi _0)\).

Finally, the term \(T_4\) is a bias term, which cannot be simplified further. This shows the result. \(\square \)

## About this article

### Cite this article

Geerdens, C., Janssen, P. & Van Keilegom, I. Goodness-of-fit test for a parametric survival function with cure fraction. *TEST* **29**, 768–792 (2020). https://doi.org/10.1007/s11749-019-00680-4

### Keywords

- Bootstrap
- Cramér-von Mises
- Cure fraction
- Kaplan–Meier
- Parametric models
- Weak convergence

### Mathematics Subject Classification

- 62N01
- 62N02
- 62N03