Bootstrap and permutation rank tests for proportional hazards under right censoring

Abstract

We address the problem of testing for proportional hazards in the two-sample survival setting with right censoring, i.e., we check whether the well-known Cox model holds. Although there are many test proposals for this problem, only a few papers suggest how to improve the performance for small sample sizes. In this paper, we do exactly this by carrying out our test as a permutation test as well as a wild bootstrap test. The asymptotic properties of our test, namely asymptotic exactness under the null hypothesis and consistency, carry over to both resampling versions. Various simulations for small sample sizes reveal an actual improvement of the empirical size and a reasonable power performance when using the resampling versions. Moreover, the resampling tests perform better than the existing tests of Gill and Schumacher (1987) and Grambsch and Therneau (1994). The tests’ practical applicability is illustrated by discussing real data examples.

References

  1. Andersen PK (1983) Comparing survival distributions via hazard ratio estimates. Scand J Stat 10:77–85

  2. Andersen PK, Borgan Ø, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer, New York

  3. Begun JM, Reid N (1983) Estimating the relative risk with censored data. J Am Stat Assoc 78:337–341

  4. Benjamini Y, Yekutieli D (2001) The control of the false discovery rate in multiple testing under dependence. Ann Stat 29:1165–1188

  5. Beyersmann J, Di Termini S, Pauly M (2013) Weak convergence of the wild bootstrap for the Aalen–Johansen estimator of the cumulative incidence function of a competing risk. Scand J Stat 40:387–402

  6. Bluhmki T, Dobler D, Beyersmann J, Pauly M (2019) The wild bootstrap for multivariate Nelson–Aalen estimators. Lifetime Data Anal 25:97–127

  7. Collett D (2015) Modelling survival data in medical research. Wiley, New York

  8. Cox DR (1972) Regression models and life-tables. J R Stat Soc Ser B 34:187–220

  9. Chen W, Wang D, Li Y (2015) A class of tests of proportional hazards assumption for left-truncated and right-censored data. J Appl Stat 42:2307–2320

  10. Chung EY, Romano JP (2013) Exact and asymptotically robust permutation tests. Ann Stat 41:484–507

  11. Efron B (1979) Bootstrap methods: another look at the jackknife. Ann Stat 7:1–26

  12. Fleming TR, O’Fallon JR, O’Brien PC, Harrington DP (1980) Modified Kolmogorov–Smirnov test procedures with application to arbitrarily right-censored data. Biometrics 36:607–625

  13. Gill RD (1980) Censoring and stochastic integrals (Ph.D. thesis). Mathematical Centre Tracts 124. Mathematisch Centrum, Amsterdam, V

  14. Gill RD, Schumacher M (1987) A simple test of proportional hazards assumption. Biometrika 74:289–300

  15. Grambsch PM, Therneau TM (1994) Proportional hazards tests and diagnostics based on weighted residuals. Biometrika 81:515–526

  16. Hájek J, Šidák Z, Sen PK (1999) Theory of rank tests, 2nd edn. Academic Press Inc, San Diego

  17. Hall P, Heyde CC (1980) Martingale limit theory and its application. Academic Press, New York

  18. Hall WJ, Wellner JA (1980) Confidence bands for a survival curve from censored data. Biometrika 67:133–143

  19. Harrington DP, Fleming TR (1982) A class of rank test procedures for censored survival data. Biometrika 69:553–565

  20. Hess KR (1995) Graphical methods for assessing violations of the proportional hazards assumption in Cox regression. Stat Med 14:1707–1723

  21. Heller G, Venkatraman ES (1996) Resampling procedures to compare two survival distributions in the presence of right-censored data. Biometrics 52:1204–1213

  22. Hothorn T, Hornik K, van de Wiel MA, Zeileis A (2006) A Lego system for conditional inference. Am Stat 60:257–263

  23. Jacod J, Shiryaev AN (2003) Limit theorems for stochastic processes, 2nd edn. Springer, Berlin

  24. Janssen A (1997) Studentized permutation tests for non-iid hypotheses and the generalized Behrens–Fisher problem. Stat Probab Lett 36:9–21

  25. Janssen A (2005) Resampling student’s t-type statistics. Ann Inst Stat Math 57:507–529

  26. Janssen A, Mayer C-D (2001) Conditional studentized survival tests for randomly censored models. Scand J Stat 28:283–293

  27. Janssen A, Neuhaus G (1997) Two-sample rank tests for censored data with non-predictable weights. J Stat Plan Inference 60:45–59

  28. Janssen A, Pauls T (2003) How do bootstrap and permutation tests work? Ann Stat 31:768–806

  29. Janssen A, Werft W (2004) A survey about the efficiency of two-sample survival tests for randomly censored data. Mitt Math Semin Gießen 254:1–47

  30. Kirk AP, Jain S, Pocock S, Thomas HC, Sherlock S (1980) Late results of the Royal Free Hospital prospective controlled trial of prednisolone therapy in hepatitis B surface antigen negative chronic active hepatitis. Gut 21:78–83

  31. Konietschke F, Pauly M (2012) A studentized permutation test for the nonparametric Behrens–Fisher problem in paired data. Electron J Stat 6:1358–1372

  32. Kraus D (2009) Checking proportional rates in the two-sample transformation model. Kybernetika 2:261–278

  33. Lin DY (1991) Goodness-of-fit analysis for the Cox regression model based on a class of parameter estimators. J Am Stat Assoc 86:725–728

  34. Lin D (1997) Non-parametric inference for cumulative incidence functions in competing risks studies. Stat Med 16:901–910

  35. Lin DY, Wei LJ, Ying Z (1993) Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika 80:557–572

  36. Liu RY (1988) Bootstrap procedures under some non-i.i.d. models. Ann Stat 16:1696–1708

  37. Mammen E (1992) When does bootstrap work? Asymptotic results and simulations. Springer, New York

  38. Mantel N (1966) Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemoth Rep 50:163–170

  39. Neuhaus G (1993) Conditional rank tests for two-sample problem under random censorship. Ann Stat 21:1760–1779

  40. Omelka M, Pauly M (2012) Testing equality of correlation coefficients in two populations via permutation methods. J Stat Plann Inf 142:1396–1406

  41. Pauly M (2011) Discussion about the quality of F-ratio resampling tests for comparing variances. TEST 20:163–179

  42. Pauly M (2011) Weighted resampling of martingale difference arrays with applications. Electron J Stat 5:41–52

  43. Pauly M, Brunner E, Konietschke F (2015) Asymptotic permutation tests in general factorial designs. J R Stat Soc B 77:461–473

  44. Peto R, Peto J (1972) Asymptotically efficient rank invariant test procedures (with discussion). J R Stat Soc A 135:185–206

  45. Prentice RL (1978) Linear rank tests with right censored data. Biometrika 65:167–179

  46. R Core Team (2019) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/. Accessed 14 June 2019

  47. Scheike TH, Martinussen T (2004) On estimation and tests of time-varying effects in the proportional hazards model. Scand J Stat 31:51–62

  48. Schoenfeld D (1982) Partial residuals for the proportional hazards regression model. Biometrika 69:239–241

  49. Schumacher M (1984) Two-sample tests of Cramér–von Mises- and Kolmogorov–Smirnov-type for randomly censored data. Int Stat Rev/Revue Internationale de Statistique 52:263–281

  50. Sengupta D, Bhattacharjee A, Rajeev B (2004) Testing for the proportionality of hazards in two samples against the increasing cumulative hazard ratio alternative. Scand J Stat 31:51–62

  51. Wei LJ (1984) Goodness of fit for proportional hazards model with censored observations. J Am Stat Assoc 79:649–652

  52. Wu C-FJ (1986) Jackknife, bootstrap and other resampling methods in regression analysis. Ann Stat 14:1261–1350

Acknowledgements

The authors thank two referees and an associate editor for increasing the paper’s quality by their helpful comments. Funding was provided by Deutsche Forschungsgemeinschaft (Grant No. PA-2409 5-1).

Author information

Corresponding author

Correspondence to Marc Ditzhaus.

Appendix: Proofs

In the following we give all the proofs. Considering appropriate subsequences, we can assume without loss of generality that \(n_1/n\rightarrow \kappa _1\in (0,1)\) and \(n_2/n\rightarrow \kappa _2=1-\kappa _1\). Note that the final statements of all our theorems depend neither on \(\kappa _1\) and \(\kappa _2\) nor on the considered subsequence. For the reader’s convenience, we present some known technical results before giving the actual proofs.

Preliminaries

One basic tool for our proofs is the theory of discrete martingale limit theorems, see Hall and Heyde (1980) and the references therein. For our purposes, a simplified version of Theorem 8.3.33 from Jacod and Shiryaev (2003) is sufficient, see also their Theorem 2.4.36.

Proposition 1

[cf. Jacod and Shiryaev (2003)] For each \(n\in { {\mathbb {N}} }\) let \((\xi _{n,i})_{1\le i \le n}\) be a martingale difference scheme with respect to some filtration \(({\mathscr {F}}_{n,i})_{0\le i \le n}\), i.e., \({ E }(\xi _{n,i}\vert {\mathscr {F}}_{n,i-1})=0\) for every \(1\le i \le n\). Assume that the conditional Lindeberg condition, \(\sum _{i=1}^{n} { E }(\xi _{n,i}^2{\mathbf {1}}\{|\xi _{n,i}|\ge \varepsilon \}\vert \mathscr {F}_{n,i-1})\rightarrow 0\) in probability for all \(\varepsilon >0\), is fulfilled. Define \(M_n\) by \(M_n(t)=\sum _{i=1}^{[nt]}\xi _{n,i}\) \((t\in [0,1])\). Suppose that the predictable quadratic variation process \(\langle M_n\rangle \) given by \(\langle M_n\rangle (t)=\sum _{i=1}^{[nt]} { E }(\xi _{n,i}^2\vert \mathscr {F}_{n,i-1})\) \((t\in [0,1])\) converges in probability pointwise to a continuous function \(\sigma ^2:[0,1]\rightarrow [0,\infty )\). Then \(M_n\) converges in distribution on the Skorohod space D[0, 1] to the rescaled Brownian motion \(B\circ \sigma ^2\), where B denotes a standard Brownian motion.

Proposition 2

[cf. Janssen and Mayer (2001)] Suppose that all assumptions of the paper hold. Let \({{\widetilde{\delta }}}_n=({\widetilde{\delta }}_{n,1},\ldots ,{{\widetilde{\delta }}}_{n,n})\in \{0,1\}^n\) and \(w_n=(w_{n,1},\ldots ,w_{n,n})\in { {\mathbb {R}} }^n\) such that \(\lim _{n\rightarrow \infty }n^{-1}\sum _{i=1}^n{{\widetilde{\delta }}}_{n,i}w_{n,i}^2\in (0,\infty )\) and \(\max \{|w_{n,i}|:i=1,\ldots ,n\}\le M\in (0,\infty )\) for all \(n\in { {\mathbb {N}} }\). Define for all \(t\in [0,1]\)

$$\begin{aligned}&W_n(t)=\Bigl ( \frac{n}{n_1n_2} \Bigr )^{1/2} \sum _{i=1}^{[nt]}w_{n,i}{{\widetilde{\delta }}}_{n,i} \Bigl ( c^\pi _{n,i} - \frac{\sum _{j=i}^nc_{n,j}^\pi }{n-i+1} \Bigr ),\\&V_n(t)=\frac{n}{n_1n_2} \sum _{i=1}^{[nt]}w_{n,i}^2{{\widetilde{\delta }}}_{n,i} \Biggl [ \frac{\sum _{j=i}^n(c_{n,j}^\pi )^2}{n-i+1}- \left( \frac{\sum _{j=i}^nc_{n,j}^\pi }{n-i+1} \right) ^2 \Biggr ],\\&\beta ^2_n(t)=\frac{1}{n-1}\sum _{i=1}^{[nt]}w_{n,i}^2{{\widetilde{\delta }}}_{n,i} \frac{n-i}{n-i+1},\\&\alpha _n(t)=\inf \{s\in [0,1]: \beta _n^2(s)>t \beta _n^2(1)\}, \end{aligned}$$

where for the latter we set \(\inf \emptyset = 1\). Then \(V_n(\alpha _n(t))/V_n(1)\) converges in probability to t for all \(t\in [0,1]\) and \(V_n(1)^{-1/2}W_n\circ \alpha _n\) tends in distribution to a Brownian motion B on the Skorohod space D[0, 1]. Moreover, the sequences \((V_n(1))_{n\in { {\mathbb {N}} }}\) and \((W_n(1))_{n\in { {\mathbb {N}} }}\) are tight, i.e., we have \(\lim _{t\rightarrow \infty }\limsup _{n\rightarrow \infty } P ( |W_n(1)| \ge t)+ P ( |V_n(1)| \ge t)=0\).

Proof

It is easy to check that due to the assumptions on \(w_n\) and \({{\widetilde{\delta }}}_n\) we have \(0<\liminf _{n\rightarrow \infty }\beta _n^2(1)\le \limsup _{n\rightarrow \infty }\beta _n^2(1)<\infty \) and, thus, condition (15) of Janssen and Mayer (2001) is fulfilled. Moreover, the conditions for the regression coefficients in the paper of Janssen and Mayer (2001) hold for our (rescaled) coefficients \({{\widetilde{c}}}_{(i)}= (n_1n_2/n)^{-1/2} c_{(i)}\). Consequently, we can apply their Theorem 1 and Lemma 3. Note that \(\beta _n^2(1)=1\) is assumed in their Lemma 3 as well as in the proof of their Theorem 1, but this can always be ensured by rescaling the weight coefficients \({{\widetilde{w}}}_{n,i}= w_{n,i}/\beta _n(1)\). Now, the desired distributional convergence of \(V_n(1)^{-1/2}W_n\circ \alpha _n\) follows from their Theorem 1. In the proof of Lemma 3 Janssen and Mayer (2001) showed \(\beta _n^2(\alpha _n(t))/\beta _n^2(1)\rightarrow t\) for all \(t\in [0,1]\) and by their (33) we have \([1/\beta _n^{2}(1)]\sup _{t\in [0,1]}|\beta _n^2(t)-V_n(t)|\rightarrow 0\) in probability. Combining both we obtain the convergence in probability of \(V_n(\alpha _n(t))/V_n(1)\). Moreover, their Theorem 1 implies that \(W_n(1)V_n(1)^{-1/2}\) converges in distribution to a standard normal distributed random variable. Finally, the tightness of \((V_n(1))_{n\in { {\mathbb {N}} }}\) and \((W_n(1))_{n\in { {\mathbb {N}} }}\), respectively, follows from \(\limsup _{n\rightarrow \infty }\beta _n^2(1)<\infty \). \(\square \)

The subsequent result concerning linear rank tests is well-known and can be found, for example, in the book of Hájek et al. (1999). Combining their Theorem 3 in Section 3.3.1 and Chebyshev’s inequality we obtain:

Proposition 3

[cf. Hájek et al. (1999)] Let \(w_{n,1},\ldots ,w_{n,n}\) be real-valued constants. Introduce the linear rank statistic \(\xi _n^\pi =\sum _{i=1}^{n}w_{n,i}(c_{n,i}^\pi -{{\bar{c}}})\), where \(\bar{c}=n^{-1}\sum _{i=1}^nc_{n,i}^\pi =n_2/n\). Then \({ E }(\xi _n^\pi )=0\) and

$$\begin{aligned} \text {Var} (\xi _n^\pi ) = \frac{n_1n_2}{n(n-1)}\sum _{i=1}^{n} (w_{n,i}-{\bar{w}})^2,\quad {{\bar{w}}} = n^{-1}\sum _{i=1}^nw_{n,i}. \end{aligned}$$

In particular, if \( \max _{1\le i \le n}|w_{n,i}|\le M/n\) for all \(n\in { {\mathbb {N}} }\) and some fixed \(M>0\) then \( \text {Var} (\xi _n^\pi )\rightarrow 0\) and \(\xi _n^\pi \) converges in probability to 0.
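
The variance formula of Proposition 3 can be checked exactly for small samples. The following Python sketch is not part of the original paper: it enumerates all placements of the \(n_2\) ones in the group vector and compares the exact mean and variance of \(\xi _n^\pi \) with the stated formula. The function names and the weight vector are chosen purely for illustration.

```python
from fractions import Fraction
from itertools import combinations


def exact_permutation_moments(w, n2):
    """Exact mean and variance of xi = sum_i w_i * (c_i - n2/n) when the n2
    ones of the 0/1 group vector c are placed uniformly at random."""
    n = len(w)
    c_bar = Fraction(n2, n)
    values = []
    for ones in combinations(range(n), n2):
        ones = set(ones)
        xi = sum(w[i] * ((1 if i in ones else 0) - c_bar) for i in range(n))
        values.append(xi)
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


def formula_variance(w, n2):
    """Variance formula of Proposition 3."""
    n = len(w)
    n1 = n - n2
    w_bar = sum(w) / n
    return Fraction(n1 * n2, n * (n - 1)) * sum((wi - w_bar) ** 2 for wi in w)


# Illustrative weights (hypothetical), n = 6 observations, n2 = 2 in group 2.
w = [Fraction(k) for k in (3, -1, 4, 1, -5, 9)]
mean, var = exact_permutation_moments(w, n2=2)
print(mean, var, formula_variance(w, n2=2))
```

Exact rational arithmetic is used so that the comparison is an identity rather than a floating-point approximation.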

Proof of Theorem 1

The following lemma and the corresponding proof are extensions of Lemma 5.3 and its proof of Janssen and Werft (2004).

Lemma 1

Suppose that \(A_2=\vartheta A_1\) for some \(\vartheta >0\). Then \((\xi _{n,i})_{1\le i \le n}\) given by

$$\begin{aligned} \xi _{n,i}= \Bigl ( \frac{n}{n_1n_2} \Bigr )^{1/2} \delta _{(i)} \Bigl ( c_{(i)} - \frac{\vartheta \sum _{m=i}^nc_{(m)}}{\sum _{m=i}^n(1-c_{(m)}) + \vartheta \sum _{m=i}^nc_{(m)}}\Bigr ) \end{aligned}$$

is a martingale difference scheme with respect to the filtration \(({\mathscr {F}}_{n,i})_{0\le i \le n}\) defined in (6). Moreover, the predictable quadratic variation process \(\langle M_n\rangle \) of the martingale \(M_n\) given by \(M_n(t)=S_n(t)=\sum _{i=1}^{[nt]}\xi _{n,i}\) \((t\in [0,1])\) equals \({{\widehat{\sigma }}}^2\) from (5).

Proof

Fix \(1\le i\le n\). First, observe that \(\sum _{m=i}^nc_{(m)}\) and \(\delta _{(i)}\) are predictable, i.e., \(\mathscr {F}_{n,i-1}\)-measurable. Since \(c_{(i)}\) equals either 0 or 1 it is easy to see that all postulated statements follow from

$$\begin{aligned} { E }( \delta _{(i)} c_{(i)} \vert {\mathscr {F}}_{n,i-1}) = \delta _{(i)}\frac{ \vartheta \sum _{m=i}^nc_{(m)}}{\sum _{m=i}^n(1-c_{(m)}) + \vartheta \sum _{m=i}^nc_{(m)}}. \end{aligned}$$
(10)

Clearly, (10) is true in the case \(\delta _{(i)}=0\). Hence, it is sufficient for the proof of (10) to consider events \(A\in {\mathscr {F}}_{n,i-1}\) of the form \(A= \{ \delta _{(i)}=1,\,\delta _{(m)}= \delta _m, \,d_{(m)}= d_m;\;m\le i-1\}\) with constants \( \delta _1, \ldots ,\delta _{i-1}\in \{0,1\}\) and pairwise distinct \(d_1,\ldots ,d_{i-1}\in \{(j,k): 1\le k \le n_j;\,j=1,2\}\). Introduce \(Z_{j,k}={\mathbf {1}}\{d_{(i)}=(j,k)\}\) \((1\le k \le n_j;\, j=1,2)\). Clearly, \(1-c_{(i)}= \sum _{k=1}^{n_1} Z_{1,k}\) and \(c_{(i)}= \sum _{k=1}^{n_2} Z_{2,k}\). Therefore, we analyse \(Z_{j,k}\) in the next step. For this purpose we define the set \(B_x=\bigl \{ X_{ d_{1}}<\cdots<X_{d_{i-1}}< x\,\, ,\,\, \delta _{d_{(m)}} = \delta _{m}\,\,\text {for}\,\,m\le i-1\,\bigr \}\quad (x>0)\). From Fubini’s theorem and \(\mathrm dF_j / \mathrm dA_j = 1- F_j\) we obtain for all \((j,k)\in J=\{(j,k): 1\le k \le n_j;\,j=1,2\}{\setminus } \{ d_1,\ldots , d_{i-1}\}\) that

$$\begin{aligned} P \left( \left\{ Z_{j,k} = 1 \right\} \cap A \right)&= \int _0^\infty P ( B_x) [1-G_j(x)] \prod _{(r,s)\in J{\setminus } \{(j,k)\} } [1-F_r(x)] [1-G_r(x)]\,\mathrm dF_j\nonumber \\&= \int _0^\infty P ( B_x)\;\prod _{(r,s)\in J } [1-F_r(x)] [1-G_r(x)] \;\mathrm dA_j(x) \nonumber \\&= C^* \left( \, \vartheta + \left( 1 - \vartheta \right) {\mathbf {1}}\{ j = 1 \} \,\right) \end{aligned}$$
(11)

for some \(C^*\ge 0\). Obviously, \( P ( \{ Z_{j,k} = 1 \} \cap A )=0\) for \((j,k)\notin J\). Recall that \(n_1-\sum _{m=1}^{i-1}(1-c_{d_m})\) and \(n_2-\sum _{m=1}^{i-1}c_{d_m}\) elements of J belong to the first and second group, respectively. Thus, we can deduce from summing up all probabilities \( P ( \{ Z_{j,k} = 1 \} \cap A )\) that

$$\begin{aligned} C^*= P (A) / \Bigl [\vartheta \Bigl (n_2-\sum _{m=1}^{i-1}c_{d_m}\Bigr )+ n_1-\sum _{m=1}^{i-1}(1-c_{d_m})\Bigr ]. \end{aligned}$$

Finally, inserting \(C^*\) into (11) and recalling \(c_{(i)}= \sum _{k=1}^{n_2} Z_{2,k}\) proves (10) since \(n_2-\sum _{m=1}^{i-1}c_{d_m}\) equals \(\sum _{m=i}^nc_{(m)}\) under A. \(\square \)
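
As a plausibility check of (10), consider the simplest case \(i=1\) without censoring, so that all \(n_1+n_2\) individuals are at risk and the right-hand side reduces to \(\vartheta n_2/(n_1+\vartheta n_2)\). The following Monte Carlo sketch is not from the paper; it assumes exponential lifetimes with hazard 1 in the first group and hazard \(\vartheta \) in the second group, and all names, sample sizes and the seed are illustrative choices.

```python
import random


def first_event_from_group2(n1, n2, theta, rng):
    """Draw uncensored exponential lifetimes (hazard 1 in group 1, hazard
    theta in group 2) and report whether the earliest one is in group 2."""
    t1 = min(rng.expovariate(1.0) for _ in range(n1))
    t2 = min(rng.expovariate(theta) for _ in range(n2))
    return t2 < t1


def estimate(n1, n2, theta, reps=20000, seed=1):
    """Monte Carlo estimate of P(c_(1) = 1)."""
    rng = random.Random(seed)
    hits = sum(first_event_from_group2(n1, n2, theta, rng) for _ in range(reps))
    return hits / reps


n1, n2, theta = 4, 6, 2.0
predicted = theta * n2 / (n1 + theta * n2)  # right-hand side of (10) for i = 1
simulated = estimate(n1, n2, theta)
print(predicted, simulated)
```

The two printed numbers should agree up to Monte Carlo error, which is of order \(\text {reps}^{-1/2}\).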

It is easy to see that there is a sequence \((b_n)_{n\in { {\mathbb {N}} }}\) converging to 0 such that \(\max _{1\le i \le n}|\xi _{n,i}|\le b_n\). This implies immediately the conditional Lindeberg condition. Thus, to apply Proposition 1 it remains to show convergence of the corresponding predictable quadratic variation \({{\widehat{\sigma }}}^2\). For this purpose we prefer the following slight modification of its representation in (3)

$$\begin{aligned} {{\widehat{\sigma }}}^2(t,\vartheta ) = \int {\mathbf {1}}\Bigl \{\frac{Y}{n}\ge \frac{n-[nt]+1}{n}\Bigr \} w_{n} \,\mathrm { d }\frac{N}{n},\quad w_{n}=\frac{n^2}{n_1n_2}\frac{\vartheta Y_1Y_2}{(Y_1+\vartheta Y_2)^2}. \end{aligned}$$
(12)

Clearly, the integrand is bounded. As is well known, we can deduce from the extended Glivenko–Cantelli theorem that \(\sup \{|Y_{j}(t)/n-y_j(t)|:t\in [0,\infty )\}\rightarrow 0\) as well as \(\sup \{|w_{n}(t)-w(t)|:t\in [0,s]\}\rightarrow 0\) for every \(s<\tau \), both in probability, where \(y_j=\kappa _j(1-F_j)(1-G_j)\) and \(w=(\kappa _1\kappa _2)^{-1}\vartheta y_1y_2/( y_1 + \vartheta y_2)^2\). Set \(y=y_1+y_2\). Moreover, by the law of large numbers \(N_j(t)/n\) converges in probability to \(L_j(t)=\kappa _j P (X_{j,1}\le t, \Delta _{j,1}=1)=\kappa _j\int _{[0,t]}(1-G_j)\,\mathrm { d }F_j\) \((t\ge 0; \,j=1,2)\) and, thus, \(N(t)/n\rightarrow L(t)=L_1(t)+L_2(t)\). Since \(|w_{n}|\) is uniformly bounded and \(\mathrm dF_j / \mathrm dA_j = 1- F_j\) we obtain that \({{\widehat{\sigma }}}^2(t,\vartheta )\) converges in probability to

$$\begin{aligned} \sigma ^2(t,\vartheta )= \frac{\vartheta }{\kappa _1\kappa _2}\int {\mathbf {1}}\{y\ge 1-t\} \frac{y_1y_2}{(y_1+\vartheta y_2)^2}\,\mathrm { d }L. \end{aligned}$$

Applying Proposition 1 yields that \(S_n(\cdot ,\vartheta )\) converges in distribution to \(B\circ \sigma ^2(\cdot , \vartheta )\) on the Skorohod space D[0, 1], where B is a Brownian motion.
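
To make the objects in Lemma 1 and (12) concrete, the following Python sketch (not part of the paper) computes the paths \(S_n(i/n,\vartheta )\) and \({{\widehat{\sigma }}}^2(i/n,\vartheta )\) from the ordered group labels \(c_{(i)}\) and censoring indicators \(\delta _{(i)}\). It assumes that there are no ties and that both groups are non-empty; the function name `paths` and the toy data are hypothetical.

```python
from math import sqrt


def paths(c, delta, theta):
    """c[i] in {0, 1}: group of the i-th smallest observation (1 = group 2);
    delta[i] in {0, 1}: 1 if that observation is uncensored.
    Returns the lists S_n(i/n, theta) and sigma_hat^2(i/n, theta), i = 1..n."""
    n = len(c)
    n2 = sum(c)
    n1 = n - n2
    scale = n / (n1 * n2)
    y1, y2 = n1, n2              # numbers at risk just before the i-th time
    s, v = 0.0, 0.0
    s_path, v_path = [], []
    for ci, di in zip(c, delta):
        # conditional probability from (10) that the event is in group 2
        p = theta * y2 / (y1 + theta * y2)
        s += sqrt(scale) * di * (ci - p)   # increment xi_{n,i} of Lemma 1
        v += scale * di * p * (1.0 - p)    # theta*Y1*Y2/(Y1+theta*Y2)^2, cf. (12)
        s_path.append(s)
        v_path.append(v)
        if ci == 1:
            y2 -= 1
        else:
            y1 -= 1
    return s_path, v_path


# Toy data: four uncensored observations with alternating group labels.
s_path, v_path = paths(c=[0, 1, 0, 1], delta=[1, 1, 1, 1], theta=1.0)
print(s_path[-1], v_path[-1])
```

Since \(c_{(i)}\) is Bernoulli given the past, its conditional variance is \(p(1-p)\), which is exactly the integrand \(\vartheta Y_1Y_2/(Y_1+\vartheta Y_2)^2\) appearing in (12).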

In the next step, we plug in the estimator \({{\widehat{\vartheta }}}\) of the proportionality factor \(\vartheta \). The statement of Theorem 1 holds for general estimators \({{\widehat{\vartheta }}}\) fulfilling the subsequent Assumption I. As already mentioned in Sect. 2, by Gill (1980) \({{\widehat{\vartheta }}}_K\) with \(K=(w\circ {{\widehat{F}}})K_L\) obeys a central limit theorem and, in particular, fulfills Assumption I.

Assumption I Let \({{\widehat{\vartheta }}}\) be a positive random variable which is bounded with probability one, i.e., \( P ({{\widehat{\vartheta }}} \le \eta _\vartheta )\rightarrow 1\) for some \(\eta _\vartheta \). Suppose that \(n^{1/2}({{\widehat{\vartheta }}} - \vartheta )\) converges in distribution to a real-valued random variable \(Z_\vartheta \).

Obviously, \({{\widehat{\vartheta }}}\) is a consistent estimator for \(\vartheta \) under Assumption I, i.e., \({{\widehat{\vartheta }}}\) tends in probability to \(\vartheta \). Analogously to the convergence of \({{\widehat{\sigma }}}^2(t,\vartheta )\), we can deduce from the consistency that \({{\widehat{\sigma }}}^2(t,{{\widehat{\vartheta }}})\) converges in probability to \(\sigma ^2(t,\vartheta )\) for every \(t\in [0,1]\). It is easy to check that \(S_n(t,\vartheta ) - S_n(t,{{\widehat{\vartheta }}})\) equals \(Z_{n,\vartheta }{{\widetilde{\sigma }}}^2(t,\vartheta )\), where

$$\begin{aligned} Z_{n,\vartheta }&= \Bigl (\frac{n_1n_2}{n}\Bigr )^{1/2} \vartheta ^{-1}({{\widehat{\vartheta }}}-\vartheta ),\\ {{\widetilde{\sigma }}}^2(t,\vartheta )&=\frac{n^2}{n_1n_2}\int {\mathbf {1}}\Bigl \{\frac{Y}{n}\ge \frac{n-[nt]+1}{n}\Bigr \} \frac{\vartheta Y_1Y_2}{(Y_1+\vartheta Y_2)(Y_1+{{\widehat{\vartheta }}} Y_2)} \,\mathrm { d }\frac{N}{n}. \end{aligned}$$

Clearly, \(Z_{n,\vartheta }\) converges in distribution to \({{\widetilde{Z}}}_\vartheta = (\kappa _1\kappa _2)^{1/2} Z_\vartheta /\vartheta \). Analogously to the argumentation above, \({{\widetilde{\sigma }}}^2(t,\vartheta )\) converges in probability to \(\sigma ^2(t,\vartheta )\) for all \(t\in [0,1]\). Hence, for every \(t\in [0,1]\)

$$\begin{aligned} R_n(t,\vartheta )= S_n(t,\vartheta )-S_n(t,{{\widehat{\vartheta }}}) - \frac{{\widehat{\sigma }}^2_n(t,{{\widehat{\vartheta }}})}{{{\widehat{\sigma }}}^2(1,{{\widehat{\vartheta }}})}[S_n(1,\vartheta )-S_n(1,{{\widehat{\vartheta }}})] \rightarrow 0 \end{aligned}$$

in probability. Consequently, \((S_n(\cdot ,\vartheta ), {{\widehat{\sigma }}}^2(\cdot ,{{\widehat{\vartheta }}}), R_n(\cdot ,\vartheta ))\) converges in distribution to \((B\circ \sigma ^2(\cdot ,\vartheta ),\sigma ^2 (\cdot ,\vartheta ), 0)\) on \(D[0,1]\times D[0,1]\times D[0,1]\). By the continuous mapping theorem, respecting that \(t\mapsto \sigma ^2(t,\vartheta )\) is continuous and nondecreasing,

$$\begin{aligned} T_n&= {\widehat{\sigma }}^2_n(1,{\widehat{\vartheta }})^{-1/2}\sup _{t\in [0,1]}\Bigl \{ \Bigl | S_{n}(t,\vartheta ) - \frac{ {\widehat{\sigma }}^2_n(t,{\widehat{\vartheta }})}{{\widehat{\sigma }}^2_n(1,{\widehat{\vartheta }})}S_n(1,\vartheta ) + R_n(t,\vartheta )\Bigr | \Bigr \}\\&\rightarrow \sigma ^2(1,\vartheta )^{-1/2}\sup _{t\in [0,1]}\Bigl \{ \Bigl | B\Bigl (\sigma ^2(1,\vartheta ) t \Bigr ) - t B\Bigl ( \sigma ^2(1,\vartheta ) \Bigr ) {\Bigr |} \Bigr \}={{\widetilde{T}}}. \end{aligned}$$

Note that the assumptions ensure \(\sigma ^2(1,\vartheta )>0\). Let \(B_0\) be a Brownian bridge on [0, 1]. Since \(B(c^2\cdot )\) and cB for every \(c>0\) as well as \(t\mapsto B(t)-tB(1)\) \((t\in [0,1])\) and \(B_0\) have the same distribution, respectively, we obtain that the distribution of \({{\widetilde{T}}}\) equals the one of T.
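
The last display shows that \({{\widetilde{T}}}\) is distributed as \(\sup _{t\in [0,1]}|B_0(t)|\), whose distribution function is the classical Kolmogorov series \(1-2\sum _{k\ge 1}(-1)^{k-1}\exp (-2k^2x^2)\). Assuming that T in Theorem 1 denotes this supremum, asymptotic critical values can be obtained numerically. The following Python sketch is an illustration only and not the authors' implementation; the function names, the truncation at 100 terms and the bisection bounds are arbitrary choices.

```python
from math import exp


def kolmogorov_cdf(x, terms=100):
    """P(sup_{0<=t<=1} |B_0(t)| <= x) for a Brownian bridge B_0,
    evaluated by truncating the alternating series."""
    if x <= 0.0:
        return 0.0
    return 1.0 - 2.0 * sum((-1) ** (k - 1) * exp(-2.0 * k * k * x * x)
                           for k in range(1, terms + 1))


def kolmogorov_quantile(level, lo=0.2, hi=5.0, tol=1e-10):
    """Approximate level-quantile of the Kolmogorov distribution by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kolmogorov_cdf(mid) < level:
            lo = mid
        else:
            hi = mid
    return hi


critical_value = kolmogorov_quantile(0.95)
print(round(critical_value, 4))
```

For the level 0.95 this gives a critical value of roughly 1.358, the familiar asymptotic 5% critical value of Kolmogorov–Smirnov-type tests.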

Proof of Theorem 2

Let \({{\widehat{\vartheta }}}={{\widehat{\vartheta }}}_K\) with \(K=(w\circ {{\widehat{F}}})K_L\) for some \(w\in {\mathscr {W}}\). Following the argumentation of the convergence of \({\widehat{\sigma }}^2(t,{{\widehat{\vartheta }}})\) in the proof of Theorem 1 we obtain under \(H_1\) that in probability

$$\begin{aligned} -\log (1-{{\widehat{F}}}(x))&= -\int _0^x \log \bigl [(1-Y^{-1})^n\bigr ] \,\mathrm { d }\frac{N}{n} \nonumber \\&\rightarrow \int _0^x y^{-1}\,\mathrm { d }L\equiv -\log [1-F(x)]\quad (x<\tau ), \end{aligned}$$
(13)
$$\begin{aligned} {{\widehat{\vartheta }}}_K&\rightarrow \frac{ \int {{\widetilde{K}}} \,\mathrm { d } A_2}{\int {{\widetilde{K}}} \,\mathrm { d } A_1 } \equiv \vartheta _0>0, \quad {{\widetilde{K}}} = (w\circ S) \frac{y_1y_2}{y}, \end{aligned}$$
(14)

where \(y_1,y_2,y,L\) are defined as in the proof of Theorem 1. Similarly, we can deduce that \(n^{-1/2}T_n\) converges in probability under \(H_1\) to

$$\begin{aligned} {\sigma }^2(1,{\vartheta _0})^{-1/2}\sup _{t\in [0,1]}\Bigl \{ \Bigl | S(t,\vartheta _0) - [{\sigma }^2(t,{\vartheta _0})/{\sigma }^2(1,{\vartheta _0})]S(1,\vartheta _0) {\Bigr |}\Bigr \} \equiv {{\widetilde{T}}}, \end{aligned}$$

where

$$\begin{aligned}&S(t,\vartheta _0) = ( \kappa _1\kappa _2)^{-1/2}\int {\mathbf {1}}\{y\ge 1-t\} \frac{y_1y_2}{y_1+\vartheta _0 y_2} ( \,\mathrm { d } A_2-\vartheta _0\,\mathrm { d } A_1),\\&\sigma ^2(t,\vartheta _0)= \vartheta _0( \kappa _1\kappa _2)^{-1}\int {\mathbf {1}}\{y\ge 1-t\} \frac{y_1y_2}{(y_1+\vartheta _0 y_2)^2}(y_2\,\mathrm { d }A_2+y_1\,\mathrm { d }A_1). \end{aligned}$$

Clearly, the consistency of our test follows if \({{\widetilde{T}}}>0\). Contrary to this, suppose that \({{\widetilde{T}}} =0\). Then there is some \(\eta _0\ne 0\) such that \(S(t,\vartheta _0)=\eta _0 \sigma ^2(t,\vartheta _0)\) for all \(t\in [0,1]\). It can easily be seen that

$$\begin{aligned}&\int {\mathbf {1}}\{y\in [s,t)\} \frac{y_1y_2}{y_1+\vartheta _0 y_2} ( \,\mathrm { d } A_2-\vartheta _0\,\mathrm { d } A_1) \nonumber \\&\quad = \eta _1 \int {\mathbf {1}}\{y\in [s,t)\} \frac{y_1y_2}{(y_1+\vartheta _0 y_2)^2}(y_2\,\mathrm { d }A_2+y_1\,\mathrm { d }A_1) \end{aligned}$$
(15)

follows for every \(0\le s<t\le 1\) and some \(\eta _1\ne 0\). Assume that \(\eta _1>0\). The case \(\eta _1<0\) can be treated analogously and, thus, we leave it to the reader. Since the right-hand side of Eq. (15) is nonnegative and \((y_1+\vartheta _0 y_2){{\widetilde{K}}}/(y_1y_2)\ge 0\) we can deduce

$$\begin{aligned} \int {\mathbf {1}}\{y\in [s,t)\} {{\widetilde{K}}} ( \,\mathrm { d } A_2-\vartheta _0\,\mathrm { d } A_1)\ge 0 \end{aligned}$$
(16)

for all \(0\le s<t\le 1\). Note that due to the definition of \(\vartheta _0\), see (14), we have equality in (16) for \(s=0\) and \(t=1\). Hence, we get equality in (16) for all \(0\le s<t\le 1\). But this implies \(A_2(x)=\vartheta _0 A_1(x)\) for all \(x>0\) with \({{\widetilde{K}}}(x)>0\) or, equivalently, for all \(x\in (0,\tau )\), which contradicts \(H_1\). Hence \({{\widetilde{T}}}>0\).

Proof of Theorem 3

To distinguish between the processes depending on the original data and the permuted data, respectively, we add a superscript \(\pi \) to these processes if the permutation versions are meant, namely \({{\widehat{A}}}_j^\pi \), \(Y_j^\pi \), \(N_j^\pi \), \(K_L^\pi \), \(K^\pi \), \(S_n^\pi \), \({\widehat{\sigma }}^{2,\pi }\) \((1\le j\le 2)\). Note that the processes N and Y as well as the pooled Kaplan–Meier estimator \({{\widehat{F}}}\) do not depend on the group membership vector.

Assumption II Let \({{\widehat{\vartheta }}}={{\widehat{\vartheta }}}(c^{(n)},\delta ^{(n)})\) be an estimator fulfilling Assumption I, which depends only on the group memberships \(c^{(n)}\) and the censoring status \(\delta ^{(n)}\) of the ordered values. Let \({{\widehat{\vartheta }}}^\pi ={{\widehat{\vartheta }}}(c_n^\pi ,\delta ^{(n)})\) be the corresponding permutation version of the estimator. Moreover, suppose that a certain conditional tightness assumption is fulfilled for \((n^{1/2}({{\widehat{\vartheta }}}^\pi -1))_{n\in { {\mathbb {N}} }}\), namely that for every subsequence there is a further sub-subsequence such that along this sub-subsequence we have with probability one

$$\begin{aligned} \lim _{t\rightarrow \infty }\limsup _{n\rightarrow \infty }\, P \Bigl (n^{1/2}\bigl |{{\widehat{\vartheta }}}^\pi -1\bigr |\ge t \bigr \vert \delta ^{(n)} \Bigr )=0. \end{aligned}$$

At the end of the proof, we verify that Assumption II holds for the estimators considered in the paper.

Lemma 2

Let \(w\in {\mathscr {W}}\). Then the estimator \({{\widehat{\vartheta }}}={{\widehat{\vartheta }}}_K\) with \(K=(w\circ {{\widehat{F}}})K_L\) fulfills Assumption II.

As in the proof of Theorem 1, we obtain that \(n^{-1}\sum _{i=1}^{n}\delta _{(i)}\rightarrow \int \mathrm { d }L\ge L(\tau )>0\) in probability, where L and y were defined there. Now, we start with an arbitrary subsequence of \({ {\mathbb {N}} }\). Since we are interested in the conditional distributional convergence, we treat the censoring status \(\delta _{(1)},\ldots ,\delta _{(n)}\) as constants. Considering an appropriate sub-subsequence of the pre-chosen subsequence, we can assume that \(\lim _{n\rightarrow \infty }n^{-1}\sum _{i=1}^{n}\delta _{(i)}>0\) and that by Assumption II the sequence \((\sqrt{n}({{\widehat{\vartheta }}}^\pi -1))_{n\in { {\mathbb {N}} }}\) is tight. The latter implies, in particular, \({\widehat{\vartheta }}^\pi \rightarrow 1\) in probability. The rest of the proof follows the argumentation of the proof of Theorem 1. Nevertheless, we carry out the important steps.

Observe that \(S_n^\pi (t,1)=W_n(t,\delta ^{(n)},w_n)\) and \({\widehat{\sigma }}^{2,\pi }(t,1)=V_n(t,\delta ^{(n)},w_n)\) for \(w_n=(1,\ldots ,1)\). By Proposition 2, \({\widehat{\sigma }}^{2,\pi }(1,1)^{-1/2}S_n^\pi \circ \alpha _n\) converges in distribution to a Brownian motion B and \({\widehat{\sigma }}^{2,\pi }(\alpha _n(t),1)/{\widehat{\sigma }}^{2,\pi }(1,1)\) tends in probability to t for all \(t\in [0,1]\), where \(\alpha _n\) is defined as in Proposition 2. Moreover, \(({{\widehat{\sigma }}}^{2,\pi }(1,1))_{n\in { {\mathbb {N}} }}\) is tight. Let us now discuss what happens when we plug in the estimator \({\widehat{\vartheta }}^\pi \). First, for all \(t\in [0,1]\)

$$\begin{aligned}&\Bigl | {\widehat{\sigma }}^{2,\pi }(t,1)- {\widehat{\sigma }}^{2,\pi }(t,{{\widehat{\vartheta }}}^\pi ) \Bigr |\nonumber \\&\quad =\frac{n^2}{n_1n_2} \Bigl | \int _0^{X_{([nt])}}\, \frac{Y_1^\pi Y_2^\pi }{Y^2} ({\widehat{\vartheta }}^\pi -1)\, \frac{(Y_1^\pi )^2-{{\widehat{\vartheta }}}^\pi (Y_2^\pi )^2 }{(Y_1^\pi +{{\widehat{\vartheta }}}^\pi Y_2^\pi )^2}\,\mathrm { d }\frac{N}{n} \Bigr |\nonumber \\&\quad \le |1-{{\widehat{\vartheta }}}^\pi | \frac{n^2}{n_1n_2}\int \frac{Y_1^\pi Y_2^\pi }{Y^2} \Bigl ( 1 + \frac{1}{{{\widehat{\vartheta }}}^\pi } \Bigr )\,\mathrm { d }\frac{N}{n} = \frac{|1-{{\widehat{\vartheta }}}^\pi |(1+{{\widehat{\vartheta }}}^\pi )}{{{\widehat{\vartheta }}}^\pi } {\widehat{\sigma }}^{2,\pi }(1,1). \end{aligned}$$
(17)

Thus, \({\widehat{\sigma }}^{2,\pi }(\alpha _n(t),{\widehat{\vartheta }}^\pi )/{\widehat{\sigma }}^{2,\pi }(1,1)\) as well as \({\widehat{\sigma }}^{2,\pi }(\alpha _n(t),{\widehat{\vartheta }}^\pi )/{\widehat{\sigma }}^{2,\pi }(1,{\widehat{\vartheta }}^\pi )\) tend in probability to t for all \(t\in [0,1]\). It remains to study \(S_n^\pi (t,1)-S_n^\pi (t,{\widehat{\vartheta }}^\pi )\). As in the proof of Theorem 1, this difference equals \(Z_{n}^\pi {\widetilde{\sigma }}^{2,\pi }(t)\) \((t\in [0,1])\), where

$$\begin{aligned} Z_{n}^\pi&= \Bigl (\frac{n_1n_2}{n}\Bigr )^{1/2} ({{\widehat{\vartheta }}}^\pi -1),\quad {{\widetilde{\sigma }}}^{2,\pi }(t)\\&=\frac{n}{n_1n_2}\int {\mathbf {1}}\Bigl \{\frac{Y}{n}\ge \frac{n-[nt]+1}{n}\Bigr \} \frac{Y_1^\pi Y_2^\pi }{Y(Y_1^\pi +{{\widehat{\vartheta }}}^\pi Y_2^\pi )} \,\mathrm { d }N. \end{aligned}$$

Analogously to (17), we get \({\widetilde{\sigma }}^{2,\pi }(\alpha _n(t))/{\widehat{\sigma }}^{2,\pi }(1,1)\rightarrow t\) for all \(t\in [0,1]\). Since \((Z_n^\pi )_{n\in { {\mathbb {N}} }}\) and \(({\widehat{\sigma }}^{2,\pi }(1,1))_{n\in { {\mathbb {N}} }}\) are tight we obtain

$$\begin{aligned} R_n^\pi (t)&=S_n^\pi (\alpha _n(t),1)-S_n^\pi (\alpha _n(t),{{\widehat{\vartheta }}}^\pi ) - \frac{{\widehat{\sigma }}^{2,\pi }(\alpha _n(t),{{\widehat{\vartheta }}}^\pi )}{{{\widehat{\sigma }}}^{2,\pi }(1,{{\widehat{\vartheta }}}^\pi )}[S_n^\pi (1,1)-S_n^\pi (1,{{\widehat{\vartheta }}}^\pi )] \\&= Z_n^\pi {\widehat{\sigma }}^{2,\pi }(1,1) \Bigl ( \frac{ {{\widetilde{\sigma }}}^{2,\pi }(\alpha _n(t))}{{\widehat{\sigma }}^{2,\pi }(1,1)}-\frac{{{\widetilde{\sigma }}}^{2,\pi }(1)}{{\widehat{\sigma }}^{2,\pi }(1,1)} \frac{{{\widehat{\sigma }}}^{2,\pi }(\alpha _n(t),{{\widehat{\vartheta }}}^\pi )}{{\widehat{\sigma }}^{2,\pi }(1,{{\widehat{\vartheta }}}^\pi )} \Bigr ) \rightarrow 0 \end{aligned}$$

in probability. Finally, the statement follows from the continuous mapping theorem, compare to the proof of Theorem 1.
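The remainder \(R_n^\pi \) above has a bridge-type structure: a process minus its terminal value, rescaled by a variance ratio. As a minimal sketch, and assuming the test statistic is the supremum of the absolute value of such a bridge divided by the square root of the terminal variance (this form is inferred from the displays in this proof and is not restated in this section), the continuous functional can be evaluated on discretized paths as follows. The function name and its arguments are hypothetical.

```python
import numpy as np

def bridge_sup_statistic(S, V):
    """Evaluate sup_t |S(t) - V(t)/V(1) * S(1)| / sqrt(V(1)) on a grid.

    S : values of the process on a grid ending at t = 1 (assumed input)
    V : values of the variance estimator on the same grid, with V[-1] > 0
    """
    S = np.asarray(S, dtype=float)
    V = np.asarray(V, dtype=float)
    if S.shape != V.shape:
        raise ValueError("S and V must be evaluated on the same grid")
    if V[-1] <= 0:
        raise ValueError("the terminal variance V(1) must be positive")
    # Subtract the terminal value, weighted by the variance ratio V(t)/V(1).
    bridge = S - (V / V[-1]) * S[-1]
    return np.max(np.abs(bridge)) / np.sqrt(V[-1])
```

A path that is exactly proportional to its variance function yields the value zero, which reflects that only deviations from the rescaled terminal value enter the statistic.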

Proof (Lemma 2)

As already mentioned in the proof of Theorem 1, \({{\widehat{\vartheta }}}\) fulfills Assumption I. Now, define \(w_{n,i}=w( {{\widehat{F}}}(X_{(i)}))\) and set \(w_n=(w_{n,1},\ldots ,w_{n,n})\). Analogously to the proof of Theorem 1, we obtain that in probability

$$\begin{aligned}&\frac{1}{n}\sum _{i=1}^n w_{n,i} \delta _{(i)} = \int w({\widehat{F}}(t))^2 \,\mathrm { d }\frac{N}{n}(t) \rightarrow \int w(F(t))^2\,\mathrm { d }L(t)>0, \end{aligned}$$
(18)
$$\begin{aligned}&\frac{1}{n}\sum _{i=1}^{[n_2/2]} w_{n,i} \delta _{(i)} \rightarrow \int {\mathbf {1}}\{y\ge 1- \kappa _2/2\} w(F(t))^2\,\mathrm { d }L>0, \end{aligned}$$
(19)

where y and L are defined in the proof of Theorem 1 and F is given by (13). It is well known that for every subsequence of \({ {\mathbb {N}} }\) there exists a further sub-subsequence such that the convergences in (18) and (19) hold with probability one along this sub-subsequence. Since we are interested in the conditional probability given \(\delta ^{(n)}\), we can treat the censoring statuses \(\delta _{(1)},\ldots ,\delta _{(n)}\) as constants and, hence, \(w_{n,1},\ldots ,w_{n,n}\) as constants with \(\lim _{n\rightarrow \infty }n^{-1}\sum _{i=1}^nw_{n,i}\delta _{(i)}\in (0,\infty )\) as well as \(\lim _{n\rightarrow \infty }n^{-1}\sum _{i=1}^{[n_2/2]} w_{n,i} \delta _{(i)}\in (0,\infty )\) along an appropriate subsequence. Thus, we can apply Proposition 2 to the numerator of

$$\begin{aligned} n^{1/2}({{\widehat{\vartheta }}}^\pi -1)&= \frac{ n^{-1/2}\int (w\circ {{\widehat{F}}}) (Y_1^\pi Y_2^\pi /Y^\pi ) \,(\mathrm { d }{{\widehat{A}}}_2^\pi -\mathrm { d }{{\widehat{A}}}_1^\pi )}{\int K^\pi \,\mathrm { d }{{\widehat{A}}}_1^\pi }\\&=\Bigl ( \frac{n_1n_2}{n^2} \Bigr )^{1/2}\frac{ W_n(1, \delta ^{(n)},w_n)}{\int K^\pi \,\mathrm { d }{{\widehat{A}}}_1^\pi }. \end{aligned}$$

Note that we always have \(Y^\pi _2(X_{(i)})/Y(X_{(i)})\ge (n_2-1)/(2n)\) for \(i=1,\ldots ,[n_2/2]\). Hence, we can bound the denominator \(D_n\) from below as follows:

$$\begin{aligned} \int K^\pi \,\mathrm { d }{{\widehat{A}}}_1^\pi&\ge \frac{n_2-1}{2n} \frac{1}{n} \sum _{i=1}^{[n_2/2]} w_{n,i}\delta _{(i)} (1-c_{n,i}^\pi )\\&=\frac{n_2-1}{2n}\frac{n_1}{n}\frac{1}{n} \sum _{i=1}^{[n_2/2]} ( w_{n,i}\delta _{(i)} ) - \frac{n_2-1}{2n}\sum _{i=1}^{[n_2/2]} \frac{w_{n,i}\delta _{(i)}}{n} (c_{n,i}^\pi - {{\bar{c}}} ), \end{aligned}$$

where \({{\bar{c}}} = n^{-1}\sum _{i=1}^n c_{(i)}=n_2/n\). Applying Proposition 3 for \({{\widetilde{w}}}_{n,i}= (w_{n,i}\delta _{(i)}/n) {\mathbf {1}}\{ i \le [n_2/2]\}\) yields that the second summand converges in probability to 0. The first summand converges to \(M>0\), say. To sum up, \( P (D_n\ge M/2)\rightarrow 1\). Combining this and the tightness result for the numerator according to Proposition 2 gives us the desired tightness of \((n^{1/2}({{\widehat{\vartheta }}}^\pi -1))_{n\in { {\mathbb {N}} }}\). \(\square \)
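For the equality in the lower bound of \(D_n\) above, the following one-line decomposition may be helpful. It only uses \({{\bar{c}}}=n_2/n\) and \(n_1+n_2=n\):

$$\begin{aligned} 1-c_{n,i}^\pi = (1-{{\bar{c}}}) - (c_{n,i}^\pi -{{\bar{c}}}) = \frac{n_1}{n} - (c_{n,i}^\pi -{{\bar{c}}}). \end{aligned}$$

Multiplying by \(w_{n,i}\delta _{(i)}/n\) and summing over \(i=1,\ldots ,[n_2/2]\) separates a deterministic leading term from a centered permutation term, and the latter is the term treated by Proposition 3.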

Proof of Theorems 4 and 5

For simplicity we restrict ourselves to estimators of the form \({{\widehat{\vartheta }}}={{\widehat{\vartheta }}}_K\) with \(K=(w\circ {{\widehat{F}}})K_L\) for \(w\in {\mathscr {W}}\). We give the proof of Theorems 4 and 5 simultaneously. First, we verify that (7) holds for some real-valued random variable \(T_0\), say, under \(H_0\) as well as under \(H_1\), where the distribution of \( T_0\) depends on the underlying distributions \(F_1,F_2,G_1,G_2\). From this convergence we can immediately deduce the bootstrap test’s consistency (i.e., Theorem 5) since we already know that \(T_n\) converges in probability to \(\infty \) under \(H_1\), compare to Theorem 7 of Janssen and Pauls (2003). In the second step, we show that \(T_0\) has the same distribution as T under \(H_0\) and, thus, Theorem 4 follows.

Recall from the proof of Theorem 1 that \({{\widehat{\sigma }}}^2(t,{{\widehat{\vartheta }}})\rightarrow \sigma ^2(t,\vartheta )\) (\(t\in [0,1]\)), \(\sup \{|Y_{j}(x)/n-y_j(x)|: x\in [0,\infty )\}\rightarrow 0\) \((j=1,2)\) and \(N_j(s)/n\rightarrow L_j(s)\) (\( s\ge 0;\,j=1,2\)), all in probability under \(H_0\). It is easy to check that the same is valid under \(H_1\). Restricting to t and s coming from dense subsets of [0, 1] and \([0,\infty )\), respectively, for every subsequence we can construct a further sub-subsequence such that all the convergences mentioned above hold simultaneously with probability one under \(H_0\) as well as under \(H_1\). Due to the monotonicity and the continuity of the limits, the convergences carry over to the whole intervals [0, 1] and \([0,\infty )\), respectively. By (14), \({\widehat{\vartheta }}\) converges in probability to some \(\vartheta >0\) under \(H_0\) as well as under \(H_1\), where \(\vartheta \) equals the proportionality factor under \(H_0\). Since we are interested in the conditional distribution of \(T_n^G\) given the whole data, we can treat \({{\widehat{\sigma }}}^2(\cdot ,{{\widehat{\vartheta }}})\), \(Y_j\), \(N_j\) as nonrandom functions and \({{\widehat{\vartheta }}}\) as a constant. Starting with an arbitrary subsequence, we can always construct a further sub-subsequence, compare to the explanation above, such that along this sub-subsequence \({{\widehat{\sigma }}}^2(t,{{\widehat{\vartheta }}})\rightarrow \sigma ^2(t,\vartheta )\) (\(t\in [0,1]\)), \(\sup \{|Y_{j}(x)/n-y_j(x)|:x \in [0,\infty )\}\rightarrow 0\) \((j=1,2)\) and \(N_j(s)/n\rightarrow L_j(s)\) (\(s\ge 0\)) as well as \({{\widehat{\vartheta }}}\rightarrow \vartheta \). All following considerations are along this sub-subsequence.

Let \(G_{(i)}\) be the multiplier corresponding to \(X_{(i)}\). Clearly, \(G_{(1)},\ldots ,G_{(n)}\) are (given the data) still independent and identically distributed with the same distribution as \(G_{1,1}\). We can now rewrite the statistic \(S_n^G\) as a linear rank statistic:

$$\begin{aligned} S_n^G(t,{{\widehat{\vartheta }}})= \sum _{i=1}^{[nt]} \Bigl ( \frac{n}{n_1n_2} \Bigr )^{1/2}G_{(i)}\delta _{(i)} \Bigl ( c_{(i)} - \frac{{{\widehat{\vartheta }}} \sum _{m=i}^nc_{(m)}}{\sum _{m=i}^n(1-c_{(m)}) + {{\widehat{\vartheta }}} \sum _{m=i}^nc_{(m)}} \Bigr ). \end{aligned}$$
(20)
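The rank representation (20) translates directly into array operations. The following is a minimal sketch, not the authors' implementation: it assumes that the inputs are already sorted along the pooled order statistics \(X_{(1)}\le \ldots \le X_{(n)}\), that there are no ties, and that \(c_{(i)}=1\) marks membership in the second group. The function name and the argument layout are hypothetical.

```python
import numpy as np

def wild_bootstrap_path(delta, c, G, theta_hat):
    """Return S_n^G(i/n, theta_hat) for i = 0, ..., n, following (20).

    delta     : censoring statuses delta_(i) in {0, 1}
    c         : group indicators c_(i) in {0, 1} (1 = second group, assumed)
    G         : multipliers G_(i), attached to the ordered observations
    theta_hat : estimate of the proportionality factor, assumed positive
    """
    delta = np.asarray(delta, dtype=float)
    c = np.asarray(c, dtype=float)
    G = np.asarray(G, dtype=float)
    n = c.size
    n2 = c.sum()
    n1 = n - n2
    # Reverse cumulative sums give sum_{m >= i} c_(m) and sum_{m >= i} (1 - c_(m)).
    Y2 = np.cumsum(c[::-1])[::-1]
    Y1 = np.cumsum((1.0 - c)[::-1])[::-1]
    score = c - theta_hat * Y2 / (Y1 + theta_hat * Y2)
    xi = np.sqrt(n / (n1 * n2)) * G * delta * score  # summands of (20)
    # Prepend 0 for t = 0; entry i is the partial sum over the first i summands.
    return np.concatenate(([0.0], np.cumsum(xi)))
```

Drawing a fresh multiplier vector and calling the function again produces one further bootstrap replicate of the path, while `delta`, `c` and `theta_hat` stay fixed, in line with the conditioning on the data used in the proof.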

To obtain the asymptotic behavior of the process \(t\mapsto S_n^G(t,{{\widehat{\vartheta }}})\) (\(t\in [0,1]\)) we apply Proposition 1. In contrast to the two previous proofs, we have already replaced \(\vartheta \) by \({{\widehat{\vartheta }}}\) here. That is why we do not need to discuss the difference \(S_n^G(t,\vartheta )-S_n^G(t,{{\widehat{\vartheta }}})\) at the end of the proof. Let us now introduce the natural filtration \(\mathscr {G}_{n,i}=\sigma (G_{(1)},\ldots ,G_{(i)})\) (\(0\le i \le n\)). Denoting the summands in (20) by \(\xi _{n,i}\) \((1\le i \le n)\), we clearly have \({ E }(\xi _{n,i}\vert \mathscr {G}_{n,i-1})={ E }(\xi _{n,i})=0\). To verify the conditional Lindeberg condition, first note that there is a constant \(\eta >0\) independent of i and n such that \(|\xi _{n,i}|\le \eta n^{-1/2}|G_{(i)}|\). Hence, it is sufficient to show for all \(\varepsilon >0\) that

$$\begin{aligned} \sum _{i=1}^n { E }( n^{-1}G_{(i)}^2 {\mathbf {1}}\{|G_{(i)}|\ge n^{1/2} \varepsilon \eta ^{-1}\}) = { E }( G_{(1)}^2 {\mathbf {1}}\{|G_{(1)}|\ge n^{1/2} \varepsilon \eta ^{-1}\}) \end{aligned}$$

tends to 0. This follows immediately from Lebesgue’s dominated convergence theorem and \({ E }(G_{(1)}^2)=1\). In the final step we discuss the asymptotic behavior of the predictable quadratic variation process \(t\mapsto \sigma _n^{2,G}(t,{{\widehat{\vartheta }}})\) (\(t\in [0,1]\)) given by

$$\begin{aligned} \sigma _n^{2,G}(t,{{\widehat{\vartheta }}})&=\sum _{i=1}^{[nt]}{ E }(\xi _{n,i}^2\vert {\mathscr {G}}_{n,i-1}) \\&= \frac{n}{n_1n_2} \sum _{i=1}^{[nt]} \delta _{(i)}\Bigl ( c_{(i)} - \frac{{{\widehat{\vartheta }}} \sum _{m=i}^nc_{(m)}}{\sum _{m=i}^n(1-c_{(m)}) + {{\widehat{\vartheta }}} \sum _{m=i}^nc_{(m)}} \Bigr )^2. \end{aligned}$$
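Because the multipliers are independent with mean 0 and variance 1, the sum above is exactly the conditional variance of the partial sums in (20). A small numerical check can make this tangible. The sketch below assumes Rademacher multipliers (one admissible choice with mean 0 and variance 1, picked here only because all sign vectors can be enumerated), inputs sorted along the pooled order statistics, and \(c_{(i)}=1\) for the second group. All function names are hypothetical.

```python
import itertools
import numpy as np

def rank_coefficients(delta, c, theta_hat):
    """Coefficients a_i with xi_{n,i} = a_i * G_(i), read off from (20)."""
    delta = np.asarray(delta, dtype=float)
    c = np.asarray(c, dtype=float)
    n = c.size
    n2 = c.sum()
    n1 = n - n2
    Y2 = np.cumsum(c[::-1])[::-1]          # sum_{m >= i} c_(m)
    Y1 = np.cumsum((1.0 - c)[::-1])[::-1]  # sum_{m >= i} (1 - c_(m))
    score = c - theta_hat * Y2 / (Y1 + theta_hat * Y2)
    return np.sqrt(n / (n1 * n2)) * delta * score

def quadratic_variation(delta, c, theta_hat):
    """sigma_n^{2,G}(i/n, theta_hat) for i = 0, ..., n."""
    a = rank_coefficients(delta, c, theta_hat)
    return np.concatenate(([0.0], np.cumsum(a ** 2)))

def exact_rademacher_variance(delta, c, theta_hat):
    """Variance of S_n^G(1, theta_hat) over all 2^n sign vectors (small n only)."""
    a = rank_coefficients(delta, c, theta_hat)
    signs = np.array(list(itertools.product([-1.0, 1.0], repeat=a.size)))
    totals = signs @ a
    return np.mean(totals ** 2)  # the mean of the totals is 0 by symmetry
```

For a toy sample such as `delta = [1, 0, 1, 1, 1, 0]` and `c = [0, 1, 1, 0, 1, 0]`, the last entry of `quadratic_variation` and the value of `exact_rademacher_variance` agree up to floating-point error, since the cross terms cancel when averaging over all sign vectors.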

Rewriting \(\sigma _n^{2,G}(t,{{\widehat{\vartheta }}})\) in terms of the (nonrandom) functions \(Y_j\), \(N_j\) we obtain for \(t\in [0,1]\)

$$\begin{aligned}&\sigma _n^{2,G}(t,{{\widehat{\vartheta }}})\\&\quad = \frac{n}{n_1n_2} \Bigl ( \int _0^{X_{([nt])}} \Bigl ( 1 - 2 {{\widehat{\vartheta }}}\frac{Y_2}{Y_1+{{\widehat{\vartheta }}} Y_2} \Bigr ) \,\mathrm { d }N_2 + \int _0^{X_{([nt])}} \frac{ {{\widehat{\vartheta }}}^2 Y_2^2}{(Y_1+{{\widehat{\vartheta }}} Y_2)^2} \,\mathrm { d }N \Bigr )\\&\quad =\frac{n^2}{n_1n_2} \Bigl ( \int _{\{ \frac{Y}{n}\ge \frac{n-[nt]+1}{n} \}} \frac{Y_1^2}{(Y_1+{{\widehat{\vartheta }}} Y_2)^2} \,\mathrm { d }\frac{N_2}{n} + \int _{\{ \frac{Y}{n}\ge \frac{n-[nt]+1}{n} \}} \frac{{{\widehat{\vartheta }}}^2 Y_2^2}{(Y_1+{{\widehat{\vartheta }}} Y_2)^2} \,\mathrm { d }\frac{N_1}{n}\Bigr )\\&\qquad \rightarrow \frac{1}{\kappa _1\kappa _2}\Bigl ( \int _{\{y\ge 1-t\}} \frac{y_1^2}{(y_1+\vartheta y_2)^2} \,\mathrm { d }L_2 + \int _{\{y\ge 1-t\}} \frac{\vartheta ^2 y_2^2}{(y_1+\vartheta y_2)^2} \,\mathrm { d }L_1 \Bigr ) \equiv \sigma ^{2,G}(t,\vartheta ). \end{aligned}$$

Applying Proposition 1 yields distributional convergence of \(S_n^G(\cdot ,{{\widehat{\vartheta }}})\) to the rescaled Brownian motion \(B(\sigma ^{2,G}(\cdot ,\vartheta ))\) on the Skorohod space D[0, 1]. Similarly to the proofs of Theorems 1 and 3, we can conclude from the continuous mapping theorem that

$$\begin{aligned} T_n^G \rightarrow \sigma ^2(1,\vartheta )^{-1/2}\sup _{t\in [0,1]}\Bigl | B(\sigma ^{2,G}(t,\vartheta )) - \frac{\sigma ^{2}(t,\vartheta )}{\sigma ^{2}(1,\vartheta )}B(\sigma ^{2,G}(1,\vartheta )) \Bigr | \equiv T_0. \end{aligned}$$

Finally, it remains to show \(\sigma ^{2,G}(t,\vartheta )= \sigma ^{2}(t,\vartheta )\) under \(H_0\). By inserting \(L_j(t)=\int y_j(1+(\vartheta -1){\mathbf {1}}\{j=2\})\,\mathrm { d }A_1\) in the formulas of \(\sigma ^{2,G}(t,\vartheta )\) and \(\sigma ^{2}(t,\vartheta )\) the equality follows immediately.
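For the bootstrap side of this identity, the substitution can be sketched as follows. The sketch assumes that the first integral in the limit \(\sigma ^{2,G}(t,\vartheta )\) is taken with respect to \(L_2\), the limit of \(N_2/n\), and uses \(\mathrm { d }L_1=y_1\,\mathrm { d }A_1\) and \(\mathrm { d }L_2=\vartheta y_2\,\mathrm { d }A_1\) under \(H_0\):

$$\begin{aligned} \sigma ^{2,G}(t,\vartheta )&= \frac{1}{\kappa _1\kappa _2} \int _{\{y\ge 1-t\}} \frac{\vartheta y_1^2 y_2 + \vartheta ^2 y_1 y_2^2}{(y_1+\vartheta y_2)^2} \,\mathrm { d }A_1 \\&= \frac{1}{\kappa _1\kappa _2} \int _{\{y\ge 1-t\}} \frac{\vartheta y_1 y_2}{y_1+\vartheta y_2} \,\mathrm { d }A_1. \end{aligned}$$

The second line follows by factoring \(\vartheta y_1y_2(y_1+\vartheta y_2)\) out of the numerator. The right-hand side is then to be compared with the expression for \(\sigma ^{2}(t,\vartheta )\) from the proof of Theorem 1, which is not reproduced in this section.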

Cite this article

Ditzhaus, M., Janssen, A. Bootstrap and permutation rank tests for proportional hazards under right censoring. Lifetime Data Anal 26, 493–517 (2020). https://doi.org/10.1007/s10985-019-09487-9

Keywords

  • Wild bootstrap
  • Permutation
  • Logrank test
  • Right censoring
  • Proportional hazards
  • Cox model