Nonparametric statistical inference for P(X < Y < Z)

Sankhya A

Abstract

Let X, Y and Z be three independent random variables from three different populations. The stress-strength model P(X < Y < Z), which equals the volume under the three-class ROC surface, has extensive applications because it provides a global measure of the differences among populations. In this paper, we propose two methods of statistical inference for P(X < Y < Z), the nonparametric normal approximation and the jackknife empirical likelihood (JEL), since the usual empirical likelihood method for U-statistics is too complicated to apply. Simulation studies indicate that both methods perform well compared with existing methods. Some classical and real data sets are analyzed using the two proposed methods. In practice, the nonparametric normal approximation is preferable for its simplicity; for better statistical results, the JEL method is recommended, although it is more complex than the normal approximation.
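As a rough illustration of the quantity studied, the natural three-sample U-statistic estimate of P(X < Y < Z) is the proportion of triples (x_i, y_j, z_k) that satisfy x_i < y_j < z_k. The sketch below is not taken from the paper; the function name `vus_estimate` is hypothetical, and ties are simply counted as failures of the strict inequality.

```python
from bisect import bisect_left, bisect_right


def vus_estimate(x, y, z):
    """Proportion of triples (x_i, y_j, z_k) with x_i < y_j < z_k.

    Hypothetical helper: a plain empirical estimate of P(X < Y < Z).
    Sorting x and z once lets each y_j be handled with two binary searches.
    """
    xs, zs = sorted(x), sorted(z)
    total = 0
    for yj in y:
        n_below = bisect_left(xs, yj)              # number of x_i strictly below y_j
        n_above = len(zs) - bisect_right(zs, yj)   # number of z_k strictly above y_j
        total += n_below * n_above
    return total / (len(x) * len(y) * len(z))
```

Counting per y_j avoids the explicit triple loop, so the cost is driven by the two sorts rather than by the product of the three sample sizes.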


References

  • Andrews, D.F. and Herzberg, A.M. (1985). Data: a collection of problems from many fields for the student and research worker. Springer, New York.

  • Arvesen, J.N. (1969). Jackknifing U-statistics. Ann. Math. Statist., 40, 2076–2100.

  • Bamber, D. (1975). The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. J. Math. Psych., 4, 941–953.

  • Chandra, S. and Owen, D.B. (1975). On estimating the reliability of a component subject to several different stresses (strengths). Nav. Res. Logist. Q., 22, 31–39.

  • Dreiseitl, S., Ohno-Machado, L. and Binder, M. (2000). Comparing three-class diagnostic tests by three-way ROC analysis. Med. Decis. Mak., 20, 323–331.

  • Dutta, K. and Srivastava, G.L. (1987). An n-standby system with P(X < Y < Z). IAPQ Trans., 12, 95–97.

  • Hoeffding, W. (1948). A class of statistics with asymptotically normal distribution. Ann. Math. Statist., 19, 293–325.

  • Ivshin, V.V. (1998). On the estimation of the probabilities of a double linear inequality in the case of uniform and two-parameter exponential distributions. J. Math. Sci., 88, 819–827.

  • Jing, B.Y., Yuan, J.Q. and Zhou, W. (2009). Jackknife empirical likelihood. J. Amer. Statist. Assoc., 104, 1224–1232.

  • Koepsell, T.D., Chi, Y.Y., Zhou, X.H., Lee, W.W., Ramos, E.M. and Kukull, W.A. (2007). An alternative method for estimating efficacy of the AN1792 Vaccine for Alzheimer’s Disease. Neurology, 69, 1868–1872.

  • Koroljuk, V.S. and Borovskich, Yu.V. (1994). Theory of U-statistics, mathematics and its applications, vol. 273. Kluwer Academic Publishers, Dordrecht.

  • Kotz, S., Lumelskii, Y. and Pensky, M. (2003). The stress-strength model and its generalizations: theory and applications. World Scientific Publishing, New Jersey.

  • Li, J.L. and Fine, J.P. (2008). ROC analysis with multiple classes and multiple tests: methodology and its applications in microarray studies. Biostatistics, 9, 566–576.

  • Mossman, D. (1999). Three-way ROCs. Med. Decis. Mak., 19, 78–89.

  • Nakas, C.T. and Yiannoutsos, C.T. (2004). Ordered multiple-class ROC analysis with continuous measurements. Stat. Med., 23, 3437–3449.

  • Nakas, C.T. and Alonzo, T.A. (2007). ROC graphs for assessing the ability of a diagnostic marker to detect three disease classes with an umbrella ordering. Biometrics, 63, 603–609.

  • Obuchowski, N.A., Applegate, K.E., Goske, M.J., Arheart, K.L., Myers, M.T. and Morrison, S. (2001). The differential diagnosis for multiple diseases: comparison with the binary-truth state experiment in two empirical studies. Acad. Radiol., 8, 947–954.

  • Owen, A.B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75, 237–249.

  • Owen, A.B. (1990). Empirical likelihood ratio confidence regions. Ann. Statist., 18, 90–120.

  • Quenouille, M.H. (1956). Notes on bias in estimation. Biometrika, 43, 353–360.

  • Reaven, G.M. and Miller, R.G. (1979). An attempt to define the nature of chemical diabetes using a multidimensional analysis. Diabetologia, 16, 17–24.

  • Sampat, M.P., Patel, A.C., Wang, Y., Gupta, S., Kan, C.W., Bovik, A.C. and Markey, M.K. (2009). Indexes for three-class classification performance assessment: an empirical comparison. IEEE Trans. Inf. Technol. Biomed., 13, 300–312.

  • Sen, P.K. (1960). On some convergence properties of U-statistics. Calcutta Statist. Assoc. Bull., 10, 1–18.

  • Shi, X. (1984). The approximate independence of jackknife pseudo-values and the bootstrap methods. J. Wuhan Univ. Hydra-Electric Eng., 2, 83–90.

  • Tukey, J.W. (1958). Bias and confidence in not-quite large samples. Ann. Math. Statist., 29, 614.

  • Waegeman, W., De Baets, B. and Boullart, L. (2008). On the scalability of ordered multi-class ROC analysis. Comput. Statist. Data Anal., 52, 3371–3388.

  • Zhou, X.H. and Castellucio, P. (2004). Adjusting for non-ignorable verification bias in clinical studies for Alzheimer’s disease. Stat. Med., 23, 221–230.

Acknowledgement

We also want to thank two referees for their very helpful comments and criticisms which have helped to improve the paper a great deal. Zhou Wang was partially supported by grant R-155-000-095-112 at the National University of Singapore.

Author information

Correspondence to Zhou Wang.

Appendix: Proof of Theorem 2.2

In this part, we provide the technical details of the proof of Theorem 2.2. In fact, we prove it for the general three-sample U-statistic U. Before proceeding to the proof, we list some results that will be used. In the proofs below, we may assume without loss of generality that \(n_1\leq n_2\leq n_3\).

As a direct consequence of Theorem 2.1, we have

$$U-\theta_0=O_p(n_1^{-1/2}). $$
(A.1)

The following Lemma guarantees the existence and uniqueness of the solution to equation (2.8).

Lemma 5.1

Suppose that \(\sigma_{1,0,0}^2>0\), \(\sigma_{0,1,0}^2>0\), \(\sigma_{0,0,1}^2>0\), \(\liminf_{n\to \infty}(n_1/n_2)>0\), and \(\liminf_{n\to \infty}(n_2/n_3)>0\). Then as \(n_1\to\infty\), we have

$$ P\left\{\min_{1\leq i\leq n}(\widehat V_i-E\widehat V_i)<0<\max_{1\leq i\leq n}(\widehat V_i-E\widehat V_i)\right\}\longrightarrow 1. $$

Proof

The proof is similar to the two-sample case. The reader is referred to Jing et al. (2009) for the detailed proof.

Let \(S_n=n^{-1}\sum_{i=1}^{n}(\widehat V_i-E\widehat V_i)^2\) and \(Q_n=\max_{1\leq i\leq n}|\widehat V_i-E\widehat V_i|\). Under the conditions of Lemma 5.1, some calculations reveal that, as \(n_1\to\infty\), with probability 1,

$$S_n=nS^2_{n_1,n_2,n_3}+o(1),~~Q_n=o(n^{1/2}) $$
(A.2)

and

$$ n^{-1}\sum_{i=1}^{n}|\widehat V_i-E\widehat V_i|^3\leq S_n Q_n=o(n^{1/2}). $$

Proof of Theorem 2.2. By Lemma 5.1, the solution to equation (2.8) exists and is unique. We next show that this solution γ satisfies \(|\gamma|=O_p(n^{-1/2})\). Note that (2.8), together with the elementary inequality |x±y| ≥ |x| − |y|, leads to

$$\begin{array}{ll} 0=|f(\gamma)|&=\frac{1}{n}\left|\sum_{i=1}^{n}(\widehat V_i-E\widehat V_i)-\gamma\sum_{i=1}^{n}\frac{(\widehat V_i-E\widehat V_i)^2}{1+\gamma(\widehat V_i-E\widehat V_i)}\right|\\ &\geq\frac{|\gamma|S_n}{1+|\gamma|Q_n}-\left|\frac{1}{n}\sum_{i=1}^{n}\widehat V_i-\theta_0\right|. \end{array}$$

By (A.1), the second term is \(O_p(n_1^{-1/2})\). Since \(S_n=nS^2_{n_1,n_2,n_3}+o(1)\) a.s. by (A.2), it follows that \(|\gamma|(1+|\gamma|Q_n)^{-1}=O_p(n^{-1/2})\); hence, by (A.2) again, \(|\gamma|=O_p(n^{-1/2})\). Further, letting \(\beta_i=\gamma(\widehat V_i-E\widehat V_i)\), we have

$$\begin{array}{ll} \max_{1\leq i\leq n}|\beta_i|&=|\gamma|\max_{1\leq i\leq n}|\widehat V_i-E\widehat V_i| \\ &=O_p(n^{-1/2})o(n^{1/2})=o_p(1). \end{array} $$
(A.3)

On the one hand, expanding (2.8), we get

$$\begin{array}{lll} 0&=f(\gamma)\\ &=\frac{1}{n}\sum_{i=1}^n\widehat V_i-\theta_0-\gamma S_n+\frac{1}{n}\sum_{i=1}^n\frac{(\widehat V_i-E\widehat V_i)\beta_i^2}{1+\beta_i}, \end{array}$$

where the last term is bounded by

$$ \frac{1}{n}\sum_{i=1}^n\frac{|\widehat V_i-E\widehat V_i|^3}{|1+\beta_i|}\gamma^2=o(n^{1/2})O_p(n^{-1})O_p(1)=o_p(n^{-1/2}). $$

Therefore, we may write

$$\gamma=\left(\frac{1}{n}\sum_{i=1}^n\widehat V_i-\theta_0\right)S_n^{-1}+\tau=(U-\theta_0)S_n^{-1}+\tau $$
(A.4)

where \(|\tau|=o_p(n^{-1/2})\).
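For the reader's convenience, the expansion of (2.8) used above can be checked from the algebraic identity \(1/(1+\beta_i)=1-\beta_i+\beta_i^2/(1+\beta_i)\). The following is only a sketch of that step, under the assumption (suggested by the first display of the proof, since equation (2.8) itself is not reproduced here) that \(f(\gamma)=n^{-1}\sum_{i=1}^n(\widehat V_i-E\widehat V_i)/(1+\beta_i)\) with \(E\widehat V_i=\theta_0\):

$$\begin{array}{ll} f(\gamma)&=\frac{1}{n}\sum_{i=1}^n(\widehat V_i-\theta_0)\left(1-\beta_i+\frac{\beta_i^2}{1+\beta_i}\right)\\ &=\frac{1}{n}\sum_{i=1}^n\widehat V_i-\theta_0-\gamma\cdot\frac{1}{n}\sum_{i=1}^n(\widehat V_i-\theta_0)^2+\frac{1}{n}\sum_{i=1}^n\frac{(\widehat V_i-\theta_0)\beta_i^2}{1+\beta_i}, \end{array}$$

where the middle term equals \(\gamma S_n\) because \(\beta_i=\gamma(\widehat V_i-\theta_0)\). Setting \(f(\gamma)=0\) and dividing by \(S_n\) gives (A.4), with \(\tau\) equal to \(S_n^{-1}\) times the last term; this is \(o_p(n^{-1/2})\) provided \(S_n\) stays bounded away from zero, which (A.2) and the positive-variance conditions of Lemma 5.1 appear intended to ensure.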

On the other hand, by virtue of (A.3) and a Taylor expansion, we have \(\log(1+\beta_i)=\beta_i-\beta_i^2/2+\alpha_i\), where for some finite A > 0, \(P\{|\alpha_i|\leq A|\beta_i|^3,1\leq i\leq n\}\rightarrow1\) as \(n\to\infty\). Then plugging (A.4) and (2.7) into (2.6), we get

$$\begin{array}{ll} -2\log R(\theta_0)&=-2\sum_{i=1}^n\log(np_i)=2\sum_{i=1}^n\log(1+\beta_i)\\ &=2n\gamma(U-\theta_0)-nS_n\gamma^2+2\sum_{i=1}^n\alpha_i\\ &=\frac{n(U-\theta_0)^2}{S_n}-nS_n\tau^2+2\sum_{i=1}^n\alpha_i, \end{array}$$

where

$$\begin{array}{rl}|-nS_n\tau^2|&=n(nS_{n_1,n_2,n_3}^2+o(1))o_p(n^{-1})=o_p(1),\\ \left|\sum_{i=1}^n\alpha_i\right|&\leq A|\gamma|^3\sum_{i=1}^n|\widehat V_i-\theta_0|^3=O_p(n^{-3/2})o(n^{3/2})=o_p(1) \end{array}$$

and by Theorem 2.1 and (A.2), as \(n\to\infty\),

$$ \frac{n(U-\theta_0)^2}{S_n}\overset{d}{\longrightarrow}\chi_1^2. $$

Hence, from Slutsky’s theorem, we have \(-2\log R(\theta_0)\rightarrow_d\chi_1^2\), which completes the proof.
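To make the objects in the proof concrete, the sketch below computes a JEL statistic of the form \(-2\log R(\theta_0)=2\sum_i\log(1+\gamma(\widehat V_i-\theta_0))\) numerically. It rests on assumptions not stated in this appendix: the pseudo-values are taken to be the usual leave-one-out jackknife values \(\widehat V_i=nU-(n-1)U^{(-i)}\) over the pooled sample, in the spirit of Jing et al. (2009), and \(\gamma\) is found by bisection on \(\sum_i(\widehat V_i-\theta_0)/(1+\gamma(\widehat V_i-\theta_0))=0\). All function names are hypothetical, and each sample needs at least two observations.

```python
import math
from bisect import bisect_left, bisect_right


def u_stat(x, y, z):
    """Proportion of triples with x_i < y_j < z_k (estimate of P(X < Y < Z))."""
    xs, zs = sorted(x), sorted(z)
    hits = sum(bisect_left(xs, v) * (len(zs) - bisect_right(zs, v)) for v in y)
    return hits / (len(x) * len(y) * len(z))


def pseudo_values(x, y, z):
    """Assumed jackknife pseudo-values V_i = n*U - (n-1)*U_(-i), pooled over all samples."""
    samples = [list(x), list(y), list(z)]
    n = sum(len(s) for s in samples)
    u_full = u_stat(*samples)
    values = []
    for s in range(3):
        for i in range(len(samples[s])):
            # Drop observation i from sample s, keep the other two samples intact.
            reduced = [samples[t][:i] + samples[t][i + 1:] if t == s else samples[t]
                       for t in range(3)]
            values.append(n * u_full - (n - 1) * u_stat(*reduced))
    return values


def jel_statistic(x, y, z, theta0):
    """Return -2 log R(theta0); math.inf if theta0 lies outside the range of pseudo-values."""
    d = [v - theta0 for v in pseudo_values(x, y, z)]
    if not (min(d) < 0 < max(d)):
        return math.inf  # no solution: the condition of Lemma 5.1 fails for this data

    def f(g):
        return sum(di / (1 + g * di) for di in d)

    # f is strictly decreasing on the interval where every 1 + g*d_i is positive.
    lo = -(1 - 1e-9) / max(d)
    hi = -(1 - 1e-9) / min(d)
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    g = (lo + hi) / 2
    return 2 * sum(math.log1p(g * di) for di in d)
```

By the limit just proved, a value of `jel_statistic(x, y, z, theta0)` above the 95% quantile of \(\chi_1^2\) (about 3.84) would lead to rejecting \(\theta_0\) at the 5% level, and the set of \(\theta_0\) values below that threshold gives an approximate confidence interval.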

Guangming, P., Xiping, W. & Wang, Z. Nonparametric statistical inference for P(X < Y < Z). Sankhya A 75, 118–138 (2013). https://doi.org/10.1007/s13171-012-0010-z
