Abstract
Let X, Y and Z be three independent random variables from three different populations. The stress-strength model P(X < Y < Z), which equals the volume under the three-class ROC surface, has extensive applications in various areas because it provides a global measure of the differences among populations. In this paper, we propose two methods for statistical inference on P(X < Y < Z), the nonparametric normal approximation and the jackknife empirical likelihood (JEL), since the usual empirical likelihood method for U-statistics is too complicated to apply. Simulation studies indicate that both methods perform well compared with existing methods. Some classical and real data sets are analyzed using the two proposed methods. In practice, the nonparametric normal approximation is preferable for its simplicity, whereas the JEL method is recommended when better statistical performance is needed, although it is more complex.
References
Andrews, D.F. and Herzberg, A.M. (1985). Data: a collection of problems from many fields for the student and research worker. Springer, New York.
Arvesen, J.N. (1969). Jackknifing U-statistics. Ann. Math. Statist., 40, 2076–2100.
Bamber, D. (1975). The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. J. Math. Psych., 4, 941–953.
Chandra, S. and Owen, D.B. (1975). On estimating the reliability of a component subject to several different stresses (strengths). Nav. Res. Logist. Q., 22, 31–39.
Dreiseitl, S., Ohno-Machado, L. and Binder, M. (2000). Comparing three-class diagnostic tests by three-way ROC analysis. Med. Decis. Mak., 20, 323–331.
Dutta, K. and Srivastava, G.L. (1987). An n-standby system with P(X < Y < Z). IAPQ Trans., 12, 95–97.
Hoeffding, W. (1948). A class of statistics with asymptotically normal distribution. Ann. Math. Statist., 19, 293–325.
Ivshin, V.V. (1998). On the estimation of the probabilities of a double linear inequality in the case of uniform and two-parameter exponential distributions. J. Math. Sci., 88, 819–827.
Jing, B.Y., Yuan, J.Q. and Zhou, W. (2009). Jackknife empirical likelihood. J. Amer. Statist. Assoc., 104, 1224–1232.
Koepsell, T.D., Chi, Y.Y., Zhou, X.H., Lee, W.W., Ramos, E.M. and Kukull, W.A. (2007). An alternative method for estimating efficacy of the AN1792 Vaccine for Alzheimer’s Disease. Neurology, 69, 1868–1872.
Koroljuk, V.S. and Borovskich, Yu. V. (1994). Theory of U-statistics, mathematics and its applications, vol. 273. Kluwer Academic Publishers: Dordrecht.
Kotz, S., Lumelskii, Y. and Pensky, M. (2003). The stress-strength model and its generalizations: theory and applications. World Scientific Publishing: New Jersey.
Li, J.L. and Fine, J.P. (2008). ROC analysis with multiple classes and multiple tests: methodology and its applications in microarray studies. Biostatistics, 9, 566–576.
Mossman, D. (1999). Three-way ROCs. Med. Decis. Mak., 19, 78–89.
Nakas, C.T. and Yiannoutsos, C.T. (2004). Ordered multiple-class ROC analysis with continuous measurements. Stat. Med., 23, 3437–3449.
Nakas, C.T. and Alonzo, T.A. (2007). ROC graphs for assessing the ability of a diagnostic marker to detect three disease classes with an umbrella ordering. Biometrics, 63, 603–609.
Obuchowski, N.A., Applegate, K.E., Goske, M.J., Arheart, K.L., Myers, M.T. and Morrison, S. (2001). The differential diagnosis for multiple diseases: comparison with the binary-truth state experiment in two empirical studies. Acad. Radiol., 8, 947–954.
Owen, A.B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75, 237–249.
Owen, A.B. (1990). Empirical likelihood ratio confidence regions. Ann. Statist., 18, 90–120.
Quenouille, M.H. (1956). Notes on bias in estimation. Biometrika, 43, 353–360.
Reaven, G.M. and Miller, R.G. (1979). An attempt to define the nature of chemical diabetes using a multidimensional analysis. Diabetologia, 16, 17–24.
Sampat, M.P., Patel, A.C., Wang, Y., Gupta, S., Kan, C.W., Bovik, A.C. and Markey, M.K. (2009). Indexes for three-class classification performance assessment-an empirical comparison. IEEE Trans. Inf. Technol. Biomed., 13, 300–312.
Sen, P.K. (1960). On some convergence properties of U-statistics. Calcutta Statist. Assoc. Bull., 10, 1–18.
Shi, X. (1984). The approximate independence of jackknife pseudo-values and the bootstrap methods. J. Wuhan Univ. Hydra-Electric Eng., 2, 83–90.
Tukey, J.W. (1958). Bias and confidence in not-quite large samples. Ann. Math. Statist., 29, 614.
Waegeman, W., De Baets, B. and Boullart, L. (2008). On the scalability of ordered multi-class ROC analysis. Comput. Statist. Data Anal., 52, 3371–3388.
Zhou, X.H. and Castellucio, P. (2004). Adjusting for non-ignorable verification bias in clinical studies for Alzheimer’s disease. Stat. Med., 23, 221–230.
Acknowledgement
We thank two referees for their very helpful comments and criticisms, which have greatly improved the paper. Zhou Wang was partially supported by grant R-155-000-095-112 at the National University of Singapore.
Appendix: Proof of Theorem 2.2
In this part, we provide the technical details of the proof of Theorem 2.2. In fact, we prove it for the general three-sample U-statistic U. Before proceeding to the proof of Theorem 2.2, we list some results that will be used. In the proofs below, we may assume without loss of generality that \(n_1\leq n_2\leq n_3\).
As a direct consequence of Theorem 2.1, we have
The following Lemma guarantees the existence and uniqueness of the solution to equation (2.8).
Lemma 5.1
Suppose that \(\sigma_{1,0,0}^2>0\), \(\sigma_{0,1,0}^2>0\), \(\sigma_{0,0,1}^2>0\), \(\liminf_{n\to \infty}(n_1/n_2)>0\), and \(\liminf_{n\to \infty}(n_2/n_3)>0\). Then as \(n_1\to\infty\), we have
Proof
The proof is similar to the two-sample case. The reader is referred to Jing et al. (2009) for the detailed proof.
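As a concrete illustration of existence and uniqueness, the following minimal Python sketch assumes that equation (2.8) takes the usual empirical-likelihood form \(\sum_{i=1}^{n}(\widehat V_i-\theta)/\{1+\gamma(\widehat V_i-\theta)\}=0\), as in Jing et al. (2009). Under that assumption, a root exists when θ lies strictly between the smallest and largest pseudo-value, and it is unique because the left-hand side is strictly decreasing in γ on the interval where all weights stay positive. The names `el_score` and `solve_multiplier` and the sample values are hypothetical, chosen only for illustration.

```python
def el_score(gamma, centered):
    """Left-hand side of the assumed multiplier equation for centered values z_i."""
    return sum(z / (1.0 + gamma * z) for z in centered)


def solve_multiplier(values, theta):
    """Return gamma solving sum_i z_i / (1 + gamma * z_i) = 0, with z_i = values[i] - theta."""
    centered = [v - theta for v in values]
    lo_z, hi_z = min(centered), max(centered)
    # Existence: theta must lie strictly inside the range of the values.
    if not (lo_z < 0.0 < hi_z):
        raise ValueError("theta must lie strictly between min(values) and max(values)")
    # Admissible gammas keep every weight positive: 1 + gamma * z_i > 0 for all i.
    lower, upper = -1.0 / hi_z, -1.0 / lo_z
    margin = 1e-10 * (upper - lower)
    a, b = lower + margin, upper - margin
    # Uniqueness: el_score is strictly decreasing on (lower, upper),
    # so plain bisection converges to the single root.
    for _ in range(200):
        mid = 0.5 * (a + b)
        if el_score(mid, centered) > 0.0:
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)


if __name__ == "__main__":
    pseudo = [0.2, 0.5, 0.9, 0.4]  # hypothetical pseudo-values
    theta = 0.45
    gamma = solve_multiplier(pseudo, theta)
    print("gamma:", gamma)
    print("score at gamma:", el_score(gamma, [v - theta for v in pseudo]))
```

Bisection is used here only because monotonicity makes it fail-safe; a Newton iteration would typically converge faster but needs safeguarding near the ends of the admissible interval.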
Let \(S_n=n^{-1}\sum_{i=1}^{n}(\widehat V_i-E\widehat V_i)^2\) and \(Q_n=\max_{1\leq i\leq n}|\widehat V_i-E\widehat V_i|\). Under the conditions of Lemma 5.1, some calculations reveal that, as \(n_1\to\infty\), with probability 1
and
Proof of Theorem 2.2. By Lemma 5.1, the solution to equation (2.8) exists and is unique. We next show that this solution γ satisfies \(|\gamma|=O_p(n^{-1/2})\). Equation (2.8), together with the elementary inequality \(|x\pm y|\geq|x|-|y|\), leads to
By (10), the second term is \(O_p(n_1^{-1/2})\). Since \(S_n=nS^2_{n_1,n_2,n_3}+o(1)\) a.s. by (A.1), it follows that \(|\gamma|(1+|\gamma|Q_n)^{-1}=O_p(n^{-1/2})\), and hence, by (A.2), \(|\gamma|=O_p(n^{-1/2})\). Further, letting \(\beta_i=\gamma(\widehat V_i-E\widehat V_i)\), we have
On the one hand, expanding (2.8), we get
where the last term is bounded by
Therefore, we may write
where \(|\tau|=o_p(n^{-1/2})\).
On the other hand, by virtue of (A.3) and a Taylor expansion, we have \(\log(1+\beta_i)=\beta_i-\beta_i^2/2+\alpha_i\), where for some finite \(A>0\), \(P\{|\alpha_i|\leq A|\beta_i|^3,\ 1\leq i\leq n\}\rightarrow 1\) as \(n\to\infty\). Then, plugging (A.4) and (2.7) into (2.6), we get
where
and by Theorem 2.1 and (A.2), as \(n\to\infty\),
Hence, by Slutsky’s theorem, we have \(-2\log R(\theta_0)\rightarrow_d\chi_1^2\), which completes the proof.
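The statistic whose limit is established above can be computed numerically. The sketch below is a minimal pure-Python illustration, not the authors' implementation. It assumes that the pseudo-values are formed by deleting one observation at a time from the pooled sample, \(\widehat V_i=nU-(n-1)U^{(-i)}\), and that \(-2\log R(\theta)=2\sum_{i}\log\{1+\gamma(\widehat V_i-\theta)\}\) with γ solving the usual empirical-likelihood equation, following Jing et al. (2009). The function names and the simulated normal data are hypothetical.

```python
import math
import random


def vus(x, y, z):
    """U-statistic estimate of P(X < Y < Z): share of triples with x_i < y_j < z_k."""
    count = 0
    for yj in y:
        below = sum(1 for xi in x if xi < yj)  # x values below this y
        above = sum(1 for zk in z if yj < zk)  # z values above this y
        count += below * above
    return count / (len(x) * len(y) * len(z))


def jackknife_pseudo_values(x, y, z):
    """Pseudo-values n*U - (n-1)*U_(-i); each sample needs at least two points."""
    n = len(x) + len(y) + len(z)
    u_full = vus(x, y, z)
    leave_one_out = (
        [vus(x[:i] + x[i + 1:], y, z) for i in range(len(x))]
        + [vus(x, y[:j] + y[j + 1:], z) for j in range(len(y))]
        + [vus(x, y, z[:k] + z[k + 1:]) for k in range(len(z))]
    )
    return [n * u_full - (n - 1) * u for u in leave_one_out]


def neg2_log_jel_ratio(pseudo, theta):
    """-2 log R(theta) under the assumed JEL form; inf if theta is outside the hull."""
    centered = [v - theta for v in pseudo]
    lo_c, hi_c = min(centered), max(centered)
    if not (lo_c < 0.0 < hi_c):
        return math.inf
    # Multipliers in (lower, upper) keep all weights positive.
    lower, upper = -1.0 / hi_c, -1.0 / lo_c
    margin = 1e-10 * (upper - lower)
    a, b = lower + margin, upper - margin
    for _ in range(200):  # bisection on the strictly decreasing score
        mid = 0.5 * (a + b)
        if sum(c / (1.0 + mid * c) for c in centered) > 0.0:
            a = mid
        else:
            b = mid
    gamma = 0.5 * (a + b)
    return 2.0 * sum(math.log1p(gamma * c) for c in centered)


if __name__ == "__main__":
    random.seed(0)
    # Simulated, hypothetical data: three normal populations with increasing means.
    x = [random.gauss(0.0, 1.0) for _ in range(15)]
    y = [random.gauss(1.0, 1.0) for _ in range(15)]
    z = [random.gauss(2.0, 1.0) for _ in range(15)]
    pseudo = jackknife_pseudo_values(x, y, z)
    stat = neg2_log_jel_ratio(pseudo, theta=0.5)
    print("estimate of P(X < Y < Z):", vus(x, y, z))
    print("-2 log R(0.5):", stat, "reject at 5% level:", stat > 3.841)
```

A value of θ is retained in an approximate 95% confidence interval when the statistic does not exceed 3.841, the 0.95 quantile of \(\chi_1^2\). The leave-one-out loop recomputes the U-statistic from scratch for clarity; for larger samples the per-observation counts could be updated incrementally instead.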
Cite this article
Guangming, P., Xiping, W. & Wang, Z. Nonparametric statistical inference for P(X < Y < Z). Sankhya A 75, 118–138 (2013). https://doi.org/10.1007/s13171-012-0010-z
DOI: https://doi.org/10.1007/s13171-012-0010-z
Keywords and phrases.
- Confidence intervals
- empirical likelihood
- jackknife
- ROC curve
- stress-strength model
- Studentized U-statistics