Abstract
We developed a Bayesian approach to sample size calculations for cross-sectional studies designed to estimate sensitivity and specificity of one or more diagnostic tests. Sample size calculations can be made for common study designs such as one test in one population, two conditionally independent or dependent tests in ≤2 populations, and three tests in ≤2 populations. We determine a sample size combination that yields high predictive probability, with respect to the future study data, of accurate and precise estimates of sensitivity and specificity. We also consider hypothesis testing for demonstrating the superiority or equivalence of one diagnostic test relative to another. The predictive probability can also be computed when the sample size combination is fixed in advance, thereby providing a “power-like” measure for the future study. The method is straightforward to implement using the S-Plus/R library emBedBUGS together with WinBUGS.
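The "power-like" predictive probability for a fixed sample size can be illustrated with a deliberately simplified sketch. It is not the authors' WinBUGS implementation. It assumes a single test evaluated against a perfect reference test in n truly diseased subjects, a conjugate beta prior on sensitivity that is used both to generate future data and to fit the model, and a precision criterion only (95% posterior interval no wider than a chosen width). The function name, its parameters, and the numerical settings are hypothetical choices made for illustration.

```python
import random


def predictive_probability(n, prior_a, prior_b, max_width,
                           n_datasets=400, n_posterior=1000, seed=0):
    """Monte Carlo estimate of the predictive probability that the 95%
    posterior interval for sensitivity is no wider than max_width when
    n diseased subjects are tested (simplifying assumptions: perfect
    reference test, Beta(prior_a, prior_b) prior on sensitivity)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_datasets):
        # Simulate one future study: draw a sensitivity from the prior,
        # then the number of test-positive results among n diseased subjects.
        se = rng.betavariate(prior_a, prior_b)
        positives = sum(rng.random() < se for _ in range(n))
        # Conjugate update: posterior is Beta(a + positives, b + negatives).
        draws = sorted(
            rng.betavariate(prior_a + positives, prior_b + n - positives)
            for _ in range(n_posterior)
        )
        # Approximate equal-tailed 95% interval from the posterior draws.
        lower = draws[int(0.025 * n_posterior)]
        upper = draws[int(0.975 * n_posterior) - 1]
        if upper - lower <= max_width:
            hits += 1
    # Fraction of simulated future datasets that meet the precision criterion.
    return hits / n_datasets


if __name__ == "__main__":
    for n in (20, 100, 400):
        print(n, predictive_probability(n, 18, 2, max_width=0.1))
```

Under these assumptions, a sample size would be chosen by increasing n until the estimated predictive probability reaches a target level. The designs described in the abstract (no gold standard, multiple tests, one or two populations) do not have conjugate posteriors, so the inner step would instead require MCMC sampling.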
Cite this article
Branscum, A.J., Johnson, W.O. & Gardner, I.A. Sample size calculations for studies designed to evaluate diagnostic test accuracy. JABES 12, 112–127 (2007). https://doi.org/10.1198/108571107X177519