Abstract
Poor reproducibility of diagnostic criteria is seldom acknowledged as a cause of low precision in clinical research, and very few clinical reports communicate the reproducibility of the diagnostic criteria they use. For example, of the 11–13 original research papers published per issue in the last 10 issues of 2004 of the journal Circulation, none did, and of the 5–6 original research papers published per issue in the last 10 issues of 2004 of the Journal of the American Medical Association, only one out of 12 did. These papers involved quality of life assessments, which are notoriously poorly reproducible. Instead, many reports used the averages of multiple measurements to improve precision, without further comment on reproducibility: for example, means of three blood pressure measurements, means of three cardiac cycles, average results of morphometric cell studies from two examiners, and means of five random fields for cytogenetic studies. Poor reproducibility of diagnostic criteria is, evidently, a recognized but rarely tested problem in clinical research. Evidence-based medicine is under pressure due to the poor reproducibility of clinical trials.1,2 As long as the possibility of poorly reproducible diagnostic criteria has not been systematically addressed, this very possibility cannot be excluded as a contributing cause. The current chapter reviews simple methods for the routine assessment of the reproducibility of diagnostic criteria/tests. These methods can answer questions such as: (1) do two techniques used to measure a particular variable, in otherwise identical circumstances, produce the same results; (2) does a single observer obtain the same results when taking repeated measurements in identical circumstances; (3) do two observers using the same method of measurement obtain the same results?
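Among the agreement measures the chapter cites are the intraclass correlation of Shrout and Fleiss (for repeated continuous measurements) and Cohen's kappa (for categorical ratings by two observers). As a minimal sketch, not taken from the chapter itself (the function names and toy data are illustrative), both can be computed in a few lines of plain Python:

```python
# Illustrative sketches of two agreement statistics: Cohen's kappa for
# categorical ratings by two observers, and the one-way intraclass
# correlation ICC(1,1) of Shrout & Fleiss for repeated measurements.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same subjects.

    kappa = (p_observed - p_expected) / (1 - p_expected),
    where p_expected is the agreement expected from the raters'
    marginal category frequencies alone.
    """
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

def icc_oneway(measurements):
    """ICC(1,1): each row holds k repeated measurements on one subject.

    One-way ANOVA decomposition:
    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW).
    """
    n, k = len(measurements), len(measurements[0])
    grand = sum(sum(row) for row in measurements) / (n * k)
    row_means = [sum(row) / k for row in measurements]
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum((x - m) ** 2
              for row, m in zip(measurements, row_means)
              for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Toy data: two observers rating 4 subjects, and duplicate measurements
# that agree perfectly (hence ICC = 1).
print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0]))        # 0.5
print(icc_oneway([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]))  # 1.0
```

A kappa of 0.5 here reflects 75% raw agreement corrected for the 50% agreement expected by chance; the ICC reaches 1 only when the within-subject variance of the duplicates is zero.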
References
Julius S. The ALLHAT study: if you believe in evidence-based medicine, stick to it! J Hypertens 2003; 21: 453–454.
Cleophas GM, Cleophas TJ. Clinical trials in jeopardy. Int J Clin Pharmacol Ther 2003; 41: 51–56.
Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull 1979; 86: 420–428.
SPSS Statistical Software, version 10. SPSS Inc., Chicago, IL, 2004.
Anonymous. Calculating Cohen’s kappas. http://www.colorado.edu/geography/gcraft/notes/manerror/html/kappa.html
Perloff JK. The clinical recognition of congenital heart disease. Philadelphia, Saunders 1991.
Cleophas AF, Zwinderman AH, Cleophas TJ. Reproducibility of polynomes of ambulatory blood pressure measurements. Perfusion 2001; 13: 328–335.
Imbert-Bismut F, Messous D, Thibault V, Myers RB, Piton A, Thabut D, Devers L, Hainque B, Mercadier A, Poynard T. Intra-laboratory analytical variability of biochemical markers of fibrosis and activity and reference ranges in healthy blood donors. Clin Chem Lab Med 2004; 42: 323–333.
Petrie A, Sabin C. Assessing agreement. In: Medical statistics at a glance. Blackwell Science, London UK, 2000, page 93.
Copyright information
© 2006 Springer Science+Business Media Dordrecht
Cleophas, T.J., Zwinderman, A.H., Cleophas, T.F. (2006). Testing Reproducibility. In: Statistics Applied to Clinical Trials. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-4650-6_27
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-4229-4
Online ISBN: 978-1-4020-4650-6
eBook Packages: Mathematics and Statistics (R0)