Abstract
The evaluation of diagnostic tests (DTs) or procedures is not subject to uniform regulations. Nevertheless, a thorough evaluation of DTs intended for use in routine medical practice is advisable, and often indispensable, for economic, medical, and ethical reasons. This article addresses important aspects of this evaluation: the facets of the problem of evaluating a DT, common validity measures for qualitative and quantitative DTs, problems concerning the informative value and generalizability of validity estimates, the objectives and design of evaluation studies, and specific methodological problems of these studies.
Notes
GMDS: Deutsche Gesellschaft für Medizinische Dokumentation, Informatik und Statistik
Cite this article
Abel, U., Jensen, K. Klinische Studien außerhalb des Arzneimittelgesetzes. Bundesgesundheitsbl. 52, 425–432 (2009). https://doi.org/10.1007/s00103-009-0825-5