Summary
Evaluation of clinical performance for physicians in training is central to assuring qualified practitioners. The time-honored method, an oral examination after a single patient encounter, suffers from several measurement shortcomings: too little sampling, low reliability, partial validity and potential for evaluator bias. Since 1975, standardized clinical examinations have been developed to provide broader sampling, more objective evaluation criteria and more efficient administration. Research supports the reliability of portrayal and data capture by standardized patients, as well as the predictability of future trainee performance. Methods for setting pass marks for cases and for the whole test have evolved from those used for written examinations. Without additional adjustments, pass marks from all methods continue to fail an unacceptably high number of learners. Studies show a positive impact of these examinations on learner study behaviors and on the number of direct observations of learners’ patient encounters. Standardized clinical performance examinations are sensitive and specific to the benefits of a structured clinical curriculum. Improvements must include better alignment of a test’s purpose, measurement framework and scoring. Data capture methods for clinical performance at advanced levels need development: checklists completed by standardized patients do not capture the organization or approach a learner takes in the encounter, and global ratings completed by faculty hold promise but need more work. Future studies should investigate the validity of case and test-wise pass marks. Finally, research on the development of expertise should guide the next generation of assessment tasks, encounters and scoring in standardized clinical examinations.
Copyright information
© 2002 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Petrusa, E.R. (2002). Clinical Performance Assessments. In: Norman, G.R., et al. International Handbook of Research in Medical Education. Springer International Handbooks of Education, vol 7. Springer, Dordrecht. https://doi.org/10.1007/978-94-010-0462-6_26
DOI: https://doi.org/10.1007/978-94-010-0462-6_26
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-010-3904-8
Online ISBN: 978-94-010-0462-6
eBook Packages: Springer Book Archive