Accuracy of rating scale interval values used in multiple mini-interviews: a mixed methods study

Bégin, Philippe; Gagnon, Robert; Leduc, Jean-Michel; Paradis, Béatrice; Renaud, Jean-Sébastien; Beauchamp, Jacinthe; Rioux, Richard; Carrier, Marie-Pier; Hudon, Claire; Vautour, Marc; Ouellet, Annie; Bourget, Martine; Bourdy, Christian

doi:10.1007/s10459-020-09970-1

Published: 06 May 2020

Volume 26, pages 37–51 (2021)

Advances in Health Sciences Education

Philippe Bégin ORCID: orcid.org/0000-0002-9089-4604^1,7,
Robert Gagnon¹,
Jean-Michel Leduc¹,
Béatrice Paradis¹,
Jean-Sébastien Renaud²,
Jacinthe Beauchamp^3,4,
Richard Rioux⁵,
Marie-Pier Carrier⁶,
Claire Hudon²,
Marc Vautour^3,4,
Annie Ouellet³,
Martine Bourget² &
Christian Bourdy¹

Abstract

When determining the score given to candidates in multiple mini-interview (MMI) stations, raters have to translate a narrative judgment into a value on an ordinal rating scale. When individual scores are added to calculate the final ranking, it is generally presumed that the possible scores on the evaluation grid are separated by constant intervals, following a linear function, although this assumption is seldom validated with the raters themselves. Inaccurate interval values could lead to systemic bias that distorts candidates’ final cumulative scores. The aim of this study was to establish rating scale values based on raters’ intent, to validate these with an independent quantitative method, to explore their impact on final scores, and to appraise their meaning according to experienced MMI interviewers. A 4-round consensus-group exercise was independently conducted with 42 MMI interviewers, who were asked to determine relative values for the 6-point rating scale (from A to F) used in the Canadian integrated French MMI (IFMMI). In parallel, relative values were also calculated for each option of the scale by comparing the average scores concurrently given to the same individual in other stations every time that option was selected during three consecutive IFMMI years. Data from the same three cohorts were used to simulate the impact of using the new score values on final rankings. Comments from the consensus-group exercise were reviewed independently by two authors to explore raters’ rationale for choosing specific values. Relative to the maximum (A = 100%) and minimum (F = 0%), experienced raters arrived at values of 86.7% (95% CI 86.3–87.1), 69.5% (68.9–70.1), 51.2% (50.6–51.8), and 29.3% (28.1–30.5) for scores of B, C, D and E, respectively. The concurrent score approach was based on 43,412 IFMMI stations performed by 4345 medical school applicants. It provided quasi-identical values of 87.1% (82.4–91.5), 70.4% (66.1–74.7), 51.2% (47.1–55.3) and 31.8% (27.9–35.7), respectively. Qualitative analysis indicated that, while high scores are usually based on minor details of relatively low importance, low scores are usually given for more serious offenses, which raters assumed would carry more weight in the final score. Individual drops or increases in final MMI ranking with the use of the new scale values ranged from −21 to +5 percentiles, with the average candidate changing by ±1.4 percentiles. Consulting with experienced interviewers is a simple and effective approach to establish rating scale values that truly reflect raters’ intent in MMI, thus improving the accuracy of the instrument and contributing to the general fairness of the process.
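To illustrate why the interval values matter, the following minimal sketch compares the consensus-group values reported above with an equal-interval scale. The linear values (20-point steps between A = 100% and F = 0%) are an assumption about what "constant intervals" implies for a 6-point scale, the two candidates are invented for illustration, and the function names are hypothetical. It is not the authors' analysis code.

```python
# Consensus-group values (percent of maximum) reported in the abstract.
CONSENSUS = {"A": 100.0, "B": 86.7, "C": 69.5, "D": 51.2, "E": 29.3, "F": 0.0}

# Assumed equal-interval (linear) values for a 6-point scale.
LINEAR = {"A": 100.0, "B": 80.0, "C": 60.0, "D": 40.0, "E": 20.0, "F": 0.0}


def mean_score(grades, values):
    """Average station score (percent) for a sequence of letter grades."""
    return sum(values[g] for g in grades) / len(grades)


def intervals(values):
    """Gaps between adjacent scale options, ordered A-B, B-C, ..., E-F."""
    letters = "ABCDEF"
    return [round(values[a] - values[b], 1) for a, b in zip(letters, letters[1:])]


# Two hypothetical candidates over ten stations (not study data).
steady = ["B"] * 10               # consistently good
uneven = ["A"] * 8 + ["D", "F"]   # mostly excellent, two poor stations

if __name__ == "__main__":
    print("linear intervals:   ", intervals(LINEAR))
    print("consensus intervals:", intervals(CONSENSUS))
    for name, grades in (("steady", steady), ("uneven", uneven)):
        print(name,
              "linear:", round(mean_score(grades, LINEAR), 2),
              "consensus:", round(mean_score(grades, CONSENSUS), 2))
```

Under the assumed linear values the uneven candidate averages higher than the steady one, whereas under the consensus values the order reverses. The consensus intervals also widen toward the bottom of the scale, so each step down costs more than the one above it.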

Abbreviations

95% CI: 95% confidence interval
IFMMI: Integrated French MMI
MMI: Multiple mini-interviews
OSCE: Objective structured clinical examination

Acknowledgements

The authors would like to thank the interviewers who participated in the consensus group exercise and the applicants who completed the demographic surveys.

Funding

None.

Author information

Authors and Affiliations

Faculty of Medicine, Université de Montréal, Montreal, Canada
Philippe Bégin, Robert Gagnon, Jean-Michel Leduc, Béatrice Paradis & Christian Bourdy
Faculty of Medicine, Université Laval, Quebec City, Canada
Jean-Sébastien Renaud, Claire Hudon & Martine Bourget
Faculty of Medicine, Université Sherbrooke, Sherbrooke, Canada
Jacinthe Beauchamp, Marc Vautour & Annie Ouellet
Centre de Formation Médicale du Nouveau-Brunswick, Moncton, Canada
Jacinthe Beauchamp & Marc Vautour
Faculty of Social Science, Université du Québec à Montréal, Montreal, Canada
Richard Rioux
Faculty of Education Sciences, Université Laval, Quebec City, Canada
Marie-Pier Carrier
CHU Sainte-Justine, 3175 Chemin de la Côte-Sainte-Catherine, Montreal, QC, H3T 1C5, Canada
Philippe Bégin

Contributions

PB, BP, RR and CB designed the consensus group workshop. PB, RR and CB administered the consensus group workshops. PB and BP collected and analyzed the data from the consensus group workshop and performed the qualitative assessment of raters’ comments. PB, JSR and RG conceived the concurrent rating approach. PB, JB, CH, MV, HL, AO, MB and CB designed and administered the MMIs and acquired the data for the concurrent rating analysis. PB and RG performed the concurrent rating analyses. JML, RG, RR and CB designed the demographic survey. PB, JML, CH, MB, HL, RR and AO administered the surveys to applicants. PB and JML analyzed the association of demographic characteristics with interval bias. PB drafted the manuscript and all authors revised it critically for important intellectual content. All authors approved the final version of the manuscript and agreed to be accountable for all aspects of the work.

Corresponding author

Correspondence to Philippe Bégin.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

Ethical approval was granted by the ethics committee of Université de Montréal on April 20, 2017 (CPER-17-034-D, amended May 9, 2017) and April 5, 2018 (CPER-17-038-D).

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bégin, P., Gagnon, R., Leduc, JM. et al. Accuracy of rating scale interval values used in multiple mini-interviews: a mixed methods study. Adv in Health Sci Educ 26, 37–51 (2021). https://doi.org/10.1007/s10459-020-09970-1

Received: 29 August 2019
Accepted: 27 April 2020
Published: 06 May 2020
Issue Date: March 2021
DOI: https://doi.org/10.1007/s10459-020-09970-1
