INTRODUCTION

In 2004, the Japanese Ministry of Health, Labor and Welfare established a new postgraduate clinical training program, mandating that residents complete a 2-year internship after graduating from medical school and passing the National Examination for Medical Practitioners. In those 2 years, residents are expected to acquire a basic set of skills for primary care practice through rotations in internal medicine, surgery, and emergency medicine during the first year, and in pediatrics, obstetrics and gynecology, psychiatry, and community medicine during the second year. In 2010, the required departments for rotation were reduced from seven to three: internal medicine, emergency department (ED), and community medicine. Residents may rotate through surgery, anesthesiology, pediatrics, obstetrics and gynecology, or psychiatry as electives. After this 2-year residency, they choose their subspecialty program.

Knowledge of the causes of diagnostic error, such as cognitive biases and heuristics, is an important factor in preventing such errors. No studies to date have explored experiential variables, such as workload, ED rotations, or minutes of study per day, that might improve scores on the Diagnostic Error Knowledge Assessment Test (D-KAT). In a previous study, we found that General Medicine In-Training Examination (GM-ITE) scores improved with optimal resident caseload, ED rotations, and access to online resources.1 The purpose of the current study was to assess diagnostic error knowledge among residents throughout Japan and to compare the results with the benchmark scores from a previous US study.2 We also evaluated the relationships between diagnostic error knowledge and self-study, clinical knowledge, and experience.

METHODS

We conducted a nationwide study of Japanese PGY-1 and -2 residents. The Diagnostic Error Knowledge Assessment Test (D-KAT) and General Medicine In-Training Examination (GM-ITE) were administered at the end of the 2014 academic year. Immediately after the D-KAT and GM-ITE were conducted, the participants completed a questionnaire regarding their clinical experience and self-study habits, assessing ED rotations per month (the average frequency of night duty per month), the mean number of inpatients handled at one time, and mean daily minutes of self-study.

Diagnostic error knowledge was assessed using the D-KAT, a 13-item multiple-choice test developed and validated by Reilly et al. in 2010.2 The GM-ITE is a clinical knowledge test that uses a methodology similar to that of the US Internal Medicine In-Training Examination (IM-ITE).3 The purpose of the IM-ITE is to provide residents and program directors with an objective, reliable, and valid assessment of clinical knowledge through a multiple-choice examination.4,5 Our GM-ITE, which was designed and written by a committee of experienced physicians assembled by the Japan Institute for Advancement of Medical Education Program (a non-profit organization in Tokyo, Japan), included 87 questions testing a wide range of clinical knowledge, from clinical skills and practical medical knowledge to psychosocial care of patients.6

The mean D-KAT scores of Japanese and US residents were compared using Student’s t test.2 In addition, the associations between D-KAT scores and gender, PGY, ED rotations per month, mean number of inpatients handled at any given time, and mean daily minutes of self-study were analyzed with and without adjusting for GM-ITE scores (calculated from the domains other than the D-KAT). Linear mixed models were used, with hospitals modeled as random intercepts to account for between-hospital variability and with Huber-White (sandwich) variance estimators. Answers of “unknown” (for ED duty and inpatient caseload) were retained in the analyses, whereas non-responders for any of ED duty, inpatient caseload, or self-study (N = 224) were excluded from mixed modeling. We also conducted the same analyses with the GM-ITE score (excluding the D-KAT) as the dependent variable, with and without adjusting for the D-KAT score. A two-tailed p value of <0.05 was considered statistically significant. Analyses were performed using SAS version 9.4 software (SAS Institute Inc., Cary, NC). Structural equation models were built to test for the direct and indirect determinants of D-KAT performance using standardized estimates of effect and standard goodness-of-fit measures (Stata 14; StataCorp LP, College Station, TX).7 The study was approved by the Ethics Committee of the Tsukuba University Mito Medical Center.
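As a reading aid, a linear mixed model with hospital-level random intercepts of the kind described above can be written in the following general form. The notation is introduced here for illustration and does not appear in the original analysis: y_ij is the score of resident i in hospital j, x_ij collects the resident-level covariates (gender, PGY, ED rotations, inpatient caseload, self-study, and optionally the adjusting score), and u_j is the hospital-specific intercept.

```latex
y_{ij} = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

The sandwich variance estimator leaves the point estimates of the coefficients unchanged and makes their standard errors less dependent on the assumed variance structure being correct.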

RESULTS

A total of 2652 residents (1123 PGY-1 and 1529 PGY-2) from 258 teaching hospitals participated in the examination; 777 (29.3%) of participants were women. The voluntary survey response rate was 91.6% (2429/2652). Among respondents, the weighted average number of ED rotations was 3.94, with 3% having no rotations, 8% with 1–2, 70% with 3–5, and 17% with 6 or more rotations. By self-report, residents managed an average of 6.9 inpatients (weighted average value): 9% reported managing 4 or fewer, 64% reported 5–9, 19% noted 10–14, and 6% reported 15 or more. Residents reported spending an average of 42.2 min studying (weighted average value), with 39% spending less than 30 min, 38% between 31 and 60 min, 13% 61–90 min, and 4% 91 min or more of daily self-study.

The mean (± SD) D-KAT score among Japanese PGY-2 residents (6.2 ± 1.6) was significantly lower than that of their US PGY-2 counterparts (8.3 ± 1.5; p < 0.001).2 Similar to US resident scores, D-KAT scores increased from PGY-1 to PGY-2 (6.1 to 6.2, respectively; p = 0.018).8 Using linear mixed models, the D-KAT scores were found to be associated with GM-ITE scores and inpatient caseload (Table 1). The GM-ITE scores were associated with D-KAT scores (estimate of change in score, 0.64; 95% confidence interval, 0.45–0.83; p < 0.001), ED rotations (≥6 rotations: 2.14; 0.16–4.13; p = 0.03), inpatient caseloads (5–9 patients: 1.79; 0.82–2.76; p < 0.001), and average daily minutes of self-study (≥91 min: 2.05; 0.56–3.53; p = 0.01; Table 2).
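The comparison of PGY-2 means above can be illustrated with a minimal sketch of Student's (pooled-variance) t statistic computed from summary statistics alone. The means and SDs are the reported values. The sample sizes are assumptions: the Japanese n is taken as the number of participating PGY-2 residents, and the US n is not given in this text, so `N_US_HYPOTHETICAL` is a placeholder chosen only for illustration. The function name is ours, and this is not the authors' SAS code.

```python
import math

def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Student's equal-variance t statistic and degrees of freedom,
    computed from group means, standard deviations, and sample sizes."""
    df = n1 + n2 - 2
    # Pooled variance: df-weighted average of the two group variances.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    # Standard error of the difference in means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

# Reported summary statistics for PGY-2 D-KAT scores.
JP_MEAN, JP_SD = 6.2, 1.6
US_MEAN, US_SD = 8.3, 1.5

# ASSUMPTION: all 1529 participating Japanese PGY-2 residents are in the comparison.
N_JP = 1529
# ASSUMPTION: hypothetical placeholder; the US sample size is not stated here.
N_US_HYPOTHETICAL = 50

t_stat, df = pooled_t_from_summary(
    JP_MEAN, JP_SD, N_JP, US_MEAN, US_SD, N_US_HYPOTHETICAL
)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

Because the difference in means (2.1 points) exceeds one SD in either group, the statistic is large in magnitude even for a small comparison group, which is consistent with the reported p < 0.001. The exact value depends on the true US sample size.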

Table 1 Linear Mixed Model Estimates for D-KAT Scores among Responding Doctors (N = 2428)
Table 2 Linear Mixed Model Estimates for GM-ITE Score (except for D-KAT) among Responding Doctors (N = 2428)

SEM models indicated a direct relationship between GM-ITE scores and D-KAT performance (β = 0.37, 95% CI: 0.34–0.41; Fig. 1), and suggested that D-KAT scores were indirectly affected by the number of ED rotations (β = 0.06, 95% CI: 0.02–0.10), inpatient caseload (β = 0.04, 95% CI: 0.003–0.08), and average daily minutes of study (β = 0.13, 95% CI: 0.09–0.17). The model fit the data well (root mean square error of approximation [RMSEA]: 0.03; comparative fit index [CFI]: 0.98; Tucker-Lewis index [TLI]: 0.96).
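For readers unfamiliar with the fit indices reported above, the following minimal sketch shows how RMSEA, CFI, and TLI are conventionally derived from the chi-square statistics of the fitted model and of a baseline (independence) model. The chi-square values and degrees of freedom below are hypothetical, since the source does not report them, and the sample size is assumed to equal the N shown in the table captions. The RMSEA uses the N − 1 convention; some software divides by N instead.

```python
import math

def rmsea(chi2_model, df_model, n):
    """Root mean square error of approximation (N - 1 convention)."""
    return math.sqrt(max(chi2_model - df_model, 0.0) / (df_model * (n - 1)))

def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index: 1 minus the ratio of model to baseline noncentrality."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base

def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis index, based on chi-square/df ratios."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)

# HYPOTHETICAL inputs for illustration only; not values from the study.
CHI2_MODEL, DF_MODEL = 30.0, 10
CHI2_BASE, DF_BASE = 1000.0, 15
N = 2428  # ASSUMPTION: analysis sample size taken from the table captions

print(f"RMSEA = {rmsea(CHI2_MODEL, DF_MODEL, N):.3f}")
print(f"CFI   = {cfi(CHI2_MODEL, DF_MODEL, CHI2_BASE, DF_BASE):.3f}")
print(f"TLI   = {tli(CHI2_MODEL, DF_MODEL, CHI2_BASE, DF_BASE):.3f}")
```

Common rules of thumb treat an RMSEA below about 0.05 to 0.06 and CFI and TLI values above about 0.95 as indicating good fit, which is the sense in which the reported values support the model.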

Figure 1 Structural equation modeling for D-KAT scores. D-KAT = Diagnostic Error Knowledge Assessment Test.

DISCUSSION

It is unclear why Japanese residents scored lower on diagnostic error knowledge than US residents. We can infer, however, that graduate and postgraduate medical education on clinical reasoning in Japan has not reached a level similar to that in the US. Medical education in clinical clerkships in Japan has long been provided during subspecialty rotations that occur in the PGY years following the first 2 years. This long-held educational tradition has limited the opportunities to learn clinical reasoning, even during postgraduate clinical training, because the training period in the department of general medicine (GM) or general internal medicine (GIM) is short.6 Compared with subspecialty physicians, teachers in GIM or GM departments may be better able to provide their students and residents with specific clinical problem-solving knowledge, including knowledge of diagnostic error.

Our study suggests that self-study, the number of ED rotations, and inpatient caseload are all directly associated with GM-ITE scores, and are indirectly associated with D-KAT scores through the relationship between GM-ITE and D-KAT scores. A greater number of ED rotations, more minutes of self-study, and a larger inpatient caseload appeared to contribute to general improvement in clinical knowledge and skill. An appropriate caseload in both emergency and inpatient settings may provide residents with the experience needed to reflect on various cognitive biases, leading to improved clinical and diagnostic reasoning. In addition, the development of independent curricula, such as simulation-based training, reflective practice, or active metacognitive review, may be useful.9,10,11 Our SEM models allowed us to explore potential causal relationships between our variables, but because our data are cross-sectional, any causal interpretation remains speculative. Our findings should be validated in a larger longitudinal study.

In conclusion, diagnostic error knowledge among Japanese residents was poor compared with that among US residents. GM-ITE performance was related to scores on D-KAT. Diagnostic error knowledge was indirectly related to increased clinical experience and self-study. Resident diagnostic error performance may improve with greater opportunities for reflection on inpatient and emergency patient conditions and additional independent curricula.10,11