Introduction

Anterior cruciate ligament (ACL) tear is one of the most common injuries in the young and active population, with an estimated 20,000–30,000 tears and over 15,000 ACL reconstruction procedures performed per year in Japan [1]. Reconstruction surgery is necessary if patients wish to participate in sports without restriction and return to their pre-injury level. However, a recent meta-analysis of return-to-sport outcomes demonstrated that while 85% of patients returned to some form of sports participation after surgery, only 64% returned to their pre-injury level, and only 56% were able to return to competitive sports [2]. Patients may not return to their pre-injury sport or level after ACL reconstruction for a variety of reasons. A multicenter cohort study reported that 50% of its non-returners cited fear of re-injury as a reason for not returning [3]. The concept of ‘psychological readiness’ following ACL injury has been highlighted as an important factor in sports activity [4].

To assess psychological factors in patients with ACL injury, several measurement instruments have been developed and validated [5, 6]. Among these, the Tampa Scale for Kinesiophobia (TSK) has been used to evaluate fear of re-injury, pain, or movement in patients with ACL injury [7, 8]. Kinesiophobia has been defined as an excessive, irrational and debilitating fear of physical movement and activity resulting from a feeling of vulnerability to painful injury or re-injury [9]. A previous study found that patients who did not return to sports after ACL reconstruction surgery had higher mean TSK-11 scores than those who returned [7]. Although the original English version of the TSK was translated into Japanese in 2013 [10,11,12], the Japanese version has not been validated for patients with ACL injury.

Purpose

The purpose of this study was to evaluate the Japanese version of the TSK (TSK-J) in patients with ACL injury according to the COSMIN checklist.

Materials and methods

Participants

This prospective study was performed at Juntendo University Hospital from September 2016 to April 2017. Patients who met the following criteria were included: (1) diagnosed with ACL injury by physical examination and MRI, (2) able to understand Japanese, (3) completed the TSK-J, the IKDC Subjective Knee Form (IKDC-SKF), the JACL-25, the Visual Analog Scale for Sports (VAS-Sports; 0–100 mm) and the Patient Global Impression of Change (PGIC; 1–7 points), (4) aged between 16 and 65 years, and (5) no mental illness.

Evaluation of measurement properties

The TSK contains 17 items related to pain, fear of movement and re-injury. The total score ranges from 17 to 68, with higher scores indicating greater pain, fear of movement and re-injury [13]. Translation and cultural adaptation of the Japanese TSK were performed according to the Principles of Good Practice approach, which allows different ways of achieving the same goal at each step of the translation process [11, 14].

The IKDC-SKF was developed to assess knee conditions, including symptoms, function, and sports activities, in patients with a variety of knee problems [15]. It consists of 19 items with a total score ranging from 0 to 100, and a higher score indicates fewer symptoms, better function, and a higher level of sports activity. It has been widely used to assess physical factors after ACL reconstruction surgery in many countries [16, 17].

The JACL-25 was developed and validated to assess fear of motion during daily activity and sports participation in patients with ACL injuries [6]. It contains 25 items with a total score ranging from 0 to 100, and a higher score indicates a worse condition. Each item in the JACL-25 is defined in terms of a specific life experience (a knee instability condition) that may frequently occur in patients with ACL injury.

The PGIC captures the patient’s subjective judgment of the meaning of change (improvement) following treatment. It is answered on a 7-point scale: 1 = very much worse; 2 = much worse; 3 = minimally worse; 4 = no change; 5 = minimally improved; 6 = much improved; 7 = very much improved.

Measurement properties

We evaluated the reliability, validity, responsiveness, and interpretability of the TSK-J according to the COSMIN guidelines. In addition, the quality of the TSK-J was evaluated against the current updated criteria for good measurement properties [18].

Reliability

This domain contains three measurement properties: internal consistency, test-retest reliability, and measurement error [19]. Internal consistency is a measure of scale reliability that evaluates how closely related a set of items is as a group. Test-retest reliability is the closeness of agreement between the results of successive measurements of the same measurand carried out under the same conditions of measurement. To avoid testing patients whose condition was unstable and to limit recall bias, the retest was performed 4 weeks after the first test. Finally, the measurement error was expressed as the standard error of measurement (SEM).
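Internal consistency of a multi-item scale is commonly summarized with Cronbach’s alpha. The following is a minimal Python sketch of that calculation, not the analysis code of this study (which used R); the function name `cronbach_alpha` and the small response matrix are hypothetical and serve only as an illustration.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                       # number of items
    items = list(zip(*responses))               # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)       # sum of item variances
    total_var = variance([sum(row) for row in responses])    # variance of total scores
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical data: five respondents answering three 4-point items
demo = [
    [1, 2, 2],
    [2, 2, 3],
    [3, 3, 3],
    [3, 4, 4],
    [4, 4, 4],
]
alpha = cronbach_alpha(demo)
print(f"alpha = {alpha:.2f}, sufficient = {alpha >= 0.70}")
```

The sample variance (n − 1 denominator) is used for both the item and total-score variances, so the ratio does not depend on that choice as long as it is applied consistently.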

Validity

This domain also contains three measurement properties: content validity, criterion validity and construct validity [20]. Content validity mainly examines the measurement aim, the target population and the concepts of the questionnaire. It should enable researchers or clinicians to select items relevant to the target population. Criterion validity was evaluated by calculating the correlation between the TSK-J and the IKDC-SKF, which is widely used as a “gold standard” instrument. Construct validity was assessed by testing predefined specific hypotheses, that is, by determining how many results were in accordance with the following predefined hypotheses.

  1. The TSK-J scores will have a strong positive correlation with the JACL-25 scores.

  2. Patients who rate the PGIC as “improved” (minimally improved, much improved, or very much improved) will have a lower mean TSK-J score than those who rate it as “no change”.

  3. The TSK-J scores will have a strong negative correlation with the VAS-Sports scores.

  4. The TSK-J scores will demonstrate a strong negative correlation with follow-up time after surgical treatment.

It has been suggested that hypotheses should be specified in advance and that at least 75% of the results should correspond to these hypotheses. The strength of a correlation was graded by the magnitude of the coefficient in five levels: very strong (r = 0.80 to 1.00), strong (r = 0.60 to 0.79), moderate (r = 0.40 to 0.59), weak (r = 0.20 to 0.39), and very weak (r = 0.00 to 0.19).
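The grading of correlation strength and the 75% rule for construct validity can be sketched as two small helper functions. This is an illustration only: the function names are hypothetical, the bands are assumed to apply to the absolute value of r (so that negative correlations are graded by magnitude), and the gaps between the published bands (for example between 0.79 and 0.80) are closed by treating each lower bound as inclusive.

```python
def correlation_strength(r):
    """Grade |r| using the five bands given in the text."""
    a = abs(r)
    if a >= 0.80:
        return "very strong"
    if a >= 0.60:
        return "strong"
    if a >= 0.40:
        return "moderate"
    if a >= 0.20:
        return "weak"
    return "very weak"

def construct_validity_sufficient(confirmed, total, threshold=0.75):
    """True if at least 75% of the predefined hypotheses were confirmed."""
    return confirmed / total >= threshold

# Hypothetical inputs
print(correlation_strength(-0.65))
print(construct_validity_sufficient(confirmed=3, total=4))
```

With four hypotheses, the 75% threshold means that at least three must be confirmed for construct validity to be rated sufficient.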

Responsiveness

Responsiveness has been defined as the ability of a questionnaire to detect clinically important change over time in the construct to be measured [19]. We calculated the change in TSK-J scores between baseline (pre-surgery) and post-surgery, with a time interval of 10 weeks.
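One way to examine responsiveness with change scores is to ask how well they separate patients who report improvement from those who report no change, which is what an ROC analysis summarizes as the AUC. The sketch below is a hypothetical illustration in Python rather than the analysis code of this study: it computes change scores and the AUC in its rank-based (Mann-Whitney) form, assuming that a larger reduction in score indicates improvement. All names and data are invented for the example.

```python
def change_scores(baseline, follow_up):
    """Follow-up minus baseline; negative values mean the score decreased."""
    return [f - b for b, f in zip(baseline, follow_up)]

def auc(improved, unchanged):
    """Probability that a randomly chosen 'improved' patient shows a larger
    score reduction than a randomly chosen 'no change' patient.
    Tied pairs count as one half."""
    wins = 0.0
    for i in improved:
        for u in unchanged:
            if i < u:          # more negative change = larger reduction
                wins += 1.0
            elif i == u:
                wins += 0.5
    return wins / (len(improved) * len(unchanged))

# Hypothetical scores for two anchor groups
improved = change_scores([40, 38, 42, 36], [36, 37, 35, 36])
unchanged = change_scores([39, 41, 37], [39, 40, 38])
result = auc(improved, unchanged)
print(f"AUC = {result:.2f}, sufficient = {result >= 0.70}")
```

An AUC of 0.5 corresponds to no discrimination between the two groups, which is why values close to 0.5 are read as poor responsiveness.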

Interpretability

A floor or ceiling effect was considered present if more than 15% of participants reported the minimum or maximum score, respectively. In addition, the smallest detectable change was calculated for individuals (SDCind) and for groups (SDCgro). Following an anchor-based approach, the within-group minimal important change (MIC) was measured as the mean score change of patients who rated the PGIC as “minimally improved” at the repeat measurement (14 weeks). Additionally, the minimal important difference (MID) was calculated as the difference in mean score change between the “minimally improved” and “no change” groups [21].
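The interpretability definitions above can be sketched in a few lines of Python. This is a hedged illustration, not the study’s code: the function names and all data are hypothetical, the score limits of 17 and 68 are those stated for the 17-item TSK, and the MID is assumed to be the “minimally improved” mean change minus the “no change” mean change.

```python
from statistics import mean

TSK_MIN, TSK_MAX = 17, 68   # score range stated for the 17-item TSK

def floor_ceiling(scores, limit=0.15):
    """Return (floor_effect, ceiling_effect) using the 15% criterion."""
    n = len(scores)
    floor_share = sum(s == TSK_MIN for s in scores) / n
    ceiling_share = sum(s == TSK_MAX for s in scores) / n
    return floor_share > limit, ceiling_share > limit

def mic_and_mid(changes_by_anchor):
    """Anchor-based MIC and MID from change scores grouped by PGIC answer."""
    mic = mean(changes_by_anchor["minimally improved"])
    mid = mic - mean(changes_by_anchor["no change"])
    return mic, mid

# Hypothetical total scores and change scores
scores = [17, 30, 41, 38, 45, 52, 36, 40, 44, 39]
changes = {
    "minimally improved": [-2, -1, 0, 0, -1],
    "no change": [1, 0, 1, 0],
}
effects = floor_ceiling(scores)
mic, mid = mic_and_mid(changes)
print(effects, round(mic, 1), round(mid, 1))
```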

Statistical analysis

All analyses were performed with RStudio software (RStudio, Inc., Boston, MA, USA).

Good measurement properties were defined using the updated COSMIN criteria [18]. Internal consistency was calculated with Cronbach’s alpha and deemed sufficient if alpha was ≥0.70. Test-retest reliability was calculated using the intra-class correlation coefficient (ICC2,1), with a value greater than 0.70 taken as the minimum standard for reliability. The measurement error was calculated using analysis of variance (ANOVA). Pearson or Spearman correlation coefficients were used to calculate the correlation with the gold standard and to analyze the discriminative hypotheses. Correlation with the gold standard was considered sufficient if r ≥ 0.70. Responsiveness was evaluated with both Cohen’s d and the area under the receiver operating characteristic (ROC) curve (AUC), with an AUC of at least 0.70 considered sufficient. Finally, the SDCind and SDCgro were calculated as SDCind = 1.96 × √2 × SEM and SDCgro = SDCind / √n, respectively [22].
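The text states only that the ICC2,1 was used and that the measurement error was derived from an ANOVA. The sketch below shows one way these quantities can be obtained from a two-way ANOVA of test-retest data, under two assumptions that are not confirmed by the source: the ICC follows the two-way random-effects, single-measure, absolute-agreement form of Shrout and Fleiss, and the SEM is the “agreement” type (square root of the session variance plus the residual variance). Function names and data are hypothetical.

```python
import math

def two_way_anova(scores):
    """scores: one row per patient, one column per session (test, retest).
    Returns n, k and the mean squares for patients, sessions and residual."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    return (n, k,
            ss_rows / (n - 1),
            ss_cols / (k - 1),
            ss_error / ((n - 1) * (k - 1)))

def icc_2_1(scores):
    """ICC(2,1): two-way random effects, single measure, absolute agreement."""
    n, k, msr, msc, mse = two_way_anova(scores)
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def sem_agreement(scores):
    """SEM as sqrt(session variance + residual variance) (assumed definition)."""
    n, k, msr, msc, mse = two_way_anova(scores)
    session_var = max(0.0, (msc - mse) / n)   # negative estimates are set to 0
    return math.sqrt(session_var + mse)

# Hypothetical test and retest totals for five patients
retest = [[30, 33], [40, 41], [35, 36], [45, 48], [50, 50]]
icc = icc_2_1(retest)
sem = sem_agreement(retest)
print(f"ICC(2,1) = {icc:.3f}, SEM = {sem:.2f}")
```

The resulting SEM can then be inserted into the SDC formulas given above.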

Results

Of 255 patients, 222 were included in this study. Thirty-three patients were excluded for the following reasons: did not complete one or more of the questionnaires (n = 18), did not answer more than two items of the IKDC-SKF (n = 7), or were under 16 years old (n = 8). The demographics of the participants are presented in Table 1.

Table 1 Demographics of participants

Missing data

The 222 patients completed the questionnaires a total of 350 times in this study. Of these patients, 8 did not answer one or more items of the TSK-J. Missing data amounted to 0.47% of the responses to the 17 TSK-J items.

Reliability

The results for internal consistency, test-retest reliability and measurement error of the TSK-J are listed in Table 2.

Table 2 Measurement properties of reliability

Internal consistency of the TSK-J was good, with the Cronbach’s alpha (95% CI) of 0.79 (0.76 to 0.83).

Test-retest reliability was also excellent, with an ICC2,1 (95% CI) of 0.90 (0.81 to 0.95) (n = 43; retest interval, mean ± SD = 28.77 ± 8.3 days).

The measurement error of the TSK-J, expressed as the SEM, was 2.75.

Validity

Content validity

The content validity of the TSK-J is presented in Table 3.

Table 3 Content validity of the TSK-J

Criterion validity

The criterion validity of the TSK-J against the IKDC-SKF showed a moderate but significant correlation (r = −0.49, P < 0.001; Table 4).

Table 4 Correlation between the TSK-J and other outcomes (n = 222)

Construct validity

  1. The TSK-J had a moderate positive correlation with the JACL-25 (r = 0.48) (Table 4).

  2. Patients who rated the PGIC as “improved” (minimally improved, much improved, or very much improved) had a lower mean TSK-J score than those who rated it as “no change” (“improved” = −0.7, “no change” = 0.5; Table 5).

  3. The TSK-J had a moderate negative correlation with the VAS-Sports (r = −0.48) (Table 4).

  4. The TSK-J scores showed almost no change up to about 400 days after ACL reconstruction surgery (r = −0.12) (Fig. 1, Table 5).

Table 5 Responsiveness

Fig. 1 Correlation between the TSK-J and follow-up time. Almost no change in the TSK-J was found over the follow-up period (r = −0.12)

Responsiveness

The effect size (ES; Cohen’s d) was −0.2 (small effect size), and the correlation between the TSK-J and follow-up time was −0.12 (Table 5, Fig. 1). The AUC for the TSK-J was 0.54 (Fig. 2, Table 5) and was not statistically significant (P > 0.05).

Fig. 2 ROC curve between “no change” and “improved” (n = 72). ROC, receiver operating characteristic. The area under the curve (AUC) between “no change” and “improved” demonstrated a failing level of accuracy (AUC = 0.54)

Interpretability

There were no floor or ceiling effects in the TSK-J scale. The SDC for the TSK-J scale was 7.6 for individuals, and 1.2 for groups. The MIC and MID were − 0.8 and − 1.3, respectively (Table 6).
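As an arithmetic check, the reported SDC values follow from the formulas SDCind = 1.96 × √2 × SEM and SDCgro = SDCind / √n when the reported SEM of 2.75 is used. The Python sketch below assumes that n is the test-retest subsample (n = 43); the text does not state which n was used for the group-level value, so this is an assumption, and the function names are hypothetical.

```python
import math

def sdc_individual(sem):
    """Smallest detectable change for an individual patient."""
    return 1.96 * math.sqrt(2) * sem

def sdc_group(sdc_ind, n):
    """Smallest detectable change for a group of n patients."""
    return sdc_ind / math.sqrt(n)

sem = 2.75   # SEM reported for the TSK-J
n = 43       # assumption: size of the test-retest subsample

sdc_ind = sdc_individual(sem)
sdc_gro = sdc_group(sdc_ind, n)
print(round(sdc_ind, 1), round(sdc_gro, 1))
```

Under this assumption the rounded values agree with the reported 7.6 and 1.2. Both are considerably larger in magnitude than the MIC of −0.8, so a change of that size cannot be distinguished from measurement error in an individual patient.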

Table 6 Interpretability

Discussion

This is the first study to assess the validity, reliability, and responsiveness of the TSK for patients with ACL injury according to the COSMIN checklist. Internal consistency and test-retest reliability were both good. In the validity domain, the content validity was interpretable, and the criterion validity between the TSK-J and the IKDC-SKF showed a moderate correlation coefficient, which was lower than that between the IKDC-SKF and the JACL-25 (Table 4). Only one of the four construct validity hypotheses (No. 2) was confirmed. Furthermore, the responsiveness of the TSK-J was rated low, with a very weak time-dependent change (Fig. 1). The MID for the TSK-J was −1.3, which means that a change as large as 1.3 points may be important for patients. There were no floor or ceiling effects.

According to a previous validation study, only the internal consistency of the TSK-11 has been validated for patients with ACL injury [13]; the other measurement properties, including validity and responsiveness, were unknown for both the TSK and the TSK-11. In the present study, responsiveness was rated low, based on the change in TSK-J scores between pre-surgery and post-surgery over an interval of 10 weeks (ES, Cohen’s d = 0.2) and on the very weak correlation with time during the first year after surgery (r = −0.12). Compared with the IKDC-SKF and the JACL-25, the TSK-J was rated insufficient for both validity and responsiveness (Tables 4 and 5). However, one study reported that TSK-11 scores after ACL reconstruction surgery continued to decrease through 12 weeks and were significantly different from baseline [8]. Another study reported that the group that did not return to sports had higher mean TSK-11 scores than the group that returned, at both 6 months and 1 year after ACL reconstruction [7].

To find out which factors led to this gap, we also calculated the correlation between each item score and the TSK total score (Table 7) and found that only 2 of 17 items had good correlations (r > 0.7), while 5 items had correlations lower than 0.5. This result indicates that items 4, 5, 8, 16, and 17 might not suit patients with ACL injury because of their low correlation, which could have affected the validity and responsiveness results. As another factor, we speculate that cultural differences between Asians and Westerners may lead to different results when psychological factors are tested. It has been argued that there are significant psychological differences between East Asians and Westerners that are rooted in long-standing differences between East Asian and Western civilisations [26]. Attentional differences have further been presented as an important factor contributing to cultural differences between Japanese and Americans in higher cognitive mechanisms [27]. Japanese respondents also have characteristic ways of answering questions. For example, they tend not to reject someone or something directly, but to respond softly or reject indirectly. We noticed that patients in this study tended to choose the middle answers (“disagree” or “agree”) rather than the extreme answers (“strongly disagree” or “strongly agree”).

Table 7 Correlation between each item (n = 222)

Further study is needed to revise or adapt the content of some TSK-J items to obtain a more appropriate scale (by removing weakly correlated items or editing them to correlate better), which may help Japanese clinicians assess kinesiophobia more accurately.

Psychological factors have been significantly associated with returning to the preinjury activity. Several questionnaires have been applied to evaluate psychological readiness in patients after ACL surgery. One of the common scales, the ACL-RSI, has been translated and validated in many countries to evaluate psychological readiness to resume sport after ACL reconstruction [28,29,30,31]. However, the Japanese version of the ACL-RSI has not been validated, and the scale does not cover fear of pain during movement. Therefore, we did not use it in this study. The TSK has also been used to evaluate kinesiophobia in patients with ACL injury.

This study has a limitation. The sample sizes for some measurement properties (e.g., ICC, SEM, and MID; n < 50) may not have been sufficient, since the criteria for measurement properties require a sample of at least 50 patients for a positive rating to be considered [32].

Conclusion

The TSK-J has good reliability for assessing patients with ACL injury. However, its low validity and responsiveness indicate that it is not the best patient-reported outcome measure for psychological factors for patients with ACL injury.