Introduction

Integration of palliative care in oncology is recommended by the European Society for Medical Oncology (ESMO) and the American Society of Clinical Oncology (ASCO), because oncological palliative care enhances quality of life (QoL) and may also positively influence the course of illness [1]. In their landmark paper, Temel et al. showed that early palliative care leads to significant improvements in both health-related quality of life (HRQoL) and mood [2]. For high-quality oncological palliative care in advanced cancer patients it is essential to monitor HRQoL in clinical practice in a suitable manner [3]. HRQoL generally consists of four domains: physical well-being, psychological well-being, social well-being, and spiritual well-being. Spiritual well-being is especially important in patients with advanced cancer because of the confrontation with death [4,5,6,7,8,9,10,11,12]. Monitoring symptoms and HRQoL is extremely important in advanced cancer care because it increases awareness among health care professionals, helping them to better anticipate patients’ changing needs [13, 14], and improves clinical outcomes (i.e. fewer emergency room visits, fewer hospitalizations, a longer duration of palliative chemotherapy, and superior quality-adjusted survival), as recently demonstrated by Basch et al. [14].

The best method to monitor HRQoL is to ask patients themselves, as asking health professionals or relatives is considered a less accurate method for estimating a patient’s HRQoL [15]. Besides its clinical benefits, inclusion of patient-reported outcome measures (PROMs) in routine clinical practice is also associated with improved discussion of patient outcomes during consultations and with improved patient satisfaction [16,17,18]. However, the implementation of PROMs in routine practice is challenging because information regarding the psychometric quality of measurement instruments is fragmented and standardization is lacking [19].

Earlier reviews have identified a variety of HRQoL measurement instruments that were appropriate for use in oncological palliative care [20,21,22,23,24,25,26,27]. However, none of these reviews could serve as a guide for an adequate and comprehensive choice of a measurement instrument for routine clinical practice because none used explicit criteria to assess measurement properties. For this reason, in 2010 Albers et al. [28] made an inventory of available HRQoL measurement instruments that were suitable for use in palliative care and assessed the quality of these instruments. Their review identified 29 different measurement instruments and showed a wide variety in measurement aim, content, target population, method (e.g. interview, questionnaire), completion time/length, and clinimetric quality [28]. In the last six years, a growing body of research has been published on the quality of existing HRQoL measurement instruments, and the development of new instruments is ongoing. It remains unclear which PROMs are currently most suitable for advanced cancer patients receiving oncological palliative care.

Because the measurement of HRQoL in advanced cancer patients is a rapidly evolving field and the importance of PROMs in clinical practice is growing, an updated review of HRQoL measurement instruments seems appropriate. The aim of this study is to evaluate the quality of self-administered instruments measuring HRQoL of patients with advanced cancer for use in current oncological palliative care. The methodological quality of the measurement instruments is described in terms of measurement properties and measurement quality. This review aims to contribute to more clarity regarding the availability and quality of self-administered HRQoL measurement instruments for patients with advanced cancer and to support health care professionals in selecting suitable PROMs for advanced cancer patients in clinical practice.

Methods

Search strategy

An electronic search of the databases PubMed, Embase, PsycInfo, and CINAHL was performed to identify papers about instruments to measure HRQoL in advanced cancer patients that were published in English or Dutch between January 1990 and September 2016. Non-validation studies (article type) were excluded. The search strategy was based on that of Albers et al. [28] and was built around three concepts to find studies on HRQoL measurement instruments in oncological palliative care: ‘palliative’, ‘instruments’, and ‘QoL’. A detailed description of the MeSH-terms and keywords used in the search can be found in Supplement 1. The search string was initially developed in PubMed and later adapted for the other databases. Additionally, all Validation Studies (article type) of the 29 HRQoL measurement instruments identified in the review of Albers et al. [28] were added. Finally, the reference lists of selected articles were screened to retrieve relevant publications that had not been found in the computerized search.

Study selection process

Two reviewers (NR and HF) used a stepwise procedure to identify relevant studies. Firstly, the titles and abstracts of all papers were assessed for relevance by one of the reviewers (NR) to see whether the study described the development or validation of a measurement instrument and whether the study involved (at least two domains of) HRQoL as outcome measurement. Irrelevant titles were excluded. Secondly, abstracts were screened by two reviewers (NR and HF) on the following inclusion criteria: (i) the study concerned the development or validation of a self-administered measurement instrument; (ii) non-primary tumour-specific HRQoL (and at least two of its domains) was a primary or secondary objective of the study; (iii) the target population of the study included adult patients (i.e. ≥ 18 years old) with advanced or metastatic cancer; (iv) the measurement instrument used in the study was provided in Dutch or English; (v) the report was available as full text in English or Dutch. Consensus regarding exclusion based on these inclusion criteria was reached in a consensus meeting. For all studies that did not pass the selection process, the reasons for exclusion were listed. Full-text papers were also assessed on the above-mentioned criteria, and conference abstracts were excluded.
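The five inclusion criteria can be read as a simple checklist applied to each abstract. The following is a minimal sketch of such a checklist, not the reviewers' actual procedure: the class `ScreenedStudy`, its field names, and the function `unmet_criteria` are hypothetical, and criterion (ii) is simplified to "not tied to one primary tumour site and at least two HRQoL domains".

```python
from dataclasses import dataclass

ACCEPTED_LANGUAGES = {"English", "Dutch"}


@dataclass
class ScreenedStudy:
    """Hypothetical summary of one abstract; not the authors' extraction sheet."""
    self_administered_validation: bool  # (i) develops/validates a self-administered instrument
    tumour_site_specific: bool          # (ii) instrument tied to one primary tumour site
    hrqol_domains_covered: int          # (ii) number of HRQoL domains addressed
    adults_with_advanced_cancer: bool   # (iii) adults with advanced or metastatic cancer
    instrument_language: str            # (iv) language of the instrument
    report_language: str                # (v) language of the report
    full_text_available: bool           # (v) full text could be retrieved


def unmet_criteria(study: ScreenedStudy) -> list[str]:
    """Return the labels of the inclusion criteria the study fails (empty list = eligible)."""
    unmet = []
    if not study.self_administered_validation:
        unmet.append("i")
    if study.tumour_site_specific or study.hrqol_domains_covered < 2:
        unmet.append("ii")
    if not study.adults_with_advanced_cancer:
        unmet.append("iii")
    if study.instrument_language not in ACCEPTED_LANGUAGES:
        unmet.append("iv")
    if not study.full_text_available or study.report_language not in ACCEPTED_LANGUAGES:
        unmet.append("v")
    return unmet


# A unidimensional instrument fails criterion (ii) only.
example = ScreenedStudy(True, False, 1, True, "English", "English", True)
print(unmet_criteria(example))  # ['ii']
```

Returning the list of unmet criteria, rather than a single yes/no, mirrors the practice of listing the reason for exclusion for every study that did not pass.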

Data extraction procedure

Two reviewers (NR and JvR) independently reviewed five randomly selected papers using a standard data extraction sheet and compared results to evaluate uniformity. Then, all papers were divided between the two researchers (NR and JvR) for data extraction. The procedure to confirm uniformity was repeated three times during the data extraction phase.

The methodological quality of included validation studies was assessed using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist devised by Mokkink et al. [29]. Supplement 2 gives an overview and a description of the criteria used to assess quality. The assessment of the methodological quality of studies on measurement properties of health status measurement instruments covers nine topics: internal consistency, reliability, measurement error, content validity, three aspects of construct validity (structural validity, hypotheses testing, and cross-cultural validity), criterion validity, and responsiveness. The methodological quality of the selected publications was assessed by two researchers (NR and JvR). Uniformity of the quality assessment was checked in the same manner as described above for data extraction.

Results

Selection of papers

A flowchart of the selection process is presented in Fig. 1. In total, 4088 articles were identified from the different electronic databases, excluding duplicates. Initially, 3854 papers were excluded based on screening of relevance of title and abstract. The abstracts of the remaining 234 articles were assessed in depth for eligibility by two researchers (NR and HF). Finally, 126 studies were suitable for full-text assessment. During full-text assessment, 37 studies were excluded. A number of studies (n = 11) were excluded because no full text was available after multiple attempts to retrieve the paper by contacting the author via ResearchGate or email. Of these 11 papers, three were published more than 10 years ago, six were published in low-impact journals (impact factor < 2), which were often less accessible, and two were untraceable. Other papers were excluded because they were a congress abstract (n = 14), the measurement instrument used in the study was in a language other than Dutch or English (n = 2), it was a duplicate (n = 4), it was not a self-administered measurement instrument (n = 4), it was not a measurement instrument (n = 2), or the measurement instrument was unidimensional or disease specific (n = 29). After checking reference lists of the selected articles, nine additional articles were identified. In total, 69 papers were included in this systematic review.
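The counts in this paragraph can be tallied step by step. The sketch below uses only the numbers reported in the text and assumes that each excluded paper is counted under exactly one reason; the variable names are illustrative. On that assumption the itemised reasons sum to 66 (37 before the unidimensional or disease-specific category is added), and 126 − 66 + 9 matches the 69 included papers.

```python
# Counts as reported in the selection paragraph.
identified = 4088
excluded_on_title_abstract = 3854
full_text_assessed = 126
added_from_reference_lists = 9

# Itemised reasons for exclusion at the full-text stage.
full_text_exclusions = {
    "no full text available": 11,
    "congress abstract": 14,
    "instrument not in Dutch or English": 2,
    "duplicate": 4,
    "not self-administered": 4,
    "not a measurement instrument": 2,
    "unidimensional or disease specific": 29,
}

abstracts_assessed = identified - excluded_on_title_abstract
excluded_full_text = sum(full_text_exclusions.values())
included = full_text_assessed - excluded_full_text + added_from_reference_lists

print(abstracts_assessed)  # 234
print(excluded_full_text)  # 66
print(included)            # 69
```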

Fig. 1 Flowchart of the study process

Study characteristics

The selected studies had between 10 and 3282 participants (21,077 participants in total), of whom 22–99% were men. Across studies, the average age of participants ranged from 51 to 79 years. Twenty percent of the studies included palliative patients suffering from various life-threatening illnesses (e.g. heart failure, end-stage lung disease, advanced renal disease, late-stage Parkinson’s disease, cancer), with the majority suffering from advanced cancer. The other studies focussed on cancer patients only: most studies (67%) included a mixed cancer population (i.e. various primary cancer sites), and the remaining studies (13%) selected one specific primary cancer site: 4% patients with lung cancer, 3% women with breast cancer, 3% patients with brain tumours, 1% men with prostate cancer, and 1% patients with colorectal cancer.

Health-related quality of life measurement instruments

Table 1 gives an overview of all measurement instruments included in this review, including the full names behind the acronyms used. Across studies, 39 measurement instruments were identified. Instruments were originally developed between 1972 (General Health Questionnaire-12, GHQ-12) and 2013 [European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Questionnaire (QLQ)-Social Well-being 36]. The EORTC QLQ Core 30 (EORTC QLQ-C30) was the most frequently studied instrument: ten studies (14%) validated this measure and seven studies (10%) administered a module of the EORTC (i.e. QLQ-Bone Metastases module 22 (QLQ-BM22), QLQ-Brain module 20 (QLQ-BN20), QLQ-Oral Health 17 (QLQ-OH17), and QLQ-SWB36). Nine studies (13%) validated the Edmonton Symptom Assessment Scale (ESAS) (or a modified or revised version of the ESAS), seven studies (10%) used the McGill Quality of Life Questionnaire (MQOL) (or the revised version), and four studies (6%) validated the Palliative care Outcome Scale (POS). The majority of the measurement instruments (58%) were reported to measure (HR)QoL, and eight instruments (11%) were concerned with symptom assessment or the impact of symptoms on daily functioning. The other measurement instruments were reported to assess spiritual well-being or spiritual distress (14%), psychological disorders or depressive symptoms (5%), core concerns and palliative needs (2%), or parenting concerns for adults with cancer (2%).

Table 1 Overview of the included measurement instruments

The number of items the measurement instruments contained ranged between one [Minimal Documentation System (MIDOS) and Quality of Life in Life-Threatening Illness-Patient version (QOLLTI-P)] and 106 [Resident Assessment Instrument for Palliative Care (RAI-PC)]. The scoring of the measurement instruments was most often calculated as a total score and a subscale score (44%) or merely subscale scores (19%) or only a total score (14%). Other measurement instruments used single-item scores (5%), or a combination of single (visual analogue scale) items, subscale, and a total score (12%). One measurement instrument (2%) used content analysis to analyse responses.

Eight measurement instruments (19%) focused on the general population or patients in general, nine (21%) were targeted at palliative patients, nine (21%) at patients with cancer, and eight (19%) specifically at patients with advanced cancer. The target population of four measurement instruments (9%) was patients with brain tumours or brain metastases specifically. The remaining measurement instruments (12%) focused on bone or spinal metastases, chest malignancies in cancer patients, and anorexia or cachexia. The most common recall time was one week (33% of the measurement instruments), and 14% had no recall time. Others used a recall time of three days (7%), two weeks (2%), one month (2%), or one day (2%). The completion time was reported for seven measurement instruments (16%) and ranged from 3 min [Patient-Evaluated Problem Scores (PEPS)] to 30 min (MQOL).

Measurement properties

None of the measurement instruments were adequately assessed for all measurement properties (Table 2). Information about the content validity (94%) was most often reported and in most cases adequate (58%). Information on the construct validity was reported by the majority of the studies (70%). However, compared to other measurement properties, the construct validity was most often inadequately tested (30%). Furthermore, information about the absolute measurement error, responsiveness, and interpretability was often incomplete (6, 22, and 51% respectively) or completely missing (88, 74, and 46% respectively).

Table 2 Rating of measurement properties of the instruments

Considering the available information on measurement properties, the EORTC QLQ Core 15 palliative questionnaire (QLQ-C15-PAL) showed the best results. The EORTC QLQ-C15-PAL showed good content and construct validity, and the absolute measurement error and interpretability were also good. Other measurement properties had not been tested for the EORTC QLQ-C15-PAL. Similarly, the EORTC QLQ-BM22 appeared to have adequate psychometric properties: it appeared to have good content and construct validity, and the measurement instrument is reliable and responsive.

The ESAS showed good content validity, and the absolute measurement error and interpretability were good. However, information was lacking on other measurement properties. Other measurement instruments that had reasonable psychometric properties were the Assessment of Quality of life at the End of Life (AQOL), Quality of life at the End of life (QUAL-EC), and the Spiritual Attitude and Involvement List (SAIL). They had good content and construct validity and good internal consistency, but information on other measurement properties was incomplete or missing.

The EORTC QLQ-C30 had been the subject of more validation studies than any other instrument, but these studies did not adequately evaluate some fundamental psychometric properties. The content validity, construct validity, and absolute measurement error of the EORTC QLQ-C30 were good. Evidence on other psychometric characteristics of the EORTC QLQ-C30 was unclear.

The POS, QUAL-E, and MQOL were also tested by multiple studies. The POS had good content validity and construct validity, but the internal consistency was inadequate. Information on other measurement properties was incomplete or missing. The QUAL-E showed good content validity and construct validity. However, the internal consistency and reliability were inadequately tested and information on other measurement properties was incomplete. The revised version of the QUAL-E (QUAL-EC) showed improved measurement properties. The MQOL had adequate content validity, but inadequate construct validity. There was conflicting evidence regarding the internal consistency of the MQOL, and other measurement properties were inadequately tested.

Two studies agreed that the Hospice Quality of Life Index (HQLI) had inadequate construct validity. Results about its content validity were inconsistent, its internal consistency was good, and other psychometric information was lacking. For the EORTC QLQ-SWB36 and the QOLLTI-P, information on any of the measurement properties was absent. Other measurement instruments such as the EORTC QLQ-BN20, EQ-5D, Functional Assessment of Chronic Illness Therapy (FACIT-G), MIDOS, GHQ-12, Hospital Anxiety and Depression Scale (HADS), Rotterdam Symptom Checklist (RSCL), PEPS, Memorial Symptom Assessment Scale-Short Form (MSAS-SF), and the RAI-PC were inadequately assessed because information on the measurement properties was incomplete or missing.

Discussion

Our systematic literature review identified 39 self-administered instruments measuring HRQoL mainly in patients with advanced cancer. None of the included studies reported sufficient information on psychometric properties of these measurement instruments according to the COSMIN criteria. Surprisingly, even basic psychometric properties such as construct validity and reliability were often inadequately tested. It appears that selecting an appropriate comparison instrument for testing construct validity and formulating specific hypotheses can be challenging. Furthermore, our findings show that adequate testing of responsiveness was not a priority in previous studies. Because PROMs are often used in clinical practice to monitor symptoms over time, it is of great importance that a measurement instrument is responsive to change. Despite incomplete information in the included studies, results of this review indicate that the EORTC QLQ-C15-PAL is an adequate instrument to measure HRQoL in patients with advanced cancer. The EORTC QLQ-BM22, a module for patients with bone metastases, also appears to be suitable in this patient population. Because the EORTC QLQ-BM22 is a module, it should be administered together with the EORTC QLQ-C30. Consequently, this combination is more extensive than the EORTC QLQ-C15-PAL. The length of a measurement instrument should be taken into account because there is little time for administration in clinical practice and a lower burden can foster compliance [99].

Owing to medical advances, cancer is increasingly perceived as a chronic illness. Longer survival extends the palliative phase, and there is increasing awareness of the need to identify the palliative phase at an earlier stage, when patients are still relatively fit. The EORTC QLQ-C15-PAL may not be appropriate at the beginning of the palliative phase because of its focus on symptoms at the end of life. When the EORTC QLQ-C15-PAL is administered to relatively healthy patients, a patient’s actual HRQoL may be lower than the EORTC QLQ-C15-PAL scores indicate, and the EORTC QLQ-C30 will provide a more accurate reflection of that patient’s HRQoL. The EORTC QLQ-C30 is the most commonly used disease-specific measure world-wide [100] and has been used in more than 3000 studies [101]. The routine use of the EORTC QLQ-C30 in clinical practice appears to improve physician–patient communication and HRQoL [102], but its implementation has challenges (e.g. timing, frequency, interpretation of scores by health care professionals, and the absence of thresholds for clinical importance) [103]. Surprisingly, the present review showed that the psychometric quality of this measurement instrument has been examined many times but not adequately in patients with advanced cancer. Therefore, a thorough validation of the internal consistency, reliability, responsiveness, and interpretability of the EORTC QLQ-C30 in advanced cancer patients is advocated.

Another consideration regarding the reviewed HRQoL measurement instruments is that many of the instruments did not measure all aspects of HRQoL, even though measurement instruments that addressed only one domain of HRQoL were excluded from our study. The spiritual domain is especially important at the end of life, but this domain was not often included in existing measurement instruments [28]. For instance, the EORTC QLQ-C15-PAL did not include certain topics that appear to be relevant for patients at the end of life: Quality of care, Preparation for death, Spirituality or Transcendence [78, 90, 104,105,106,107]. The EORTC QLQ-C15-PAL was derived from the EORTC QLQ-C30, and its authors confirmed that health care professionals and some patients mentioned existential or spiritual issues as important additional topics for the measurement instrument. Therefore, the authors suggested that the EORTC QLQ-C15-PAL be supplemented with single items, modules, or questionnaires regarding spirituality when deemed necessary. This suggestion is especially valuable for clinical practice, where the spirituality domain is not easily assessed in a regular doctor’s appointment and many oncologists have not received specific training in palliative care.

Practical implications

For clinical practice it is important to check whether the selected instrument represents the latent construct being measured at the time of measurement, and to take the objective of the measurement into account when selecting an instrument. For instance, when interested in change over time one could argue that the EORTC QLQ-C15-PAL is less sensitive than the EORTC QLQ-C30 because it uses fewer items. However, sensitivity to change may also be improved by eliminating items that poorly represent the construct they were designed to measure [108]. In other words, improving measurement precision will increase sensitivity. Therefore, the EORTC QLQ-C15-PAL may actually be more sensitive to change over time when measuring HRQoL specifically at the end of life. However, because the EORTC QLQ-C15-PAL does not include items on spirituality, the latent construct of HRQoL at the end of life is not fully measured. This reduces the sensitivity of the measurement instrument because the range over which change can be detected is small [108]. Up to now, little is known about the measurement invariance of the EORTC QLQ-C15-PAL or EORTC QLQ-C30 in advanced cancer patients. Further validation to improve the available information on minimal important differences and the clinical relevance of differences in scores can aid interpretability in clinical practice [30]. PROMs have the potential to personalize care by identifying patients’ needs, but an accurate picture of those needs can only be achieved when the right measurement instrument is administered at the right time for the right purpose.

This study has certain strengths and limitations. It is important that the validation of instruments is performed in a consistent manner and evaluated as such. Using the COSMIN criteria in this review promoted a consistent evaluation. A limitation of this review is that there is no guarantee that our study selection procedure was sufficiently extensive. Even though references of included studies were checked, it is possible that certain validation studies were missed. Finally, this review only included measurement instruments that were not cancer site specific, meaning that the target population of the instrument was not restricted to patients with a specific primary cancer site. It is possible that for certain cancer sites, the EORTC QLQ-C15-PAL may not be the most adequate measure.

In conclusion, this review identified many self-administered instruments that measure HRQoL in patients with advanced cancer in clinical practice. Many of the existing measurement instruments have not yet been evaluated in an adequate manner, making it difficult to compare instruments. Considering the available information, the EORTC QLQ-C15-PAL and the EORTC QLQ-BM22 appeared to have the best psychometric properties. However, there is no ‘one size fits all’: when selecting a measurement instrument in clinical practice it is important to take into account aspects such as the burden of administration and the objective of measurement (e.g. change over time). It is important that health care professionals possess up-to-date knowledge on the quality of HRQoL measurement instruments to make an adequate selection in clinical practice. For instance, health care professionals should be aware that, to measure HRQoL accurately, it is important to supplement existing measurement instruments with relevant items on spirituality or preparation for dying, depending on the patient’s position within the palliative phase. Validation of self-administered HRQoL measurement instruments is an important ongoing development because information on psychometric properties will make comparisons between instruments easier. This review contributes to improved clarity regarding the availability and quality of HRQoL measurement instruments for patients with advanced cancer and supports health care professionals in selecting suitable PROMs for advanced cancer patients in clinical practice. Being able to accurately and routinely measure HRQoL in patients with advanced cancer will stimulate a personalized health care approach, leading to improved cancer care, clinical outcomes, and HRQoL.