Introduction

The 15D is a generic preference-based health-related quality of life (HRQoL) instrument that can be used to provide preference-weights for quality-adjusted life-year (QALY) calculations when paired with valuation data [1, 2]. It is used internationally [1] and was recently a part of a large multi-instrument comparison study (MIC) [3, 4]. The 15D assesses HRQoL via a descriptive system with 15 dimensions, covering physical, mental, and social aspects of health [2, 5]. Originally Finnish, the 15D has been translated into 30 languages, including Norwegian [1, 6]. 15D valuation studies have been conducted in Finland [7] and Denmark [8]. These studies used the same study design involving three valuation tasks, based on the visual analogue scale (VAS), and were carried out through postal administration [7, 8]. In this study, we addressed a selection of problems known to be related to the original 15D valuation system. We reviewed the original 15D valuation tasks with a focus on how the information they provide is combined to estimate 15D algorithm values. An earlier study suggested that the way information from the three original 15D valuation tasks is combined increases the likelihood of larger error terms which makes the link between the VAS tasks and the health state values less transparent [9]. We propose a new 15D value algorithm estimation procedure that uses information from only one of the original valuation tasks [9] and anchors the worst possible health state in an empirically assessed value.

There is no value algorithm (or tariff, or value set) based on preferences of the Norwegian general adult population for any generic preference-based HRQoL instrument. It is the main aim of this study to use a new 15D valuation procedure to estimate a Norwegian 15D value algorithm. Further, we compared the resulting Norwegian 15D health state values with corresponding values of other generic preference-based instruments collected in seven disease groups and a healthy sample.

Methods

The 15D instrument

The 15D health state space comprises 15 dimensions, each with five levels of function [2, 5]. As such, a 15D health state can be represented as a vector \(\mathbf{l}=\left( {{l_1}, \ldots ,{l_{15}}} \right)\), where each \({l_j}\) specifies the level of function on dimension \(j\). The 15D has a large number of dimensions and levels and its full health state space contains 515 health states. It is cognitively challenging to value 15D health states, consisting of 15 attributes and unfeasible to directly evaluate a representative set of 15D health states [7]. Instead, each level of function of the 15D descriptive system was valued separately with VAS-based valuation tasks and simplifying assumptions from multi-attribute utility theory were used to derive 15D health state values in the original 15D algorithm estimation procedure [7, 10].

A value algorithm estimation procedure details the steps needed in order to convert valuation data from valuation tasks into a value algorithm (Fig. 1). A 15D value algorithm consists of a look-up table with algorithm values which allow estimating a value to each 15D health state (Fig. 1). As the 15D health state space is considerably large, it is common to present a look-up table which allows to estimate 15D health state values by adding up 15 algorithm values (Fig. 1). We thus describe a procedure that converts valuation data, i.e., the participants’ responses to the VAS-based valuation tasks, into a value algorithm that yields health state values \({V_H}(\mathbf{l})\) for all 15D health states \(\mathbf{l}\).

Fig. 1
figure 1

Overview of the 15D valuation system. VAS visual analogue scale, MAU multi-attribute utility, QALY quality-adjusted life years

The original 15D valuation tasks and the original value algorithm estimation procedure

The original 15D valuation procedure consisted of three VAS-based valuation tasks: the top task, the bottom task, and the within dimension tasks (Fig. 1). The top and the bottom tasks each used a single VAS on which respondents were asked to evaluate the best and the worst levels of all 15 dimensions. These tasks were intended to provide information about how important each dimension was perceived. In the 15 within dimension tasks, the respondents were asked to assign a VAS-score to each level of impaired function, referred to as L2 through L5, and “being dead” for each dimension separately, while L1 was fixed at 100 (Appendix 1). In the original procedure, the resulting level scores were multiplied with the importance weight for each level, which was “extrapolated linearly” [7] based on information from the top and bottom tasks [2]. A step-by-step description of the original procedure can be found in Appendix 3 in Michel et al. [9].

The new 15D valuation tasks and the new value algorithm estimation procedure

The new 15D algorithm estimation procedure used information of the within dimension tasks and a pits-task (Fig. 1). While the within dimension tasks were identical with the tasks used in the original valuation procedure, the pits-task was only part of the new valuation procedure. Of the original three tasks, only information from the within dimension tasks was retained. Thereby problems from combining information from different valuation tasks were avoided, and the link between the VAS-scores and the resulting health state values has become more transparent [9]. The within dimension tasks provided relative VAS-scores for all levels. The pits-task provided an average score for the worst possible 15D health state, L5 on all 15 dimensions, that has been directly scored on a VAS together with “being dead” (Appendix 2). Besides the pits-task, no interactions between the 15D levels were assessed.


The QALY model assumes that preferences for health states can be expressed on a ratio-scale, where zero corresponds to “being dead” [11], where “being dead” is a proxy for health states without an intrinsic (QALY-) value, since “being dead” suspends time [12,13,14]. Using the pits-score’s relation to “being dead” and “perfect health” to determine the range of the 15D health state values provided a plausible link to the ratio-scale used in the QALY model.

The new value algorithm estimation procedure was performed on averaged within dimension VAS-scores that were weighted to match the general Norwegian population. Sixteen task-specific sets of weights have been estimated (one for each of the fifteen within dimension tasks, and one for the pits-task). Post-stratification weights have been applied (see Appendix 3 for information on which variables were used for post-stratification and a detailed description of the weighting procedure).

Health state values represent disutility values via the identity \(~v~=~1 - u\). The new value algorithm estimation procedure yields a 15D health state value \({V_{H~}}(\mathbf{l})\) for a 15D health state \({\mathbf{l}} = \left( {l_{1} \cdots l_{{15}} } \right)~\) using the following functional form:

$${V_H}\left( \mathbf{l} \right)\mathop =\limits^{{{\text{def}}}} 1 - \frac{{{V_{Pits}}}}{{\mathop \sum \nolimits_{j} {S_{j,5}}}}~\mathop \sum \limits_{{1 \leqslant j \leqslant 15}} {S_{j,{l_j}}}=1 - ~\mathop \sum \limits_{{1 \leqslant j \leqslant 15}} \omega \times {S_{j,{l_j}}}=1 - \mathop \sum \limits_{{1 \leqslant j \leqslant 15}} {T_{j,{l_j}}}$$
  • Step 1: For each dimension \(j\) and level \(i~\) (L2–L5 and “being dead”), compute the weighted means \(\overline {s} _{i}^{j}\) over the respondents’ raw VAS-scores from the within dimension tasks (Table 6 in Appendix 4).

  • Step 2: For each level \(i\) (L2–L5), estimate the level’s relative scores \(~{{S}_{j,i}}~\), within each dimension \(~j\), as \(~{\widehat {S}_{j,i}}~\mathop =\limits^{{{\text{def}}}} ~\frac{{100 - \overline {s} _{i}^{j}}}{{100 - \overline {s} _{{{\text{death}}}}^{j}}}\), resulting in within dimension disutility values (where \({\widehat {S}_{j,1}}=0\) since \(\overline {s} _{1}^{j}=100~\) by definition, see Table 7 in Appendix 4).

  • Step 3: Estimate the mean of the respondents’ empirically obtained scores for the pits-state as \({\widehat {V}_{{\text{Pits}}}}\mathop =\limits^{{{\text{def}}}} ~\frac{{100 - {{\overline {s} }_{{\text{Pits}}}}}}{{100 - {{\overline {s} }_{{\text{death}}}}}}\), resulting in one disutility value for the pits-state representing the estimated health state value-range.

  • Step 4: Estimate the within dimension disutility table \({{T}_{j,i}}~\) (Table 1) by rescaling the level scores \({\widehat {S}_{j,i}}~\) by the rescaling factor \(\widehat {\omega }\mathop =\limits^{{def}} \frac{{{{\widehat {V}}_{{\text{Pits}}}}}}{{\mathop \sum \nolimits_{j} {{\widehat {S}}_{j,5}}}}: {{\widehat {T}}_{j,i}} \mathop=\limits^{{def}} {\widehat {\omega }}\mathop \cdot {{\widehat {S}}_{j,i}}\). This normalizing constant ensures that the final range of health state utilities is bounded by 1 for “perfect health” and \(1 - {\widehat {V}_{{\text{Pits}}}}\) for the pits-state.

    A 15D health state value \({V_{H~}}\left( \mathbf{l} \right)\) is a simple sum of fifteen of the disutility values presented in Table 1.

Table 1 The Norwegian 15D algorithm disutility values

We estimated a Norwegian 15D value algorithm using this new procedure. We present the range and the dimensions with the largest and the smallest disutility values. The original 15D algorithm estimation procedure was applied to the Norwegian valuation data in an earlier study [9] for comparison, and the resulting values were not recommended for use in healthcare decision making. We refer to the values resulting from the previous comparison as original 15D values, in contrast to the new 15D values estimated in the current study.

The Norwegian valuation studies

The main Norwegian 15D valuation study was conducted in 2010 through the marked research firm TNS Gallup. The survey was self-completed and consisted of demographic variables, the 15D descriptive system (self-reported health), and the original 15D valuation tasks. Parallel Web and postal surveys were conducted. The postal survey was sent to a random sample of about 5000 postal addresses from the Norwegian National Population Registry. Participants received a prepaid response envelope. The Web sample was recruited by sending mails to 1936 individuals pre-registered in an online panel maintained by TNS Gallup (TNS-Gallup Panel). Several waves of emails were sent out to reach a responding sample of 1000+ individuals that resembled the Norwegian general population in terms of age, gender, educational level, and geographic distribution. Due to technical limitations in the software used for the Web data collection, the task presentation differed between the Web and postal sample (see Appendix 1 for details). For details about how the different survey versions were randomized and for the exclusion criteria applied in this study, see [9]. Sensitivity analyses for testing the influence of case exclusion on within dimension task scores and 15D algorithm values were performed and did not indicate any differences between the unselected and the selected sample [15].

An additional face-to-face data collection was conducted in 2015–2016 in order to directly assess the value associated with the worst possible 15D health state. We asked 120 members of the Norwegian general population to provide information on age and gender and to fill in the 15D descriptive system. Further, the participants performed two within dimension tasks to get familiar with the 15D valuation tasks. Finally, the pits-task was presented in which participants were instructed to score the worst possible 15D health state and “being dead” on one VAS, ranging from 0 to 100, anchored in the best and worst imaginable health state (Appendix 2). Two participants had to be excluded due to missing data in this task. Sintonen performed a similar task in the original Finnish 15D valuation study but did not use the value of the worst possible 15D health state when finally estimating the Finnish 15D value algorithm [7].

Empirical performance of the Norwegian 15D health state values

To demonstrate the properties of the Norwegian 15D health state values, we used self-reported health in the Norwegian MIC data set (N = 1177) [4, 16]. Details about the data collection were described elsewhere [4, 16]. Respondents filled in the descriptive systems of AQoL-8D [17], EQ-5D-5L [18], HUI 3 [19], and SF-36v2 to derive SF-6D [20], and the 15D [2]. We visually compared the mean health state values of the different instruments in a healthy sample and in seven different disease groups, all collected in Norway. The existing value sets were used to estimate health state values per instrument for a healthy sample and seven disease groups: asthma, cancer, depression and anxiety, diabetes, hearing disability, arthritis, and heart disease. Note that except for the Norwegian 15D value algorithms, none of the other instruments had a value set that is based on preferences of the Norwegian general population. We applied both the Norwegian 15D value algorithm estimated in this study and the one estimated earlier [9] to self-reported health, using the 15D descriptive system, in the Norwegian MIC data set.

Results

Descriptive statistics

Table 2 provides an overview of the characteristics of the Norwegian general population in 2010 and the unweighted Norwegian valuation study sample. Responses were weighted using post-stratification to better reflect the makeup of the Norwegian adult general population [21]. Weights were calculated separately for each of the 15 within dimension tasks, and for the pits task. Respondents were categorized by age, gender, and education, and weights were calculated by dividing the proportion of the general population in each category by the corresponding proportion of the respondent sample (see Appendix 3 for details).

Table 2 Characteristics of the Norwegian general population in 2010 and the unweighted Norwegian 15D valuation sample

Of 1936 individuals, 1003 finished the Norwegian 15D valuation survey in the Web sample (response rate 52%), while 1276 out of 4899 contacted individuals in the postal sample returned completed surveys (response rate 26%). An overview of the excluded participants can be found in Appendix 5. Participants were aged between 19 and 101 years (mean = 51.6, SD = 16.5), 52% were female, and the most common educational degree was a high school degree (43%). The sample largely matched the underlying population’s characteristics, although responders of the postal survey tend to be older and better educated. Of 118 individuals who provided complete data in the pits-task, 52 participants assigned a higher (“better”) score to “being dead” than to the worst possible 15D health state. While 48 participants chose the same score for “being dead” and the worst possible 15D health state, namely zero.

The Norwegian 15D value algorithm

In terms of disutility values, the Norwegian 15D health state values ranged from 0 to 1.52, being anchored in an empirical estimate of the worst possible 15D health state.

A Norwegian 15D health state value can be computed with the following formula:

$${\widehat{V}_{H~}}\left( \varvec{l} \right)\mathop =\limits^{{{\text{def}}}} 1 - \frac{{{{\widehat{V}}_{{\text{Pits}}}}}}{{\mathop \sum \nolimits_{j} {{\widehat {S}}_{j,5}}}}~\mathop \sum \limits_{{1 \leqslant j \leqslant 15}} {\widehat {S}_{j,{l_j}}}=1 - \mathop \sum \limits_{{1 \leqslant j \leqslant 15}} 0.113 \cdot ~{\widehat {S}_{j,{l_j}}}=1 - \mathop \sum \limits_{{1 \leqslant j \leqslant 15}} {\widehat {T}_{j,{l_j}}}~$$

The values for \({\widehat {S}_{j,i}}\) can be found in Table 7 in Appendix 4 and the values for \({\widehat {T}_{j,i}}~\) are shown in Table 1 (see also Appendix 4).

The 15D health state values resulting from using the new procedure are lower than the original 15D values (Figs. 2, 3). The dimensions with the largest L5 disutility values were mobility (0.1083), discomfort (0.1063), and eating (0.1061), whereas hearing (0.0959), speech (0.0960), and sleep (0.0968) had the smallest disutility values on the worst level of function (Table 1; Fig. 4).

Fig. 2
figure 2

Norwegian 15D health state values compared with health state values of other generic preference-based instruments. Mean health state values by ranked percentiles. 15D_FIN Finnish 15D algorithm values were calculated with the original procedure ([7], scoring sheet available from Harri Sintonen), 15D_NO_original Norwegian 15D algorithm values were calculated with the original procedure (Table 3 in [9]), 15D_NO_new Norwegian 15D algorithm values were calculated with the new procedure

Fig. 3
figure 3

Mean disutility values per disease group and per instrument. Depr/Anx depression and anxiety, hearing dis. hearing disability, 15D_FIN Finnish 15D disutility values were calculated with the original procedure ([7], scoring sheet available from Harri Sintonen), 15D_NO_original Norwegian 15D disutility values were calculated with the original procedure (Table 3 in [9]), 15D_NO_new Norwegian 15D disutility values were calculated with the new procedure

Fig. 4
figure 4

Norwegian disutility algorithm values. Dimensions are ordered by increasing level 5 disutility values of the Norwegian valuation sample. As disutility values are presented, level 1 has a disutility value of zero and level 5 has the largest disutility value

Empirical performance of the Norwegian 15D health state values

In accordance with all other instruments, the largest mean health state disutility value using the Norwegian 15D value algorithm was found for the disease group Depression and Anxiety (Fig. 3). Compared to the original 15D values, the new Norwegian values are in general more in line with corresponding values of other instruments in seven disease groups and a healthy sample (Fig. 3). The same is true when assessing the ranges of the value algorithms (Fig. 2).

Discussion

Summary of results

We estimated a Norwegian 15D value algorithm using a new value algorithm estimation procedure. Compared to the 15D health state values that were calculated with the original valuation procedure, the Norwegian 15D health state values were lower and closer to corresponding values of other generic preference-based instruments. With this study, we aimed to amend the available information on the 15D valuation procedure by increasing transparency and replicability of the valuation procedure used to estimate 15D health state values. We aimed to facilitate future research that empirically compares health state values resulting from different instruments, as well as enabling instrument users and decision makers to make more informed choices between generic preference-based instruments.

The new 15D value algorithm estimation procedure

The new 15D value algorithm estimation procedure provides a more transparent link between the valuation tasks and the resulting 15D health state values. Using fewer valuation tasks reduces the burden and costs related to data collection. Anchoring the 15D algorithm values in an empirically assessed value for the worst possible 15D health state has several advantages. First, this is the most direct way to determine the range of health state values for the instrument. The range is a crucial feature of an instrument. Anchoring the range in an empirical value for the worst possible health state increases the range’s validity compared to the original value range which was based on extrapolation from single dimensions. Second, as the value for the worst possible health state is assessed in relation to “being dead,” the resulting health state values fulfill the requirement of the QALY model better than the original 15D health state values as they are on a ratio-scale with a clearly defined zero. Third, the value of the worst possible health state is the only 15D health state that potentially captures interactions between the levels. This is a valuable amendment to the within dimension tasks scores, which were assessed for each dimension separately.

The empirical value for the worst possible 15D health state assessed in the Norwegian general population is comparable to a similar Finnish value (− .334) [7]. However, this estimate was not used in the Finnish valuation study. The anchoring approach chosen in the new 15D valuation procedure is not without alternatives. Although it possibly is very challenging to value the worst possible 15D health state, having all dimensions at the same worst level of function might still be easier to conceptualize than a health state including different levels of function. Given that it seems unfeasible to value 15D health states that contain different levels of function, we argue that the estimate of the worst possible 15D health state is the most suitable score for anchoring, given the 15D specific constraints due to its large number of health states. One alternative approach would have been to anchor the values in the sum of all L5 disutility values. We did not use this approach as it would have resulted in an unacceptable health state value-range of 0–14.4 in terms of disutility values.

Empirical performance of the Norwegian 15D health state values

The Norwegian 15D health state values are lower than the original 15D values due to a lower value that was assigned to the worst possible 15D health state. As a consequence, Norwegian 15D health state values are more in line with the values of other generic preference-based instruments (Fig. 2). This will reduce the chance for drawing different conclusions about the cost-effectiveness of health interventions due to using different instruments to assess HRQoL for QALY calculations. As there is no gold standard that indicates if health state values are valid representations of people’s preferences for health, we assessed convergent validity by comparing Norwegian 15D health state values to values of other HRQoL instruments [22]. Compared to the original 15D values, the new Norwegian 15D health state values are closer to the values of other generic preference-based instruments in a healthy subsample as well as in seven disease groups. As we do not know to what extent the health state values of the other instruments are valid representations of people’s preferences, conclusions about validity remain preliminary.

Remaining challenges and limitations

The original 15D valuation system has been criticized on several grounds. A widely hold criticism is that VAS is used as a stand-alone valuation method [23,24,25,26]. It has been questioned if VAS can assess respondent’s preferences as the resulting valuations are not the result of a trade-off [27, 28]. A potential concern with the new 15D valuation procedure is that it relies on trade-offs between the levels of different dimensions compared to death to determine the relative weight of dimensions, rather than direct trade-offs between dimensions. The primary strength of this method over direct trade-offs between dimensions is that it reduces the cognitive burden of the tasks, allowing respondents to focus on the impact of each specific level of each dimension compared to death. However, the absence of direct trade-offs could attenuate differences between dimensions. There are also a number of biases related to VAS that might influence health state valuation [23, 29]. However, the VAS has been defended as a preference elicitation method for eliciting health state values under certainty [30] and valuation methods that are similar to those of the 15D have been recently applied in a new method for valuing health [31]. Another aspect that has been criticized is that an additive model is used to estimate 15D health state values. The additive model has been chosen by the 15D instrument developer as other models have not been considered to be feasible due to the number of potential interactions [7]. Using an additive model means to assume that all dimensions are structurally independent. To our knowledge, the structural independence of the 15D dimensions has not been tested.

While this study identified and addressed a selection of methodological shortcomings of the 15D valuation system, others remain to be addressed in future research. Although we were aware of VAS’ shortcomings, we kept using VAS as a valuation method in the new 15D valuation procedure. The use of other valuation methods, or allowing trade-offs between the dimensions, is complicated by the large descriptive system of the 15D. The advantage of keeping the same valuation method as in the original valuation procedure is that already existing 15D value algorithms can be converted into health state values that are more in line with requirements of the QALY model. This can be done by assessing an empirical value for the worst possible 15D health state and anchoring the already existing 15D algorithm values in this pits-estimate. In these kind of studies, the estimate of the worst possible 15D health state could also be assessed with other valuation methods than VAS. However, it has to be kept in mind that valuing a health state with 15 attributes is challenging, especially when assessing worse than death values [32]. Furthermore, we kept using the additive model in the new 15D procedure. It is unsatisfying that no explicit tests have been performed to empirically support this model choice. However, testing the structural independence of the 15 domains and assessing the interactions between them are extensive tasks which were beyond the scope of this study. Additionally, we recommend that future studies use identical task presentations of the within dimension tasks in the Web and the postal sample. Finally, it is worth noticing that the disease classification in the MIC data set was based on self-reports rather than on medical diagnosis. This may limit the clinical accuracy of these groups. For the purpose of comparing different instruments, however, this is not a central concern.

Conclusions

The Norwegian 15D value algorithm is the result of applying a new 15D valuation procedure that uses fewer valuation tasks and is anchored in an empirically assessed value for the worst possible 15D health state. The Norwegian 15D health state values are more in line with the requirements of the QALY model and are more comparable to the values of other HRQoL instruments in seven disease groups and a healthy sample. This study presents the first Norwegian value set for a generic preference-based instrument. We recommend using the new 15D valuation procedure when estimating future 15D value algorithms in general, and the Norwegian 15D health state values specifically for 15D-based health economic analyses in Norway.