# Modelling of lung cancer survival data for critical illness insurances


## Abstract

Modelling of critical illness survival data, primarily developed in the context of, e.g., health insurance contracts, also plays an important role in currently analysed problems related to the secondary insurance market. The aim of this contribution is two-fold. In the first part we describe how to construct a multiple state model for critical illness insurances which takes into account that the probability of death for a dread disease sufferer depends on the duration of the disease and that the survival probabilities are related to the disease stage. Then, in the second part, we focus on modelling the probabilistic structure of the analysed model for a particular dread disease. Based on actual data for the Lower Silesian Voivodship in Poland, we estimate the transition probabilities for the derived model in the case of the risk of lung cancer. For this purpose we use the methodology developed for the construction of multi-state life tables, such as binomial, Poisson and ordinal logistic regression models. The obtained results can be directly used to build multiple increment–decrement tables, which are useful for the valuation not only of critical illness insurances and life insurances with an accelerated death benefits option but also of viatical settlement contracts and health-related expenses.

## Keywords

Lung cancer · Markov chain · Multiple increment–decrement table · Multiple state model · Morbidity model · Survival model

## 1 Introduction

Modelling of critical illness survival data, primarily developed in the context of, e.g., health insurance contracts, also plays an important role in currently analysed problems related to the viatical market and life insurances with an accelerated death benefits (ADBs) option.

Critical illness insurances (CII), also called dread disease insurances (DDI), are typical examples of limited-coverage health insurance products. They provide the policyholder with a benefit in case of a dread disease included in a set of illnesses specified by the policy conditions, such as heart attack, cancer or stroke (see Dash and Grimshaw 1993; Haberman and Pitacco 1999; Pitacco 1994, 2014). Such insurance policies can be shaped in several different ways, depending for instance on the specific insurance market. The basic benefit is a lump sum, which is paid on diagnosis of a specified condition rather than on disablement. The other type of benefit consists of a set of fixed-amount benefits (annuity payments). It is worth noting that a CII policy is not designed to meet any specific need: it does not protect the policyholder against a particular financial loss, such as loss of earnings, or reimburse medical or other expenses incurred. The insured can use the obtained benefits for any purpose.

Nowadays, due to the growth of the secondary market, an increase of interest in products providing an acceleration benefit in the event of terminal illness has been observed. An insured person who holds a life insurance policy may wish to use it when facing financial problems connected with health. In such a situation, the easiest way of receiving financial compensation from life insurance is to withdraw from the insurance contract; the insurer is then obliged to pay the surrender value. The insured can also sell his/her policy on the secondary market for an amount that is greater than the surrender value (and less than the death benefit). The viatical company then takes over the payment of the insurance premiums and, in case of death of the insured, receives the death benefit. Such agreements, called *viatical settlements* (see e.g. Bhuyan 2009; Gatzert 2010; Neeraj 2003), are offered to people who have developed a terminal disease. The other possibility for the insured to receive early financial support is to buy a life insurance with an accelerated death benefits (ADBs) option, which allows the insured to obtain the death benefit while still alive. Sometimes insurance companies allow the insured to re-buy the option of an accelerated payment of death benefits after the diagnosis of the disease. This flexibility is stimulated by the strongly growing viatical market for life insurance.

A statistical model for survival analysis is equivalent to a two-state Markov process with one direct transition from a transient *alive* state to an absorbing *death* state. This model is insufficient for studying and analysing the detailed life history data which occur frequently in practice, for example in CII. In the literature, depending on the analysed problem, two basic approaches to designing a suitable model can be found. On the one hand, the *alive* state can be split into two or more transient states which, in applications, typically correspond to occurrences of various medical complications, for example in the case of an acute myocardial infarction (Hougaard and Madsen 1985) or insulin-dependent diabetes (Andersen 1988). On the other hand, the *death* state can be split into two or more absorbing states, which in applications typically correspond to causes of death and the competing risks survival analysis. The basic model for CII (e.g. Haberman and Pitacco 1999; Pitacco 1994, 2014) combines both approaches, but does not include the specific terms of contracts offered by insurers and is not suitable for costing viatical products and life insurances with an ADBs option, in particular in the case of lung cancer.

The aim of this contribution is two-fold. In the first part we present a general multiple state model for critical illness insurances, which takes into account that the probability of death for a dread disease sufferer may depend on the duration of the disease and that the payment of benefits associated with a severe disease is related to a diagnosis and the disease stage. Due to the nature of the analysed products, we propose to split the death state in a different way than in the basic model for CII.

In order to cost insurance and viatical contracts, the probabilistic structure of the model is necessary. Therefore, in the second part, we focus on modelling the probabilistic structure of the proposed multiple state model for *products* associated with the risk of lung cancer, which is one of the most frequently studied cancers (e.g. Diaconu et al. 2016). For determining the transition matrix related to the discrete-time Markov model, we use the methodology developed for the construction of multi-state life tables. The numerical results are based on the actual data for the Lower Silesian Voivodship in Poland. Thus we derive the transition matrix for the proposed model, which can be used to cost CII contracts, life insurances with an ADBs option and viatical settlements.

The paper is organized as follows. In Sect. 2, after a brief description of the classical multiple state model for the CII, we propose a more general multiple state model for such insurances. In Sect. 3 we apply the general multiple state model introduced in Sect. 2 to insurance against the risk of lung cancer, which is an example of the CII, and then focus on the description of the probabilistic structure of this model. Based on the actual data for the Lower Silesian Voivodship in Poland, we estimate transition probabilities for CII associated with lung cancer in Sect. 4. In Sect. 5 we point out possible practical applications of the obtained results.

## 2 An actuarial model for critical illness insurance

Multiple state modelling is a stochastic tool for describing different kinds of demographic problems, such as projecting elderly disability (Gaag et al. 2005), analysing the process of return and reentry migration (Vega and Brazil 2015), labour-migration dynamics (Bijwaard 2014) or unemployment dynamics (Dębicka and Mazurek 2001). The most commonly applied multistate model in biostatistics, however, is the active-illness-death model (e.g. Bijwaard 2014; Haberman and Pitacco 1999; Pitacco 2014; Putter et al. 2007).

The evolution of an insured risk is described by a *multiple state model*. That is, at any time the insured risk is in one of a finite number of states labelled by \(1,2,\ldots ,N\) or simply by letters. Let \({{\mathcal {S}}}\) be the *state space*. Each state corresponds to an event which determines the cash flows (premiums and benefits). Additionally, by \({\mathcal {T}}\) we denote the *set of direct transitions* between states of the state space. Thus \({\mathcal {T}}\) is a subset of the set of pairs \(\left( {i,j} \right) \), i.e., \({\mathcal {T}} \subseteq \{ \left( {i,j} \right) \mid i \ne j; i,j \in {\mathcal {S}} \}\). Note that the pair \(({\mathcal {S}},{\mathcal {T}})\) is called a *multiple state model*, and describes all possible insured risk events as far as their evolution is concerned (usually up to the end of insurance). In this paper we consider an insurance contract issued at time 0 (defined as the time of issue of the insurance contract) and, according to the plan, terminating at a later time *n* (*n* is the term of the policy). Moreover, let *x* be the age at the policy issue.

In the active-illness-death model, *a* means that the insured is active or healthy, *i* indicates that the insured person is ill and suffers from a dread disease and *d* is related to the death of the insured. A more advanced model was investigated in Haberman and Pitacco (1999) and Pitacco (1994, 2014), which, instead of a single state *d*, distinguishes between death due to the dread disease, *d*(*D*), and death due to other causes, *d*(*O*) (see Fig. 1a).

In this model, *c* is a given lump sum (*death benefit*), \(c^{ad}\) is an additional lump sum (*disease benefit*) and \(\lambda \) is the so-called *acceleration parameter* \((0 \le \lambda \le 1)\). The amount \(c\lambda +\mathbb {1}_{\{\lambda =0\}} c^{ad}\) is payable after the dread disease diagnosis, while the remaining amount \(c(1-\lambda )\) is payable after death, if the two random events occur within the policy term *n*. In particular:

- if \(\lambda =0\), then the model describes a rider benefit as an additional benefit,
- if \(0< \lambda <1\), then the model describes a rider benefit as an acceleration of part of the basic life cover,
- if \(\lambda =1\), then the model describes a stand-alone cover.

If \(\lambda =1\), state *i* is absorbing, because the whole insurance cover ceases immediately after dread disease diagnosis and the payment of the sum assured (in this model the direct transitions (*i*, *d*(*O*)) and (*i*, *d*(*D*)) are not present). If \(0 \le \lambda <1\), state *i* is irreversible.
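The role of the acceleration parameter can be illustrated with a small sketch. The function name `cii_benefits` and the monetary amounts are assumptions made only for this illustration; the split itself follows the amounts \(c\lambda +\mathbb {1}_{\{\lambda =0\}} c^{ad}\) and \(c(1-\lambda )\) given above.

```python
def cii_benefits(c, c_ad, lam):
    """Split the cover into (amount paid on diagnosis, amount paid on death)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("acceleration parameter must satisfy 0 <= lam <= 1")
    # c*lam + 1{lam = 0} * c_ad is payable after the dread disease diagnosis
    on_diagnosis = c * lam + (c_ad if lam == 0 else 0.0)
    # the remaining c*(1 - lam) is payable after death
    on_death = c * (1.0 - lam)
    return on_diagnosis, on_death


# Hypothetical amounts, chosen only for illustration
for lam in (0.0, 0.5, 1.0):
    print(lam, cii_benefits(100_000.0, 20_000.0, lam))
```

With \(\lambda =0\) the disease benefit is paid on top of the full death benefit, with \(0<\lambda <1\) part of the death benefit is brought forward, and with \(\lambda =1\) nothing remains to be paid on death.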

The amount of benefits is determined in the insurance contract (at the time the contract is concluded). Costing of any insurance product is always connected with the probabilistic structure of the model. In case of dread disease cover, data such as incidence rates of the dread disease are required. A frequent problem is that the needed data are not available or are available only in a limited form. Therefore, one has to make assumptions, which have an impact on actuarial values such as premiums and reserves. For the CII shown in Fig. 1a, an outline of possible assumptions and methods for calculating premium rates for a critical illness cover is presented in Dash and Grimshaw (1993). One of these assumptions is that the probability of death of a sick person does not depend on the duration of the disease. Moreover, dread disease insurances are products which are very sensitive to the development of medicine. Not all dread diseases are as deadly as they were some years ago, and yet this type of insurance is long-term. Thus insurers introduce very strict conditions for the right to receive the benefit associated with a severe disease. Besides a diagnosis, the disease stage is important. This implies that the model presented in Fig. 1a is insufficient.

In this paper we propose a multiple state model for critical illness insurance, which takes into account that the probability of death for a dread disease sufferer may depend on the duration of the disease and the payment of benefits associated with a severe disease depend on a diagnosis and the disease stage.

State *i* has to be divided into two states:

- \(i^D\): the insured person is ill. In this stage the remission of the disease is still possible, although return to the health state is impossible.
- \(i^{DD}\): the insured person is terminally ill. In this stage the remission of the disease is very unlikely.

The death state is divided into two states as well:

- *d*(*O*,*D*): the death of the insured person who is ill, or death due to other causes.
- *d*(*DD*): the death of the insured person who is terminally ill.

Note that in the model presented in Fig. 1b we describe the death states in a different way than in Fig. 1a. The main reason is that in the described CII, the values of benefits are connected with the insured’s health situation just before death, not with the cause of death.

Moreover, the direct transition (*a*, *d*(*DD*)) is omitted. In particular, this means that all dread disease deaths of the terminally ill insured are represented by the state *d*(*DD*). This situation is analogous to the one in Fig. 1a, where deaths due to dread disease are represented by the pair of transitions (*a*, *i*) and (*i*, *d*(*D*)), and the direct transition (*a*, *d*(*D*)) is not possible.

Note that the term *terminally ill* in the context of health care refers to a person who is suffering from a serious illness and whose life is not expected to go beyond 24 months at the maximum. In practice the period of terminal illness is different for each disease. For example, HIV+ patients with a life expectancy of more than approximately 4.5 years are treated as patients in relatively good health.

Let \(m_s\) be the maximum future lifetime of an *s*-year-old terminally ill person. We assume that \(m_s=m\) for each *s*-year-old terminally ill person. Moreover, it is reasonable to assume that the probability of death for a dread disease sufferer depends on the duration of the disease. To consider the influence of illness duration on the mortality probabilities, we split state \(i^{DD}\) into *m* states \(i^{DD(h)}\) \((h=1,2,\ldots ,m)\), where \(i^{DD(h)}\) means that the insured has been terminally ill for between \(h-1\) and *h* years (compare Amsler 1968; Gregorius 1993; Haberman and Pitacco 1999; Janssen and Manca 2006, 2007). Note that state \(i^{DD(h)}\) is a reflex state (that is, it is strictly transitional and after one unit of time the insured risk leaves this state). Finally, we arrive at a general multiple state model for critical illness insurances, which is presented in Fig. 1c.
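The construction of the general model can be sketched as a pair \(({\mathcal {S}},{\mathcal {T}})\) built for a given *m*. Since Fig. 1c is not reproduced here, the set of direct transitions below is an assumption read off the surrounding description (no direct transition from *a* to *d*(*DD*), no return to health, each \(i^{DD(h)}\) left after one year); the function names and state labels are hypothetical.

```python
def build_cii_model(m):
    """Return (states, transitions) of the general model with m terminal-duration states."""
    if m < 1:
        raise ValueError("m must be at least 1")
    terminal = [f"iDD({h})" for h in range(1, m + 1)]
    states = ["a", "iD"] + terminal + ["d(O,D)", "d(DD)"]
    # Assumed direct transitions out of the healthy and the ill (non-terminal) state
    transitions = {
        ("a", "iD"), ("a", terminal[0]), ("a", "d(O,D)"),
        ("iD", terminal[0]), ("iD", "d(O,D)"),
    }
    for h in range(m):
        # a terminally ill person may die within the year ...
        transitions.add((terminal[h], "d(DD)"))
        # ... or survive into the next duration state (none after the m-th year)
        if h + 1 < m:
            transitions.add((terminal[h], terminal[h + 1]))
    return states, transitions


def absorbing_states(states, transitions):
    """States with no outgoing direct transition."""
    sources = {i for i, _ in transitions}
    return [s for s in states if s not in sources]


states, transitions = build_cii_model(4)
print(states)
print(absorbing_states(states, transitions))
```

Because every \(i^{DD(h)}\) has only outgoing transitions to the next duration state or to *d*(*DD*), it cannot be occupied for two consecutive years, which is the reflex property described above.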

## 3 Lung cancer as an example of dread disease

### 3.1 Multiple state model

After cardiovascular diseases, malignant tumours are the second leading cause of death in developed countries. In particular, lung cancer belongs to the group of tumours characterized by the highest morbidity and mortality rates. It is the most frequent cancer in the population of men and the second most frequent in the population of women, after breast cancer. Additionally, lung cancer is a tumour with an unfavourable prognosis. For example, epidemiological data for Poland show that only about 16% of women and 11% of men survive five years after the diagnosis (Wojciechowska and Didkowska 2014). Because of the high prevalence and mortality rates and the relatively short survival time after the diagnosis, lung cancer is a perfect example of a deadly disease which should be covered by critical illness insurances.

Epidemiological data confirm the existence of significant differences between the incidence of lung cancer in the populations of men and women. The morbidity rate is several times higher in the population of men. In many European countries, a tendency towards stabilization of the incidence rate among men was observed in the second half of the 1980s. A different situation occurs in the case of women. The number of cancer cases continues to grow, which is undoubtedly caused by cultural changes, such as an increase in the number of smokers among women in the post-war generation. Due to the growing number of smokers among women, we should expect a further increase in the number of cancer cases in the population of women.

The incidence rate also depends on age (e.g. Arbeev et al. 2005a, b; Ukraintseva and Yashin 2003). Lung cancer occurs very rarely among patients up to forty years of age. The incidence begins to increase after the age of fifty. The peak incidence occurs in the sixth and seventh decades of life. Geographical data show a significant diversity of incidence and mortality rates in different regions of Europe. In Poland, the morbidity and mortality vary significantly among particular provinces (voivodships). Thus, age, sex and region of residence should be taken into account in the analysis of the etiology of lung cancer.

The states of the model are numbered as follows. The insured person:

- 1: is alive and not sick with a malignant lung tumour (*a*),
- 2: is diagnosed with lung cancer without metastasis to lymph nodes, brain, bones or so-called distant metastases (\(i^D\)),
- 3: is suffering from cancer for the first year after diagnosis of distant metastases (\(i^{DD(1)}\)),
- 4: is suffering from cancer for the second year after diagnosis of distant metastases (\(i^{DD(2)}\)),
- 5: is suffering from cancer for the third year after diagnosis of distant metastases (\(i^{DD(3)}\)),
- 6: is suffering from cancer for the fourth year after diagnosis of distant metastases (\(i^{DD(4)}\)).

The insured person has died and, just before death:

- 7: was healthy or was diagnosed with lung cancer without metastasis (*d*(*O*,*D*)),
- 8: had lung cancer with distant metastases (*d*(*DD*)).

### 3.2 Probabilistic structure of the model

We focus on a discrete-time model, where insurance payments are made at the ends of time intervals (years). Let *X*(*x*, *t*) denote the state of an individual (the policy) at time *t* (\(t \in \text {T}= \{0,1,2,\dots ,n\}\)), where *x* is the *age at entry*. Hence the evolution of the insured risk is given by a discrete-time stochastic process \(\{ {X(x,t); t\in \text {T}} \}\), with values in the finite set \(\mathcal {S}=\{ 1, 2,\ldots ,8\}\). The individual’s presence in a given state or movements between states may have some financial impact, such as payments of premiums or benefits. In practice this means that a lump sum benefit is paid immediately before the end of the year if the process \(\{ {X(x,t); t\in \text {T}} \}\) changes state in that year and the payment of such a benefit is connected with this change. Premiums are paid immediately after the beginning of the year. In order to value the insurance contract over the *n*-year insurance period, the probabilities of realizing particular cash flows must be known.


Under the assumption that \(\{X(x,t)\}\) is a nonhomogeneous Markov chain, to get the sequence of matrices \(\{{\varvec{P}}^{[x]}(k)\}_{k=0}^{n}\), it is enough to know \({\varvec{P}}(0)\) and the sequence of matrices \( {\varvec{Q}}^{[x]}(0), {\varvec{Q}}^{[x]}(1),{\varvec{Q}}^{[x]}(2), \ldots , {\varvec{Q}}^{[x]}(n-1)\), where \({\varvec{Q}}^{[x]}(k)=\left( q^{[x]}_{ij}(k)\right) _{i,j=1}^{8}\) and \(q^{[x]}_{ij}(k)=\mathbb {P}(X(x,k+1)=j | X(x,k)=i)\) is a transition probability.
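For a nonhomogeneous Markov chain the state distribution at time \(k+1\) is obtained from the distribution at time *k* by one multiplication with \({\varvec{Q}}^{[x]}(k)\). The sketch below assumes that \({\varvec{P}}(k)\) is read as the row vector of state-occupancy probabilities; the three-state matrices are hypothetical toy values, not estimates from this paper.

```python
def propagate(p0, q_sequence):
    """Return [p(0), p(1), ..., p(n)] with p(k+1) = p(k) Q(k)."""
    dists = [list(p0)]
    for q in q_sequence:
        prev = dists[-1]
        size = len(prev)
        # row vector times matrix: p_j(k+1) = sum_i p_i(k) * q_ij(k)
        nxt = [sum(prev[i] * q[i][j] for i in range(size)) for j in range(size)]
        dists.append(nxt)
    return dists


# Toy three-state example (healthy, ill, dead); values are illustrative only
Q0 = [[0.90, 0.08, 0.02], [0.0, 0.70, 0.30], [0.0, 0.0, 1.0]]
Q1 = [[0.80, 0.15, 0.05], [0.0, 0.60, 0.40], [0.0, 0.0, 1.0]]
for k, dist in enumerate(propagate([1.0, 0.0, 0.0], [Q0, Q1])):
    print(k, [round(v, 4) for v in dist])
```

The same loop applies unchanged to the eight-state model once the matrices \({\varvec{Q}}^{[x]}(k)\) have been estimated.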

### 3.3 Multiple increment–decrement tables

The probabilistic structure of the model is closely related to the *multiple increment–decrement table* (or *multiple state life table*). The number of functions of such a table is closely linked to the multiple state model \((\mathcal {S}, \mathcal {T})\) (cf. Dębicka 2012; Bowers et al. 1986; Haberman 1983a, b; Jordan 1982; Mattsson 1997). The simplest multiple increment–decrement table, which refers to an *x* year old person, is a life table \(\{l_{[x]+k}\}_{k \ge 0}\), where \(l_{[x]+k}\) is the number of those alive at the beginning of the time interval \([x+k,x+k+1)\). Then \(d_{[x]+k}=l_{[x]+k}-l_{[x]+k+1}\) is the number of deaths during the time interval \([x+k,x+k+1)\). In general, one can assign to a multiple state model \((\mathcal {S}, \mathcal {T})\) and an *x* year old person the multiple increment–decrement table, which consists of functions described for each transient state \(i \in \mathcal {S}\):

- \(l^{i}_{[x]+k}\): the number of lives in state *i* at age \(x + k\),
- \(d^{ij}_{[x]+k}\): the number of lives at age \(x+k\) which, during the period \([x+k, x+k+1)\), left state *i* and moved to state *j*.

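For each transient state *i* these functions satisfy

$$l^{i}_{[x]+k+1} = l^{i}_{[x]+k} - \sum _{j:(i,j) \in \mathcal {T}}d^{ij}_{[x]+k} + \sum _{j:(j,i) \in \mathcal {T}}d^{ji}_{[x]+k},$$

where \(\sum _{j:(i,j) \in \mathcal {T}}d^{ij}_{[x]+k}\) is the number of lives which left state *i* in the time interval \((x+k,x+k+1]\), and \(\sum _{j:(j,i) \in \mathcal {T}}d^{ji}_{[x]+k}\) is the number of lives which entered state *i* in the time interval \((x+k,x+k+1]\).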

In (4) we present the multiple increment–decrement table of an *x* year old person for \(({\mathcal {S}},{\mathcal {T}})\) given in Fig. 1c for lung cancer (i.e. with \(m=4\)). Since \(q_{68}^{[x]}(k)=1\), we have \(d_{[x]+k}^{68}=l_{[x]+k}^{6}\). This is why \(d_{[x]+k}^{68}\) can be omitted in (4).

The transition probabilities are obtained from the table as follows:

- if *i* is absorbing, then

$$q^{[x]}_{ij}(k)= \left\{ \begin{array}{ll} 1 & \mathrm{for}\ j=i \\ 0 & \mathrm{for}\ j \ne i \end{array} \right. , \tag{5}$$

- if *i* is transient, then

$$q^{[x]}_{ij}(k)= \left\{ \begin{array}{ll} \dfrac{l^{i}_{[x]+k}- \sum _{r:(i,r) \in \mathcal {T}} d^{ir}_{[x]+k}}{l^{i}_{[x]+k}} & \mathrm{for}\ j=i \\[2mm] \dfrac{d^{ij}_{[x]+k}}{l^{i}_{[x]+k}} & \mathrm{for}\ (i,j) \in {\mathcal {T}} \\[2mm] 0 & \mathrm{for}\ (i,j) \notin {\mathcal {T}},\ j \ne i \end{array} \right. . \tag{6}$$

Here \(q^{[x]}_{ij}(k)\) refers to a person who is in state *i* at the beginning of the *k*-th year of the insurance period.
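The passage from table entries to one row of the transition matrix can be sketched as follows. The function name and the counts are hypothetical; the probability of remaining in state *i* is taken as the share of lives that do not leave it during the year, so that the row sums to one.

```python
def transition_row(i, lives_in_i, decrements):
    """Row of Q(k) for a transient state i.

    lives_in_i -- number of lives in state i at age x+k
    decrements -- dict mapping each target state j to the number of lives
                  moving from i to j during the year
    """
    if lives_in_i <= 0:
        raise ValueError("state i must be occupied")
    leaving = sum(decrements.values())
    if leaving > lives_in_i:
        raise ValueError("more lives leave state i than are present")
    # probability of each direct transition (i, j)
    row = {j: d / lives_in_i for j, d in decrements.items()}
    # probability of remaining in state i
    row[i] = (lives_in_i - leaving) / lives_in_i
    return row


# Hypothetical counts for state 2: 150 lives move to state 3, 50 to state 7
print(transition_row(2, 1000, {3: 150, 7: 50}))
```

Transitions that are not in \({\mathcal {T}}\) simply do not appear in the returned dictionary, which corresponds to a zero entry in the matrix.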

Determining the transition matrices separately for each age at entry *x* is cumbersome and not always needed. In further analysis we suppose that the distribution of \(\{X(x,t)\}\) can be expressed by the distribution of the process \(\{X(0,s)\}\) for a 0 year old person. This assumption is known as the *hypothesis of aggregation* (HA), which can be equivalently formulated in several ways. Observe that, according to the model presented in Fig. 1c for \(m=4\), the *alive* state has been split into six transient states (i.e. states \(1,2,\ldots ,6\)) and the *death* state has been split into two absorbing states corresponding to the health situation of the insured just before death (i.e. states 7 and 8). Then HA for the considered model is equivalent to the condition that the transition probabilities depend on the age at entry *x* and the policy year *k* only through the attained age \(x+k\), for each *x* and *k* such that \(\mathbb {P}(X(x,k)=j)>0\). Under HA, for each *x* we have \({\varvec{Q}}(k):={\varvec{Q}}^{[0]}(x+k)\).
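Under the hypothesis of aggregation, the matrices needed for a contract are a slice of one age-indexed sequence. A minimal sketch, assuming the matrices \({\varvec{Q}}^{[0]}(s)\) are stored in a dictionary keyed by attained age *s* (the container and the function name are hypothetical):

```python
def matrices_for_contract(q_by_attained_age, entry_age, term):
    """Return [Q(0), ..., Q(term-1)] with Q(k) = Q^{[0]}(entry_age + k)."""
    missing = [entry_age + k for k in range(term)
               if entry_age + k not in q_by_attained_age]
    if missing:
        raise KeyError(f"no transition matrix for attained ages {missing}")
    return [q_by_attained_age[entry_age + k] for k in range(term)]


# Placeholder objects stand in for the 8x8 matrices, for illustration only
table = {age: f"Q0({age})" for age in range(40, 101)}
print(matrices_for_contract(table, entry_age=50, term=3))
```

Two contracts with different entry ages then share the same matrix whenever they reach the same attained age.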

## 4 Estimation of transition probabilities

### 4.1 Data

Because the place of residence influences the morbidity and mortality rates of lung cancer, the analysis was based on data from Lower Silesia and performed separately for the populations of men and women. In order to estimate the transition probabilities, three databases have been used.

First, in the analysis of future lifetime, the life tables for 2008, separately for the populations of men and women, were used (www.stat.gov.pl/en/topics/population/life-expectancy/).

Secondly, the information about the morbidity and mortality rates is obtained from the National Cancer Registry for the Lower Silesia Region (Wojciechowska and Didkowska 2014). The register is created on the basis of individual declarations of tumours by hospitals. Note that in 2008 the percentage of declarations submitted in the Lower Silesia region exceeded 95%, which places this registry among the best in Poland. Therefore, this database is reliable.

Thirdly, the data set of individual hospitalizations from the Lower Silesia Department of the National Health Fund was used (unpublished). Patients with lung cancer were identified using the disease codes C33 and C34 of the ICD-10 classification. Individual patients were identified using the coded numbers of the Universal Electronic System for Registration of the Population (Social Security). Data for the period from 2006 to 2011 were included in the analysis. The year 2008, as one of the middle periods, has been established as the reference year. The choice of a middle period makes it possible to consider the histories of hospitalization of these patients over the time horizon from 2006 to 2011.

Populations of men and women are examined separately, due to a different structure of the morbidity and mortality associated with lung cancer. A data set concerning histories of hospitalization because of malignant lung cancer in 2008 in 62 hospitals was used. In the entire Lower Silesia Voivodeship, 2246 men (at age 20–94) and 945 women (at age 23–93) were hospitalized.

Since we do not know the exact date of death, but only have knowledge about the cessation of traditional treatment and the transfer of a patient to a hospice, the survival time is determined with an accuracy of one year.

### 4.2 Active

This section concerns probabilities associated with state 1 of the CII model, i.e. the probabilities of the first row of the matrix (2).

Let \(\zeta _{s}^{(t)}\) denote the crude incidence rate in the *t*-th year, defined as the number of cases of illness \(\check{l}_{s}^{(t)}\) per 100,000 of the studied population at age *s*, calculated in five-year age groups. Because \(\zeta _{s}^{(t)}\) varies considerably from year to year, we use the average of the crude cancer incidence rates.

The percentage of *s* year old patients who fell ill in 2008 with the first diagnosis showing the existence of distant metastases is calculated in five-year age groups.

The remaining quantities are likewise determined for age *s* (calculated in five-year age groups), using the rates in the *t*-th year obtained on the basis of data from Wojciechowska and Didkowska (2014).

The average crude rates of morbidity and mortality from lung cancer estimated on the basis of raw indicators from the years 2006-2010 are shown in Table 1 (Appendix A). The rates were estimated on the basis of reports from the National Cancer Registry (Wojciechowska and Didkowska 2014).
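The averaging of crude rates over several calendar years can be sketched as below. The case counts and population sizes are hypothetical and serve only to show the arithmetic; the function names are not taken from the paper.

```python
def crude_rate(cases, population):
    """Number of cases per 100,000 of the studied population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100_000 * cases / population


def average_crude_rate(cases_by_year, population_by_year):
    """Arithmetic mean of the yearly crude rates for one age group."""
    years = sorted(cases_by_year)
    rates = [crude_rate(cases_by_year[y], population_by_year[y]) for y in years]
    return sum(rates) / len(rates)


# Hypothetical counts for a single five-year age group
cases = {2006: 50, 2007: 70, 2008: 60}
population = {2006: 100_000, 2007: 100_000, 2008: 100_000}
print(average_crude_rate(cases, population))
```

Averaging the yearly rates, rather than using the rate of a single year, dampens the year-to-year variability of the crude rates in small age groups.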

In order to estimate the percentage of people who fell ill in 2008 and were diagnosed with metastatic disease, the cohort of patients who fell ill with lung cancer during 2008 has been separated. In that year, 1353 men and 605 women were diagnosed with lung cancer in the region of Lower Silesia. Patients received one of two diagnoses. The first was a recognition of metastases to lymph nodes in the chest and so-called distant metastases. The analysis included an additional period of four weeks after the first diagnosis. This period, treated as the additional time required to obtain the results of diagnostic tests, is taken into account in the model. If the existence of metastasis was confirmed during this period, the patient was classified to the same group as patients who received a diagnosis of metastatic disease during the first visit. From a formal point of view, the patient moved at once from the first state to the third.

The second possible type of diagnosis states only the tumour incidence in the lungs, without metastasis. In the considered model it is assumed that a patient diagnosed without metastasis goes into the second state.

In Fig. 2 we present probabilities (14) and (15) with respect to the age and sex of an insured person. The graphs show a significant difference in incidence between men and women: men succumb to lung cancer several times more often than women. Another regularity is also noticeable. In the older age groups a diagnosis without metastases is more often made. For women this concerns ages of around 60 years, for men around 70. In the youngest groups, lung cancer is diagnosed with metastases. In interpreting the results, it should be remembered that the disease is considered from the point of view of a calendar year, rather than of the individual patient’s annual medical history. Thus, the history of an insured person who enters the oncological health care system in January (at the beginning of a year) looks quite different from that of a patient who was diagnosed in December (at the end of a year). For this reason, the percentage of people diagnosed with metastases may appear lower than the epidemiological data would suggest. The chosen method of analysis makes it possible to take into account the fact that an insured person could potentially fall ill at any time during the year since the inception of the insurance contract. If this happens at the beginning of the year, the chance of occurrence of metastases within that year is higher than when it happens at the end of the year.

### 4.3 Lung cancer without metastasis

Evaluating the transition probabilities for patients who were diagnosed with cancer without metastases [i.e. the probabilities of the second row of the matrix (2)] is the next step of the analysis. Their estimation is based on the analysis of the history of hospitalization of patients with lung cancer who, during the first admission in 2008, had no metastasis. Patients were hospitalized for the first time in 2006, 2007 or 2008. The cohort therefore consists of patients who were ill in 2006 and 2007 and at the beginning of 2008 did not have metastases, as well as patients who fell ill in 2008 with a first diagnosis without metastases. The percentage of patients in whom metastases were diagnosed during this year is calculated. The transition probability \(q_{23}(k)\) can be estimated using \(\varrho _{x+k}\), the proportion of patients suffering from lung cancer in 2008 who developed metastases during the year; it is estimated from the data of the Lower Silesia Department of the National Health Fund (unpublished). In further considerations, we accept that \(q_{27}(k)=q_{x+k}\). This is connected with the fact that a sick person without metastases has a higher risk of death than a healthy person, so we also take into account the possibility of dying from any of many causes, including lung cancer. Taking into account the above considerations, we obtain \(q_{22}(k)= 1-q_{x+k}-\varrho _{x+k}\).
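The three non-zero entries of the second row can be assembled as in the following sketch, where `q_death` stands for \(q_{x+k}\) and `rho` for \(\varrho _{x+k}\). The function name and the numerical inputs are hypothetical and are not estimates from the data.

```python
def second_row(q_death, rho):
    """Non-zero entries of the second row: q22, q23 and q27."""
    if q_death < 0 or rho < 0 or q_death + rho > 1:
        raise ValueError("inputs must be probabilities with q_death + rho <= 1")
    return {
        "q22": 1.0 - q_death - rho,  # remains ill without metastases
        "q23": rho,                  # metastases diagnosed during the year
        "q27": q_death,              # death without diagnosed metastases
    }


row = second_row(q_death=0.02, rho=0.25)  # illustrative values only
print(row)
```

The check on `q_death + rho` guards against age groups in which the two estimates, obtained from different data sources, would jointly exceed one.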

The cohort studied at this stage consists of patients who suffered from lung cancer during 2008 and did not have metastases at the initial diagnosis in 2008. The analysed cohorts comprised 1098 men and 533 women.

In case of the second state, examining the history of the disease from the perspective of an insurance company makes the chance of an insured person remaining without metastasis apparently higher than the epidemiological data suggest. A comparison of the empirical percentages of diagnosed metastases, calculated for particular years of age in a given year, reveals some differences between the populations of men and women. In the population of women, the percentage of diagnoses with metastasis is highest in the age group of 45 years and then gradually decreases. In the population of men it grows, reaching a peak in the age group of about 60 years, and then subsides gradually. It should be noted that, in the age groups with the highest lung cancer incidence rates (from 50 to 70 years), the probability of a diagnosis of metastasis for a patient with diagnosed lung cancer is higher in the population of men.

The chance of getting a diagnosis of metastatic disease is modelled using a Bernoulli distributed random variable. The parameter \(\varrho \) is defined as the success probability and is estimated using a logistic regression model, in which the independent variable is the age of a patient. In both populations, patients below 45 years of age occur very rarely. For this group of patients the probability of diagnosis with metastases is calculated using the nearest neighbour method. This means that the probability is constant in this group and equals the probability of diagnosis for a 45-year-old person. In the population of women, the logistic regression model parameters were estimated for patients over 45 years of age. In the population of men, patients above 45 years of age are divided into two groups, the first from 45 to 59 years of age and the second above 59 years of age. In these two groups, the probabilities of diagnosis with metastases are estimated using separate logistic regression models. This age group division ensures the best fit of the model to the data.
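The way the fitted models are evaluated for a given age can be sketched as follows: ages below 45 are mapped to 45 (nearest neighbour), and the age then selects one of the age-specific logistic models. The coefficients below are hypothetical placeholders, not the estimates reported in Table 3, and the function names are assumptions.

```python
import math


def logistic(z):
    """Inverse logit link."""
    return 1.0 / (1.0 + math.exp(-z))


def metastasis_probability(age, segments, min_age=45):
    """Evaluate a piecewise logistic model in age.

    segments -- list of (upper_age_inclusive, intercept, slope), ordered by age
    """
    a = max(age, min_age)  # nearest neighbour rule for very young patients
    for upper_age, intercept, slope in segments:
        if a <= upper_age:
            return logistic(intercept + slope * a)
    raise ValueError("age is not covered by any segment")


# Hypothetical coefficients: two segments for men, one for women
MEN = [(59, -3.0, 0.04), (math.inf, 1.0, -0.03)]
WOMEN = [(math.inf, 0.5, -0.03)]

for age in (30, 45, 59, 60, 75):
    print(age, round(metastasis_probability(age, MEN), 4),
          round(metastasis_probability(age, WOMEN), 4))
```

Fitting the coefficients themselves would require the individual-level data, which are not available here; the sketch covers only the prediction step.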

The division of the population of men into two groups (up to 59 years old and above 59 years old) yielded the best fitted model. In the case of the population of women no division is necessary. A possible explanation for this phenomenon could be related to health behaviour, which influences the probability of metastases detection. Completely different health behaviour is observed in the male and female populations in Poland. Women often use medical services regardless of their age. The situation is quite different for men. Men of working age are reluctant to use medical services and preventive examinations and avoid visits to doctors. Changes in health behaviour occur at pre-retirement age: men above 59 years old begin to use medical services, including preventive examinations, more often. This phenomenon is very visible in the analysis of treatment costs in Poland. The average treatment cost for a man above 60 years old grows rapidly and is much higher than the average treatment cost for a woman of the same age. In younger groups of patients, the average treatment cost is higher for women.

Parameter estimates of the models for the male and female populations are presented in Table 3 (Appendix B). The goodness of fit of the models is evaluated on the basis of the Wald test, the Hosmer–Lemeshow test and the deviance values. Model selection was based on the Akaike information criterion (AIC). The results are presented in Table 4 (Appendix B).
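For reference, the fit measures named above have the following standard textbook forms. The notation is an assumption introduced here (\(\hat{L}\) is the maximized likelihood, \(p\) the number of estimated parameters, and \(O_g\), \(E_g\), \(n_g\) the observed count, expected count and size of the \(g\)-th of \(G\) risk groups); the exact variants computed by the software are not stated in the text.

```latex
\begin{align*}
\text{Wald statistic:} \quad & z_j = \frac{\hat{\beta}_j}{\operatorname{SE}(\hat{\beta}_j)}, \\
\text{Deviance:} \quad & D = -2\left(\ln \hat{L} - \ln \hat{L}_{\text{saturated}}\right), \\
\text{Hosmer--Lemeshow:} \quad & \hat{C} = \sum_{g=1}^{G} \frac{(O_g - E_g)^2}{E_g\left(1 - E_g / n_g\right)}, \\
\text{AIC:} \quad & \mathrm{AIC} = 2p - 2\ln \hat{L}.
\end{align*}
```

Under the usual assumptions \(\hat{C}\) is compared with a \(\chi^2\) distribution with \(G-2\) degrees of freedom, and among competing models the one with the smaller AIC is preferred.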

### 4.4 Lung cancer with metastasis

After receiving the diagnosis of metastasis, a patient is considered to be terminally ill. This section concerns the probabilities associated with this situation, i.e. the probabilities in the third to sixth rows of the matrix (2).

In particular, the estimation of \(q_{34}(k)\) (and \(q_{38}(k)\)) is equivalent to modelling the survival time of patients who had metastatic disease before 2008, who were diagnosed with metastases during their first visit in 2008, or who developed metastases during 2008.

Due to the low incidence of lung cancer with metastases among young people between 20 and 39 years old, the probabilities associated with states 3 to 6 were estimated using the nearest neighbor method. Hence these probabilities are equal to the probabilities calculated for 40-year-old persons.

Based on an analysis of the mortality of the cohort members at age \(s=40,41,\ldots ,100\), we estimate the transition probabilities (21) for \(i=3,4,5\). Let us introduce a variable \(T_s\), which denotes the number of full years that an *s*-year-old patient from the analyzed cohort survived. On the basis of empirical data (Lower Silesia Department of the National Health Fund, unpublished) we assume that \(T_s\) takes values from 0 to 3. If \(T_s=0\), the patient died during the first year (counting from the first hospitalization with diagnosed metastases during 2008). If \(T_s=1\), the patient died during the second year, et cetera. A patient survives for at most four years.

Note that according to (21), the survival probabilities connected with the third to sixth states are defined by the probabilities of death.
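To illustrate how a distribution of \(T_s\) translates into yearly death and survival probabilities, the following Python sketch computes the conditional probability of dying in year \(k+1\) given survival of the first \(k\) years. It is a generic calculation under stated assumptions: the input distribution is made up for the example, the function name is hypothetical, and whether the result coincides with formulas (21)–(27) is not verified here.

```python
def conditional_death_probabilities(pmf):
    """Convert P(T = t), t = 0..len(pmf)-1, into conditional death probabilities.

    q[k] = P(T = k | T >= k): the probability of dying during year k+1
    for a patient who has survived the first k years.
    """
    if any(p < 0 for p in pmf) or abs(sum(pmf) - 1.0) > 1e-9:
        raise ValueError("pmf must be non-negative and sum to 1")
    q = []
    for k, p in enumerate(pmf):
        tail = sum(pmf[k:])  # P(T >= k)
        # If nobody survives to year k, the conditional probability is undefined.
        q.append(p / tail if tail > 0 else float("nan"))
    return q


if __name__ == "__main__":
    # Made-up distribution of T_s for one age s (illustration only, not real data).
    example_pmf = [0.60, 0.25, 0.10, 0.05]
    death = conditional_death_probabilities(example_pmf)
    survival = [1.0 - x for x in death]  # survival is the complement of death
    for k, (d, s) in enumerate(zip(death, survival)):
        print(f"year {k + 1}: death {d:.4f}, survival {s:.4f}")
```

Because \(T_s\) takes no value above 3, the conditional death probability in the fourth year equals 1, which matches the assumption that a patient survives for at most four years.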

In Fig. 5 we present probabilities of death for terminally ill men (Fig. 5a) and women (Fig. 5b) with respect to age.

The statistical analyses presented in Sect. 4 and the Appendices were carried out using Statistica 10.

## 5 Conclusions

The multiple state model presented in Sect. 2 is suitable not only for modelling the critical illness insurance contracts but also for other health insurances. It can also be adapted to insurance contracts against the loss of income due to disability or the loss of health (income protection). The introduced model allows for combining CII with life insurance. In such a combination disease benefits are usually provided as an acceleration benefit to a life insurance.

Death probabilities (22)–(24) and (25)–(27) concerning the population of those suffering from lung cancer with metastasis (e.g. states 3, 4, 5, 6, 8) are needed to calculate the value of viatical settlement payments (Dȩbicka and Heilpern 2017).

## Notes

### Acknowledgements

We thank the reviewer for his/her valuable comments and suggestions which highly improved the presentation of the results.

## References

- Amsler MH (1968) Sur la Modélisation des Risques Vie par les Chaînes de Markov. Transactions of the 18th International Congress of Actuaries. München, vol 5, pp 731–746
- Andersen PK (1988) Multistate models in survival analysis: a study of nephropathy and mortality in diabetes. Stat Med 7:661–670
- Arbeev KG, Ukraintseva SV, Arbeeva LS, Yashin AI (2005a) Mathematical models for human cancer incidence rates. Demogr Res 12(10):237–272. https://doi.org/10.4054/DemRes.2005.12.10
- Arbeev KG, Ukraintseva SV, Arbeeva LS, Yashin AI (2005b) Decline in human cancer incidence rates at old ages: age-period-cohort considerations. Demogr Res 12(11):273–300. https://doi.org/10.4054/DemRes.2005.12.11
- Bhuyan VB (2009) Life markets: trading mortality and longevity risk with life settlements and linked securities. Wiley, Hoboken
- Bijwaard GE (2014) Multistate event history analysis with frailty. Demogr Res 30(58):1591–1620. https://doi.org/10.4054/DemRes.2014.30.58
- Bowers NL, Gerber HU, Hickman JC, Jones DA, Nesbitt CJ (1986) Actuarial mathematics. Society of Actuaries, Illinois
- Dash A, Grimshaw D (1993) Dread disease cover—an actuarial perspective. J Staple Inn Actuar Soc 33:149–193
- Data base of histories of hospitalization from Lower Silesia Department of National Health Fund. Unpublished due to confidentiality
- Dȩbicka J (2012) Modelowanie strumieni finansowych w ubezpieczeniach wielostanowych. Statistics and risk series. Editorial Office Wroclaw University of Economics, Wroclaw
- Dȩbicka J, Mazurek E (2001) Net single premium in the supplementary unemployment insurance. Acta Oecon Prag Stat Econom Methods Model Econ Process 9(3):43–52
- Dȩbicka J, Zmyślona B (2016) Construction of multi-state life tables for critical illness insurance—influence of age and sex on the incidence of health inequalities. Sil Stat Rev 14(2016):41–63
- Dȩbicka J, Heilpern S (2017) Investors expected profit from viatical settlements. In: 20-th AMSE. Applications of mathematics in economics. International scientific conference: Szklarska Porȩba, Poland, 30 Aug–3 Sep 2017. Conference proceedings, pp 105–116. https://doi.org/10.15611/amse.2017.20.24
- Diaconu V, Ouellette N, Camarda CG, Bourbeau R (2016) Insight on ’typical’ longevity: an analysis of the modal lifespan by leading causes of death in Canada. Demogr Res 35(17):595–628. https://doi.org/10.4054/DemRes.2016.35.17
- Gaag N, Bijwaard G, Beer J, Bonneux L (2005) A multistate model to project elderly disability in case of limited data. Demogr Res 32(3):75–106. https://doi.org/10.4054/DemRes.2015.32.3
- Gatzert N (2010) The secondary market for life insurance in the United Kingdom, Germany, and the United States: comparison and overview. Risk Manag Insur Rev 13(2):279–301
- Gregorius FK (1993) Disability insurance in The Netherlands. Insur Math Econ 13:101–116
- Haberman S (1983a) Decrement tables and the measurement of morbidity: I. J Inst Actuar 110:361–381
- Haberman S (1983b) Decrement tables and the measurement of morbidity: II. J Inst Actuar 111:73–86
- Haberman S, Pitacco E (1999) Actuarial models for disability insurance. Chapman & Hall, Boca Raton
- Hougaard P, Madsen EB (1985) Dynamic evaluation of short-term prognosis after myocardial infarction. Stat Med 4:29–38
- Janssen J, Manca R (2006) Applied semi-Markov processes. Springer, New York
- Janssen J, Manca R (2007) Semi-Markov risk models for finance, insurance and reliability. Springer, New York
- Jordan CW (1982) Life contingencies. The Society of Actuaries, Chicago
- Life Tables of Poland (2008). www.stat.gov.pl/en/topics/population/life-expectancy/
- Mattsson P (1997) Some reflections on different disability models. Scand Actuar J 1977(2):110–118
- Neeraj S (2003) Cashing out life insurance: an analysis of the viatical settlements market, RAND Corporation, Santa Monica, CA RGSD-175, 2003. http://www.rand.org/pubs/rgs-dissertations/RGSD175. 15 Apr 2013
- Pitacco E (2014) Health insurance. Basic actuarial models. EAA series. Springer, New York
- Pitacco E (1994) LTC insurance. From the multistate model to practical implementations. In: Proceedings of the XXV ASTIN Colloquium, Cannes, France, pp 437–452
- Putter H, Fiocco M, Geskus RB (2007) Tutorial in biostatistics: competing risks and multi-state models. Stat Med 26(11):2389–2430
- Ukraintseva V, Yashin AI (2003) Individual aging and cancer risk: how are they related? Demogr Res 9(8):163–196. https://doi.org/10.4054/DemRes.2003.9.8
- Vega A, Brazil N (2015) A multistate life table approach to understanding return and reentry migration between Mexico and the United States during later life. Demogr Res 33(43):1211–1240. https://doi.org/10.4054/DemRes.2015.33.43
- Wojciechowska U, Didkowska J (2014) Zachorowania i zgony na nowotwory złośliwe w Polsce. Krajowy Rejestr Nowotworów, Centrum Onkologii - Instytut im. Marii Skłodowskiej - Curie. www.onkologia.org.pl/raporty/. 10 Apr 2014

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.