1 Introduction

The determination of mortality models is one of the basic problems not only in the field of life insurance but, more recently, also in the economics of healthcare [1, 3,4,5]. Currently, the most frequently used model is the Lee-Carter model [9, 11, 12, 16, 17], which accounts not only for the change in mortality associated with age x and calendar year t but also for the influence of belonging to a particular generation (cohort effect), and takes the form \(\ln (\mu _{x,t})=\alpha _x+\beta _x k_t+\varepsilon _{x,t}\). The assumption that the estimated \(\alpha _x\) and \(\beta _x\) are constant in time t has drawn considerable criticism, especially from the point of view of forecasting. Therefore, there is a need to look for other methods of predicting mortality rates that take into account the variability of parameters over time. One such proposal is the approach recently put forward in Rossa and Socha [18], Rossa et al. [19], Sliwka and Socha [20], Sliwka [21, 22], which is based on the Milevsky-Promislov family of models [15] with extensions [2, 7, 13]. Methods of modeling \(\mu _{x,t}\) that take into account the causes of death have been characterized in [23].

Work on switching models, which consist of several subsystems with the same structure and different parameters and which can switch over time according to an unknown switching rule, has been undertaken, among others, in Sliwka and Socha [20]. In that paper it was shown that modeling the empirical mortality coefficients \(\mu _{x,t}\) using the second-order non-Gaussian linear scalar filter model with switchings (nGLSFo2s) gives a more precise estimate of \(\mu _{x,t}\) than the Gaussian linear scalar filter model with switchings (GLSFs) and the Lee-Carter model with switchings (LCs) for some fixed ages x.

In this paper we first propose three extended Milevsky and Promislov models with continuous non-Gaussian filters. We assume that excitations are modeled not only by the second-order but also by the fourth-order (nGLSFo4) and the sixth-order (nGLSFo6) polynomials of outputs from a linear nGLSF. We show that in the considered models some of the parameters can be estimated. Next, we use these models to create hybrid models, in which the submodels have the same structure and possibly different parameters. In both cases, the model parameters are estimated from the first and second moments of mortality rates. To the best of our knowledge, the mortality models proposed above, their hybrid versions, and the methods for estimating their parameters and switching points are new in the field of life insurance.

The paper is organized as follows. In Sect. 2 the basic notation and definitions of stochastic hybrid systems are introduced. In Sect. 3 three new basic models represented by even-order polynomials of outputs from a linear Gaussian filter are introduced, and the nonstationary solutions of the corresponding moment equations are presented. The derivation of these nonstationary solutions is given in the Appendix. In Sect. 4 the procedure of parameter estimation and determination of switching points is presented; the parameter estimation is based on an adapted numerical algorithm for a nonlinear minimization problem. In Sect. 5 we compare the empirical mortality rates with the theoretical ones obtained from the proposed models as well as from the standard LC model in two versions, with and without switchings. The last section summarizes the obtained results.

2 Mathematical Preliminaries

Throughout this paper we use the following notation. Let \(|\cdot |\) and \(\langle \cdot ,\cdot \rangle \) be the Euclidean norm and the inner product in \(\mathbb {R}^n\), respectively. We denote \(\mathbb {R}_{+} = [0,\infty )\), \(\mathbb {T}=[t_0,\infty ),\ t_0 \ge 0\). Let \(\varXi =\left( \varOmega ,\mathcal {F} ,\{\mathcal {F}_t\}_{t \ge 0}, \mathbb {P}\right) \) be a complete probability space with a filtration \(\{\mathcal {F}_t\}_{t \ge 0}\) satisfying the usual conditions. Let \(\sigma (t): \mathbb {R}_{+} \rightarrow \mathbb {S}\) be the switching rule, where \(\mathbb {S}= \{1,\ldots , N\}\) is the set of states. We denote the switching times by \(\tau _1,\tau _2,\ldots \) and assume that there is a finite number of switches on every finite time interval. Let \(W_k(t)\), \(k=1,\ldots ,M\), be independent Brownian motions. We assume that the processes \(W_k(t)\) and \(\sigma (t)\) are both \(\{\mathcal {F}_t\}_{t \ge 0}\)-adapted.

By a stochastic hybrid system we mean the vector Itô stochastic differential equation with a switching rule described by

$$\begin{aligned} d\mathbf{x}(t) = \mathbf{f}(\mathbf{x}(t),t, \sigma (t)) dt + \sum _{k = 1}^M \mathbf{g}_k(\mathbf{x}(t),t,\sigma (t))dW_k(t), \quad (\sigma (t_0),\mathbf{x}(t_0))=(\sigma _0,\mathbf{x}_0), \end{aligned}$$
(1)

where \(\mathbf{x}\in \mathbb {R}^n\) is the state vector, \((\sigma _0,\mathbf{x}_0)\) is an initial condition, \(t\in \mathbb {T}\) and M is the number of Brownian motions. The functions \(\mathbf{f}(\mathbf{x}(t),t,\sigma (t))\) and \(\mathbf{g}_k(\mathbf{x}(t),t, \sigma (t))\) are defined by the sets of functions \(\mathbf{f}(\mathbf{x}(t),t,l)\) and \(\mathbf{g}_k(\mathbf{x}(t),t,l)\), respectively, i.e.

$$\begin{aligned} \mathbf{f}(\mathbf{x}(t),t,\sigma (t))=\mathbf{f}(\mathbf{x}(t),t,l), \quad \mathbf{g}_k(\mathbf{x}(t),t,\sigma (t))=\mathbf{g}_k(\mathbf{x}(t),t,l)\quad \mathrm{for}\ \sigma (t)=l. \end{aligned}$$

The functions \(\mathbf{f}:\mathbb {R}^n \times \mathbb {T} \times \mathbb {S} \rightarrow \mathbb {R}^n\) and \(\mathbf{g}_k: \mathbb {R}^n \times \mathbb {T}\times \mathbb {S} \rightarrow \mathbb {R}^{n}\), \(k=1,\ldots ,M\), are locally Lipschitz and such that \(\mathbf{f}(\mathbf{0},t, l)=\mathbf{g}_k(\mathbf{0},t, l)=\mathbf{0}\) for all \(l\in \mathbb {S}\), \(t\in \mathbb {T}\). These conditions, together with those imposed on the switching rule \(\sigma \), ensure that there exists a unique solution to the hybrid system (1).

Hence it follows that Eq. (1) can be treated as a family (set) of subsystems defined by

$$\begin{aligned} d\mathbf{x}(t, l) = \mathbf{f}(\mathbf{x}(t),t, l) dt + \sum _{k = 1}^M \mathbf{g}_k(\mathbf{x}(t),t,l)dW_k(t), \quad l \in \mathbb {S}\end{aligned}$$
(2)

where \(\mathbf{x}(t, l)\in \mathbb {R}^n\) is the state vector of the l-th subsystem.

We assume additionally that the trajectories of the hybrid system are continuous. This means that when the stochastic system switches from subsystem \(l_1\) to subsystem \(l_2\) at the moment \(\tau _j\), then

$$\begin{aligned} \mathbf{x}(\tau _j, l_1) = \mathbf{x}(\tau _j, l_2 ), \quad l_1, l_2 \in \mathbb {S}. \end{aligned}$$
(3)
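The definitions (1)–(3) can be illustrated numerically. The following is a minimal sketch, assuming a scalar state, a single Brownian motion and an Euler-Maruyama discretization; the function name `simulate_switched_sde`, the two subsystems and all parameter values are hypothetical and serve only as an illustration. Because the state is simply carried over when the active subsystem changes, the simulated trajectory satisfies the continuity condition (3) by construction.

```python
import math
import random

def simulate_switched_sde(drift, diffusion, sigma, x0, t0, t_end, n_steps, seed=0):
    """Euler-Maruyama path of dx = f(x,t,l) dt + g(x,t,l) dW, with l = sigma(t)."""
    rng = random.Random(seed)
    dt = (t_end - t0) / n_steps
    t, x = t0, x0
    path = [(t, x, sigma(t))]
    for k in range(n_steps):
        l = sigma(t)                          # subsystem active on [t, t + dt)
        dw = rng.gauss(0.0, math.sqrt(dt))    # Brownian increment over dt
        # The state x is carried over unchanged when l changes, which is
        # the continuity condition (3) at a switching time.
        x = x + drift(x, t, l) * dt + diffusion(x, t, l) * dw
        t = t0 + (k + 1) * dt
        path.append((t, x, sigma(t)))
    return path

# Hypothetical two-state example with f(0,t,l) = g(0,t,l) = 0.
beta = {1: 0.5, 2: 2.0}    # assumed drift coefficients per subsystem
gamma = {1: 0.1, 2: 0.3}   # assumed noise intensities per subsystem
path = simulate_switched_sde(
    drift=lambda x, t, l: -beta[l] * x,
    diffusion=lambda x, t, l: gamma[l] * x,
    sigma=lambda t: 1 if t < 5.0 else 2,   # one switch at tau_1 = 5
    x0=1.0, t0=0.0, t_end=10.0, n_steps=1000,
)
```

Each entry of `path` is a triple (time, state, active subsystem), so the switching time can be read off from the third component.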

3 Models with Continuous Non-Gaussian Linear Scalar Filters

We consider a family of mortality models with a continuous nGLSF described by

$$\begin{aligned} \mu _x(t, l) = \mu _{x0}^l \exp \{ \alpha _x^l t + \sum _{i=1}^m q_{x_i}^l y^i(t, l) \} , \end{aligned}$$
(4)
$$\begin{aligned} d y(t, l) = - \beta _{x_1}^l y(t, l) dt + \gamma _{x_1}^l d W(t), \end{aligned}$$
(5)

where \(\mu _x(t,l)\) is a stochastic process representing the mortality rate for a person aged x (\(x \in X=\{0,1,\ldots ,\omega \}\)) at time t; \(\alpha _x^l\), \(\beta _{x_1}^l\), \(q_{x_i}^l\), \(i = 1, \ldots , m\), \(\mu _{x_0}^l\), \(\gamma _{x_1}^l\) are constant parameters, \(l \in \mathbb {S}\); W(t) is a standard Wiener process.

We will show that the proposed model (4), (5) can be written in the form (2) for all \(l \in \mathbb {S}\).

Introducing the new variables \(y_i(t, l) = y^i(t, l)\), \(i = 1, \ldots , m\) (so that \(y_1(t, l) = y(t, l)\)), \(l \in \mathbb {S}\), and applying the Itô formula, we obtain

$$\begin{aligned} d y_1(t, l) = - \beta _{x_1}^l y_1(t, l) dt + \gamma _{x_1}^l d W(t), \end{aligned}$$
(6)
$$\begin{aligned} d y_2(t, l) = [- 2\beta _{x_1}^l y_2(t, l) + (\gamma _{x_1}^l)^2] dt + 2\gamma _{x_1}^l y_1(t, l) d W(t), \end{aligned}$$
(7)
$$\begin{aligned} \vdots \end{aligned}$$
$$\begin{aligned} d y_m(t, l) = \left[ - m\beta _{x_1}^l y_{m}(t, l) + \frac{m(m-1)}{2}(\gamma _{x_1}^l)^2 y_{m-2}(t, l)\right] dt + m \gamma _{x_1}^l y_{m-1}(t, l) d W(t). \end{aligned}$$
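
As a short check of these equations (a sketch in which the indices x and l are suppressed for readability), the Itô formula applied to \(y^i\), with y governed by (5) and \((dy)^2 = \gamma ^2 dt\), gives

$$\begin{aligned} d (y^i) = i y^{i-1} dy + \frac{i(i-1)}{2} y^{i-2} (dy)^2 = \left[ - i \beta y^{i} + \frac{i(i-1)}{2} \gamma ^2 y^{i-2}\right] dt + i \gamma y^{i-1} dW(t). \end{aligned}$$

For \(i=2\) the term \(y^{i-2}\) equals 1, which reproduces Eq. (7); for \(i=1\) the second-order term vanishes and Eq. (6) is recovered.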

Taking the natural logarithm of both sides of Eq. (4) and applying the Itô formula for all \(l \in \mathbb {S}\) we find

$$\begin{aligned} d \ln \mu _x(t, l) = \alpha _x^l dt + \sum _{i=1}^m q_{x_i}^l d y_i(t, l) = \left\{ \alpha _x^l + \sum _{i=1}^m \left[ - i \beta _{x_1}^l q_{x_i}^l y_i(t, l) \right. \right. \end{aligned}$$
(8)
$$\begin{aligned} \left. \left. + \frac{i(i-1)}{2}q_{x_i}^l (\gamma _{x_1}^l)^2 y_{i-2}(t, l)\right] \right\} dt + \sum _{i=1}^m i \gamma _{x_1}^{l} q_{x_i}^l y_{i-1}(t, l) dW(t), \end{aligned}$$
(9)

where \(y_0(t, l) \equiv 1\) (the term containing \(y_{i-2}\) vanishes for \(i=1\)).

We now consider in detail three cases of model (4) and (6)–(7), namely m = 2, 4 and 6.

3.1 Model with Sixth-Order Output of a Scalar Linear Filter

Equations (4) and (6)–(7) for \(m=6\) take the form

$$\begin{aligned} \mu _x(t, l) = \mu _{x0}^l \exp \{ \alpha _x^l t + \sum _{i=1}^6 q_{x_i}^l y^i(t, l) \} , \end{aligned}$$
(10)
$$\begin{aligned} d y_1(t, l) = - \beta _{x_1}^l y_1(t, l) dt + \gamma _{x_1}^l d W(t), \end{aligned}$$
(11)
$$\begin{aligned} d y_2(t, l) = [- 2\beta _{x_1}^l y_2(t, l) + (\gamma _{x_1}^l)^2] dt + 2\gamma _{x_1}^l y_1(t, l) d W(t), \end{aligned}$$
(12)
$$\begin{aligned} d y_3(t, l)=[- 3\beta _{x_1}^l y_3(t, l) + 3(\gamma _{x_1}^l)^2 y_1(t, l)] dt + 3\gamma _{x_1}^l y_2(t, l)dW(t), \end{aligned}$$
(13)
$$\begin{aligned} d y_4(t, l) = [- 4\beta _{x_1}^l y_4(t, l) + 6(\gamma _{x_1}^l)^2 y_2(t, l)] dt + 4\gamma _{x_1}^l y_3(t, l) d W(t), \end{aligned}$$
(14)
$$\begin{aligned} d y_5(t, l) = [- 5\beta _{x_1}^l y_5(t, l) + 10(\gamma _{x_1}^l)^2 y_3(t, l)] dt + 5\gamma _{x_1}^l y_4(t, l) d W(t), \end{aligned}$$
(15)
$$\begin{aligned} d y_6(t, l) = [- 6\beta _{x_1}^l y_6(t, l) + 15(\gamma _{x_1}^l)^2 y_4(t, l)] dt + 6\gamma _{x_1}^l y_5(t, l) d W(t). \end{aligned}$$
(16)

Introducing a new state vector

$$\begin{aligned} \mathbf{z}_x(t, l) = [z_{x_1}(t, l),z_{x_2}(t, l), \cdots ,z_{x_7}(t, l) ]^T = \end{aligned}$$
(17)
$$\begin{aligned}{}[\ln \mu _x(t, l), y_1(t, l), y_2(t, l), y_3(t, l), y_4(t, l), y_5(t, l), y_6(t, l) ]^T, \end{aligned}$$
(18)

one can rewrite Eqs. (10)–(16) in the vector form

$$\begin{aligned} d \mathbf{z}_x(t, l)\!\! =\!\!\left[ \mathbf{A}_x^6(l) \mathbf{z}_x(t, l) + \mathbf{b}_x^6(l)\right] dt + \left[ \mathbf{C}_x^6(l) \mathbf{z}_x(t, l) + \mathbf{g}_x^6(l)\right] dW(t) \end{aligned}$$
(19)

where

$$\begin{aligned} \mathbf{A}_x^6(l) = [a_{ij}^l], \mathbf{b}_x^6(l) = [b_i^l], \mathbf{C}_x^6(l) = [c_{ij}^l], \mathbf{g}_x^6(l) = [g_i^l]. \end{aligned}$$
(20)

The nonzero elements of the matrices \(\mathbf{A}_x^6(l), \mathbf{C}_x^6(l)\) and vectors \(\mathbf{b}_x^6(l), \mathbf{g}_x^6(l)\) are defined by:

$$\begin{aligned} a_{12}^l = -\beta _{x_1}^l q_{x_1}^l + 3 q_{x_3}^l (\gamma _{x_1}^l)^2, a_{13}^l = - 2\beta _{x_1}^l q_{x_2}^l + 6 q_{x_4}^l(\gamma _{x_1}^l)^2, \end{aligned}$$
$$\begin{aligned} a_{14}^l = - 3 \beta _{x_1}^l q_{x_3}^l + 10 q_{x_5}^l(\gamma _{x_1}^l)^2, a_{15}^l= -4\beta _{x_1}^l q_{x_4}^l + 15 q_{x_6}^l (\gamma _{x_1}^l)^2, \end{aligned}$$
$$\begin{aligned} a_{16}^l = - 5\beta _{x_1}^l q_{x_5}^l, a_{17}^l = - 6 \beta _{x_1}^l q_{x_6}^l, a_{22}^l = -\beta _{x_1}^l, a_{33}^l = -2\beta _{x_1}^l, a_{42}^l = 3 (\gamma _{x_1}^l)^2, \end{aligned}$$
$$\begin{aligned} a_{44}^l = -3\beta _{x_1}^l, a_{53}^l = 6(\gamma _{x_1}^l)^2, a_{55}^l = -4\beta _{x_1}^l, a_{64}^l = 10(\gamma _{x_1}^l)^2, a_{66}^l = -5\beta _{x_1}^l, \end{aligned}$$
$$\begin{aligned} a_{75}^l = 15(\gamma _{x_1}^l)^2, a_{77}^l = -6\beta _{x_1}^l, b_1^l = \alpha _x^l + q_{x_2}^l (\gamma _{x_1}^l)^2, b_3^l = (\gamma _{x_1}^l)^2, \end{aligned}$$
$$\begin{aligned} c_{12}^l = 2q_{x_2}^l\gamma _{x_1}^l, c_{13}^l = 3q_{x_3}^l\gamma _{x_1}^l, c_{14}^l = 4q_{x_4}^l\gamma _{x_1}^l, c_{15}^l= 5q_{x_5}^l\gamma _{x_1}^l, c_{16}^l = 6q_{x_6}^l\gamma _{x_1}^l, c_{32}^l = 2\gamma _{x_1}^l, \end{aligned}$$
$$\begin{aligned} c_{43}^l = 3\gamma _{x_1}^l, c_{54}^l = 4\gamma _{x_1}^l, c_{65}^l = 5\gamma _{x_1}^l, c_{76}^l = 6\gamma _{x_1}^l, \end{aligned}$$
$$\begin{aligned} g_1^l = q_{x_1}^l\gamma _{x_1}^l, g_2^l = \gamma _{x_1}^l. \end{aligned}$$
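
The coefficients of the vector form (19) can also be assembled programmatically from Eqs. (8)–(9) and (11)–(16). The following is a minimal sketch, assuming zero-based indexing (index 0 corresponds to \(\ln \mu _x\) and index i to \(y_i\)); the function name `build_system` and the dictionary `q` of weights are hypothetical conveniences, not part of the original formulation.

```python
def build_system(alpha, beta, gamma, q):
    """Assemble (A, b, C, g) of dz = (A z + b) dt + (C z + g) dW.

    q maps the power i (1..m) to the weight q_{x_i}; the state is
    z = [ln mu, y_1, ..., y_m], stored with zero-based indices.
    """
    m = max(q)
    n = m + 1
    A = [[0.0] * n for _ in range(n)]
    C = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    g = [0.0] * n
    b[0] = alpha
    for i in range(1, m + 1):
        qi = q.get(i, 0.0)
        ito = 0.5 * i * (i - 1) * gamma ** 2   # Ito correction coefficient
        A[i][i] = -i * beta                    # drift of y_i
        A[0][i] += -i * beta * qi              # its contribution to ln mu
        if i == 1:
            g[i] = gamma                       # additive noise of y_1
            g[0] = qi * gamma
        else:
            C[i][i - 1] = i * gamma            # noise of y_i multiplies y_{i-1}
            C[0][i - 1] = i * qi * gamma
        if i == 2:
            b[i] = ito                         # y_0 = 1 gives a constant term
            b[0] += qi * ito
        elif i > 2:
            A[i][i - 2] = ito                  # correction couples y_i to y_{i-2}
            A[0][i - 2] += qi * ito
    return A, b, C, g
```

For example, with `q = {i: 1.0 for i in range(1, 7)}` the call returns 7×7 matrices whose one-based entry \(a_{12}\) is stored in `A[0][1]`.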

We note that, similarly to Eq. (2), we may treat Eq. (19) as a family (set) of subsystems. This means that we have obtained a new hybrid mortality model. The unknown parameters in the family of Eqs. (19) are

$$\begin{aligned} \ln \mu _{x_0}^l (=\alpha _{0_x}^l), \alpha _x^l, \beta _{x_1}^l, q_{x_1}^l, q_{x_2}^l, q_{x_3}^l, q_{x_4}^l, q_{x_5}^l, q_{x_6}^l, \gamma _{x_1}^l. \end{aligned}$$
(21)

3.2 Nonstationary Solutions

Using the linear vector stochastic differential Eq. (19) and the Itô formula we derive differential equations for the first-order moments \(E[z_{x_i}(l)] \) and second-order moments \(E[z_{x_i}(l) z_{x_j}(l)] \), \( i,j = 1, \ldots , 7\). Next, we find the nonstationary solutions for the first moment of the processes \(z_{x_i}(t,l)\) for nGLSF models of all orders, i.e. the (nGLSFo1), (nGLSFo2), ..., (nGLSFo6) models

$$\begin{aligned} E[z_{x_1}(t, l)] = \alpha _x^l t + \alpha _{0_x}^l,\quad l\in \mathbb {S}\end{aligned}$$
(22)

For the second moment of the processes \(z_{x_i}(t,l)\) we first find the nonstationary solutions for the even-order nGLSF models. For the sixth-order model it has the form

$$\begin{aligned} E[z_{x_1}^2\!(t,\! l)]\! = (\alpha _x^l)^2 t^2 - 2 \alpha _x^l \left[ -\alpha _{0_x}^l + q_{x_2}^l \frac{(\gamma _{x_1}^l)^2}{2 \beta _{x_1}^l} \right. \end{aligned}$$
(23)
$$\begin{aligned} \left. + 3q_{x_4}^l\left( \frac{(\gamma _{x_1}^l)^2}{2 \beta _{x_1}^l}\right) ^2 + 15 q_{x_6}^l \left( \frac{(\gamma _{x_1}^l)^2}{2 \beta _{x_1}^l}\right) ^3\right] t + c_{0_x}^l \end{aligned}$$
(24)

where \(l\in \mathbb {S}\), \(q_{x_2}^l=q_{x_4}^l=q_{x_6}^l=1\), and \(\alpha _{0_x}^l, c_{0_x}^l\) are constants of integration (see Sect. A).

To obtain the moment equations for nGLSF second and fourth order models and the corresponding stationary and nonstationary solutions we assume that:

  • in the case of second order model the parameters \(q_{x_2}^l=1\), and \(q_{x_4}^l=q_{x_6}^l=0\),

  • in the case of fourth order model the parameters \(q_{x_2}^l=q_{x_4}^l=1\), and \(q_{x_6}^l=0\).

The corresponding nonstationary solution for the second moment of the process \(z_{x_1}(t,l)\) takes the form:

$$\begin{aligned} E[z_{x_1}^2\!(t,\! l)]\! = (\alpha _x^l)^2 t^2 + 2 \alpha _x^l \alpha _{0_x}^l t - 2 \alpha _x^l q_{x_2}^l p t + c_{0_x}^l \end{aligned}$$
(25)

for nGLSF second order model, where \(c_{0_x}^l\) is an integration constant, and

$$\begin{aligned} E[z_{x_1}^2\!(t,\! l)]\! = (\alpha _x^l)^2 t^2 - 2 \alpha _x^l \left[ -\alpha _{0_x}^l + q_{x_2}^l p + 3q_{x_4}^l p^2 \right] t + c_{0_x}^l \end{aligned}$$
(26)

for nGLSF fourth order model, where \(c_{0_x}^l\) is an integration constant and \(p=\frac{(\gamma _{x_1}^l)^2}{2\beta _{x_1}^l}\).
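
Formulas (23)–(26) differ only in the bracketed polynomial in p, so they can be evaluated by a single routine. The sketch below assumes the active weights \(q_{x_i}^l\) are equal to 1, as stated above; the function name `second_moment` and its argument names are hypothetical. The coefficients 1, 3 and 15 coincide with the even moments of a zero-mean Gaussian variable with variance p (\(E[y^2]=p\), \(E[y^4]=3p^2\), \(E[y^6]=15p^3\)).

```python
def second_moment(t, alpha, alpha0, p, c0, order):
    """E[z_{x_1}^2(t, l)] from Eqs. (23)-(26) with the active q's set to 1.

    order selects the nGLSF model (2, 4 or 6); p = gamma^2 / (2 * beta).
    """
    if order not in (2, 4, 6):
        raise ValueError("order must be 2, 4 or 6")
    k = p                          # q_{x_2} * p
    if order >= 4:
        k += 3.0 * p ** 2          # + 3 * q_{x_4} * p^2
    if order == 6:
        k += 15.0 * p ** 3         # + 15 * q_{x_6} * p^3
    return alpha ** 2 * t ** 2 - 2.0 * alpha * (-alpha0 + k) * t + c0
```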

It can be proved that in the case of odd-order models the nonstationary solutions have similar forms, i.e. in the case of the first-order (nGLSFo1) model

$$\begin{aligned} E[z_{x_1}^2\!(t,\! l)]\!= (\alpha _x^l)^2 t^2 + 2 \alpha _x^l \alpha _{0_x}^l t + c_{0_x}^l \end{aligned}$$
(27)

and in the case of other odd order models, i.e. (nGLSFo3), (nGLSFo5), (nGLSFo7) models the nonstationary solutions are the same as the nonstationary solutions for nGLSF even order models, i.e. (nGLSFo2), (nGLSFo4), (nGLSFo6) models, respectively.

4 The Procedure of the Parameters Estimation and Determination of Submodels (based on Switching Points)

4.1 The Procedure

Simultaneous estimation of the parameters \(\alpha _0^l,\alpha _x^l,\beta _x^l,\gamma _x^l,c_{0_x}^l,q_x^l\) (where \(l\in \mathbb {S}\)) of the nGLSF models of order 2, 4 or 6 given by formulas (22)–(26) using traditional methods does not provide unambiguous results (this problem has already been considered in [20], part 4.1.1, in particular by considering the analytical formula for estimating the parameters of the GLSFo2 model). Therefore, a two-step procedure was used to estimate the parameters. In the first step, the parameters \(\alpha _{0_x}\) and \(\alpha _x\) of the first moment \(E[z_{x_1}(t)]\) of the process \(z_{x_1}(t)\) were estimated. In the second step, \(c_{0_x}\) and \(p_x\) of the second moment \(E[z^2_{x_1}(t)]\) were estimated based on the already known \(\widehat{\alpha _{0_x}^l}, \widehat{\alpha _x^l}\), where \(p_x\) is defined as \(p_x=\frac{\gamma _{x_1}^2}{2 \beta _{x_1}}\). The applied procedure yields unambiguous estimates of all parameters under the assumption that \(q_{x_i}=1\) for \(i=1,\ldots ,6\).
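
The two-step idea can be sketched for the second-order model as follows. This is only an illustration under simplifying assumptions: ordinary least squares replaces the nonlinear minimization used in this paper, the constraint on \(\alpha _{0_x}\) is not enforced, and the inputs `m1` and `m2` are hypothetical series of empirical first and second moments of \(\ln \mu _{x,t}\). Once \(\alpha _x\) and \(\alpha _{0_x}\) are known, the remainder of Eq. (25) is linear in t, with slope \(-2\alpha _x p_x\) and intercept \(c_{0_x}\).

```python
def fit_line(ts, ys):
    """Ordinary least-squares fit ys ~ slope * ts + intercept."""
    n = len(ts)
    tm = sum(ts) / n
    ym = sum(ys) / n
    sxx = sum((t - tm) ** 2 for t in ts)
    sxy = sum((t - tm) * (y - ym) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, ym - slope * tm

def two_step_estimate(ts, m1, m2):
    """Return (alpha, alpha0, p, c0) for the second-order model, q_{x_2} = 1."""
    # Step 1: first moment, Eq. (22): E[z] = alpha * t + alpha0.
    alpha, alpha0 = fit_line(ts, m1)
    # Step 2: second moment, Eq. (25). Subtracting the terms already known
    # from step 1 leaves -2 * alpha * p * t + c0, which is linear in t.
    rest = [v - alpha ** 2 * t ** 2 - 2.0 * alpha * alpha0 * t
            for t, v in zip(ts, m2)]
    slope, c0 = fit_line(ts, rest)
    p = -slope / (2.0 * alpha)
    return alpha, alpha0, p, c0
```

For the fourth- and sixth-order models the same slope equals \(-2\alpha _x\) times the polynomial in \(p_x\) from (26) or (23)–(24), so \(p_x\) would have to be obtained by solving a polynomial equation instead of a single division.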

One of the fundamental problems in the field of switching models is to find the set of switching points. This problem is closely related to the problem of segmentation of a time series discussed in many papers (see for instance [10, 14]).

We propose a procedure that combines a statistical test (based on [6]) with the so-called Top-Down algorithm. It has the following form.

First we introduce some notation. We assume that an extracted time series (Input) consists of n empirical values \(y_{emp_1},\) \(y_{emp_2}, \ldots , y_{emp_n}\) defined at the time points \(t_1, t_2,\ldots , t_n\), respectively. By \({<}t_1, t_2{>}\) we denote an interval that begins at \(t_1\) and ends at \(t_2\). We define three sets

  • \({\mathcal P}\) - the set of non-verified intervals,

  • \({\mathcal R}\) - the set of intervals without switching points,

  • \({\mathcal T}\) - the set of switching points.

Then the initial conditions have the form

$${<}t_1, t_n{>}\, \in {\mathcal P}, \quad {\mathcal R} = \phi , \quad {\mathcal T} = \phi .$$
  • Step 1

    We calculate the values of function \(L(*)\) given by formula (28)

    $$\begin{aligned}&L(\alpha _{0_x}^l,\alpha _x^l,\sigma ^2,\tau )=-\frac{\tau }{2} \ln (2\pi )-\frac{\tau }{2} \ln (\sigma ^2)-\frac{1}{2\sigma ^2}\sum _{i=1}^{\tau } (y_{emp_i}-E[z_{x_1}(i,l)])^2\nonumber \\&\qquad -\frac{n-\tau }{2}\ln (2\pi )-\frac{n-\tau }{2}\ln (\sigma ^2)-\frac{1}{2\sigma ^2}\sum _{i=\tau +1}^n (y_{emp_i}-E[z_{x_1}(i,l)])^2 \end{aligned}$$
    (28)

    for all points from an interval \({<}t_1, t_n{>}\) and assuming the random component \(\epsilon _t \sim N(\mu ,\sigma ^2)\).

    If \(\tilde{L}(\tau _1) = \max L(*)\) is attained at the beginning or at the end of the considered interval, then there is no switching point in this interval, and we obtain

    $${\mathcal P}= \phi , \quad {{<}t_1, t_n{>}\, \in \mathcal R}, \quad {\mathcal T} = \phi .$$

    If \(\tilde{L}(\tau _1) = \max L(*)\) is attained inside the interval at \(\tau _1 = t_k\), then

    $${<}t_1, t_k{>}, {<}t_{k+1}, t_n{>}\, \in {\mathcal P}, \quad {\mathcal R} = \phi , \quad {\tau _1} \in {\mathcal T}.$$
  • Step 2

    Choose an interval from the set \({\mathcal P}\) and check if its length is greater than 2.

  • Step 3

    If “no”, then transfer this interval from the set \({\mathcal P}\) to the set \({\mathcal R}\) and go back to Step 2; if “yes”, go back to Step 1 and apply it to this interval.

  • Step 4

    The procedure is ended when

    • \({\mathcal P} = \phi \),

    • \({\mathcal R}\) consists only of subintervals without switching points,

    • \({\mathcal T}\) consists of all switching points, which can be sorted in increasing order.
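
The Top-Down recursion over the sets \({\mathcal P}\), \({\mathcal R}\) and \({\mathcal T}\) can be sketched as follows. This is a simplified illustration, not the exact procedure of this paper: each segment is fitted with its own linear trend, in line with Eq. (22), the Gaussian log-likelihood is evaluated at the maximum-likelihood variance, and a hypothetical threshold `min_gain` on the log-likelihood improvement stands in for the statistical test of [6], which is not reproduced here. The names `top_down`, `min_len` and `min_gain` are assumptions.

```python
import math

def line_rss(ts, ys):
    """Residual sum of squares of an ordinary least-squares line."""
    n = len(ts)
    tm, ym = sum(ts) / n, sum(ys) / n
    sxx = sum((t - tm) ** 2 for t in ts)
    sxy = sum((t - tm) * (y - ym) for t, y in zip(ts, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return sum((y - ym - slope * (t - tm)) ** 2 for t, y in zip(ts, ys))

def log_lik(rss, n):
    """Gaussian log-likelihood at the ML variance rss / n (cf. Eq. (28))."""
    var = max(rss / n, 1e-12)   # floor avoids log(0) for perfect fits
    return -0.5 * n * (math.log(2.0 * math.pi) + math.log(var) + 1.0)

def top_down(ts, ys, min_len=3, min_gain=10.0):
    """Return (sorted switching points, sorted intervals without switches)."""
    pending = [(0, len(ts))]       # set P, as half-open index intervals
    closed, switches = [], []      # sets R and T
    while pending:
        lo, hi = pending.pop()
        n = hi - lo
        # A split is accepted only if it beats the unsplit fit by min_gain.
        best_k = None
        best_ll = log_lik(line_rss(ts[lo:hi], ys[lo:hi]), n) + min_gain
        for k in range(lo + min_len, hi - min_len + 1):
            rss = line_rss(ts[lo:k], ys[lo:k]) + line_rss(ts[k:hi], ys[k:hi])
            ll = log_lik(rss, n)
            if ll > best_ll:
                best_k, best_ll = k, ll
        if best_k is None:
            closed.append((lo, hi))            # no switching point here
        else:
            switches.append(ts[best_k - 1])    # last point of the left part
            pending += [(lo, best_k), (best_k, hi)]
    return sorted(switches), sorted(closed)
```

Intervals shorter than `2 * min_len` have no admissible split and are moved directly to the set of closed intervals, which plays the role of the length check in Steps 2 and 3.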

4.2 The Determination of Submodels

In Subsect. 4.1 we have established the set of switching points, which allows us to define the submodels. From (21) and further considerations we find that the unknown parameters in the family of Eqs. (19) are

$$\begin{aligned} \alpha _{0_x}^l, \alpha _x^l,p^l, c_{0_x}^l \end{aligned}$$
(29)

where \(p^l=\frac{(\gamma _{x_1}^l)^2}{2\beta _{x_1}^l}\), and the parameters \(q_{x_2},q_{x_4},q_{x_6}\) are equal to 0 or 1.

The parameters (29) appearing in formulas (23)–(26) were estimated using a numerical algorithm of nonlinear minimization with an additional constraint on \(\alpha _{0_x}^l\) (\(\alpha _{0_x}^l<0\) for all x). The algorithm generates a population of random starting points and then runs a local optimization method from each of them to converge to a local minimum. The best local minimum was chosen as the solution.
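
The multi-start strategy described above can be sketched as follows. The local optimizer actually used is not specified in the text, so a simple derivative-free compass search is assumed here as a stand-in; the function names `compass_search` and `multistart` are hypothetical, and the bounds are used only to draw the starting points (a constraint such as \(\alpha _{0_x}^l<0\) would have to be added to the objective, for instance as a penalty).

```python
import random

def compass_search(f, x0, step=0.5, tol=1e-8, max_iter=10000):
    """Local search: try +/- step along each coordinate, halve step on failure."""
    x, fx = list(x0), f(x0)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for d in (step, -step):
                y = list(x)
                y[i] += d
                fy = f(y)
                if fy < fx:
                    x, fx, improved = y, fy, True
        if not improved:
            step *= 0.5
    return x, fx

def multistart(f, bounds, n_starts=20, seed=0):
    """Run the local search from random starts and keep the best local minimum."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        x0 = [rng.uniform(lo, hi) for lo, hi in bounds]
        x, fx = compass_search(f, x0)
        if best is None or fx < best[1]:
            best = (x, fx)
    return best
```

In the setting of this section, `f` would be the sum of squared differences between the empirical second moments and the values given by (23)–(26), viewed as a function of \((p^l, c_{0_x}^l)\).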

For a fixed sex and a fixed age x, and with the switching points known (determined in accordance with the procedure described above), two sets of time series of \(\widehat{\mu _{x,t}}\) values were created. In the first case, the estimation of \(\widehat{\mu _{x,t}}\) was based on empirical data from 1958–2010 (using the next 6 years for ex-post error evaluation). In the second case, a similar estimation was based on the years 1958–2016. In both cases the theoretical value \(\widehat{\mu _{x,t}}\) at a fixed moment t was chosen from the theoretical values of the models (nGLSFo2), (nGLSFo4) and (nGLSFo6) by minimizing the absolute error (AE), i.e.

$$\begin{aligned} \min \limits _{i=2,4,6} |\widehat{\mu _{x,t}}^{nGLSFo_i}-\mu _{x,t}|. \end{aligned}$$
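
The AE criterion amounts to a pointwise selection among the three fitted series. A minimal sketch, assuming the fitted values are stored per model in a dictionary keyed by the model order (the function name `select_by_abs_error` and this data layout are hypothetical):

```python
def select_by_abs_error(empirical, candidates):
    """For each time index pick the candidate value closest to the data.

    candidates maps a model label (e.g. 2, 4, 6) to its list of fitted
    values; the result lists (label, value) pairs, one per time index.
    """
    chosen = []
    for t, mu in enumerate(empirical):
        label = min(candidates, key=lambda k: abs(candidates[k][t] - mu))
        chosen.append((label, candidates[label][t]))
    return chosen
```

The sequence of selected values forms the mixed-order series, and the selected labels record which submodel order was used at each time point.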

In addition, point forecasts for the period 2017–2025 have been determined. The parameters for the Lee-Carter model with switchings were estimated based on the formulas given in the literature [11] and using the same set of switches as in the case of the nGLSF model.

We note that the hybrid model (19) is continuous. However, the moment equations of the first and second-order defined by (22)–(27) are not continuous in switching points because the empirical data of mortality rates we have used were discrete, and these moments are determined separately for every submodel.

5 Results

Selected results for 45-year-old and 60-year-old women and men are presented in Figs. 1 and 2 (source of empirical data: [8]). In Figs. 1 and 2, blue circular points indicate empirical data; red, black and green solid lines indicate the theoretical values of the Lee-Carter (LCs), nGLSF order 2 (nGso2) and nGLSF mixed order 2, 4 and 6 (nGs) models with switchings, respectively; and the solid purple line indicates the forecast of the nG model (nGsf) for the next five years.

Fig. 1. Mortality rates for women (left side) and men (right side) aged 45: empirical values, theoretical values based on the LCs and nGs models, and forecasts (Color figure online)

Fig. 2. Mortality rates for women (left side) and men (right side) aged 60: empirical values, theoretical values based on the LCs and nGs models, and forecasts (Color figure online)

To verify the goodness of fit of the proposed nGs models with switchings to the empirical mortality rates and to compare it with that of the Lee-Carter model, the mean squared errors (MSE) between the empirical mortality rates \(\mu _{x,t}\) and the theoretical values \(\widehat{\mu _{x,t}}\) in the years 1958–2010 (‘10) and 1958–2016 (‘16), as well as the 95\(\%\) confidence intervals for the MSE, have been calculated. Selected results (for 45- and 60-year-old females and males) are presented in Table 1 (where \(CI_L\) and \(CI_U\) denote the lower and upper limits of the confidence interval, and \(\{W,M\}_{X,MSE}\) denotes the MSE value for a \(\{\)female, male\(\}\) aged X). The results in the fifth column illustrate the model (nGso2) considered in [20].

Table 1. Goodness of fit measures (woman-W, man-M) based on MSE.

The MSE values calculated on the basis of empirical and theoretical data from 1958–2016 and included in Table 1, together with Figs. 1 and 2, lead to the following conclusions:

  • the theoretical values of the mortality rate \(\widehat{\mu _{x,t}}^{nGs}\) based on the non-Gaussian linear scalar filters with switching are closer to the empirical values than \(\widehat{\mu _{x,t}}^{LCs}\) based on the LC model and \(\widehat{\mu _{x,t}}^{nGso2}\) with switching, for both 45-year-old and 60-year-old women and men,

  • the confidence interval is the narrowest for the nGs model among all models given in Table 1, which indicates greater precision of the proposed nGs model for forecasting than that of the other models presented here,

  • the empirical mortality rates for women are fitted more accurately by the proposed nGs model than those for men (lower MSE value),

  • based on the graphical results (Figs. 1 and 2), it can be seen that the proposed method of modeling \(\mu _{x,t}\) using nGs adapts to the empirical data more precisely than the LC model, especially for data with a large variance (see e.g. the empirical data from 1980–1990 for a 60-year-old man in Fig. 2, right side).

Moreover, taking into account all results for people aged \(x=0,\ldots ,100\) years (partly included in Table 1), it can be seen that the proposed nGs model fits the empirical data more accurately for younger than for older people (lower MSE for 45-year-old than for 60-year-old men and women).

6 Conclusions

In this paper, three extended Milevsky and Promislov models with excitations modeled by the second-, fourth- and sixth-order polynomials of outputs from a linear non-Gaussian filter are proposed and applied to Polish mortality data. To obtain hybrid models, procedures for parameter estimation and for the determination of switching points were proposed. From the theoretical values obtained from these three models, one series of theoretical values was constructed using the AE criterion and compared with the theoretical mortality rates based on the classical Lee–Carter model. In addition, a point forecast was computed. The obtained results confirm the usefulness of the switched model based on the continuous non-Gaussian process for modeling mortality rates.

A natural extension of the research presented in this article is the application of Markov chains (homogeneous or heterogeneous) to describe the state space built on the extended Milevsky and Promislov models with excitations modeled by the second-, fourth- and sixth-order polynomials. These issues will be examined in the next article.