1 Introduction

When Markov chains are used as mathematical models of natural or social phenomena, the transition intensities or probabilities are usually defined in terms of parameters that are relevant to the scientific question at hand. Sensitivity analysis of such models is important because it quantifies the dependence of the model behavior on the parameters. This chapter presents sensitivity results for finite-state, continuous-time absorbing Markov chains, paralleling the approach for discrete-time chains in Chap. 11. In absorbing chains, interest focuses on behavior prior to absorption (time spent in transient states and time to absorption) and on the probabilities of absorption in each absorbing state. Here we will derive formulae for the sensitivity and the elasticity (i.e., proportional sensitivity) of the moments of the time to absorption, the time spent in each transient state, and the number of visits to each transient state.

The most basic difference between discrete-time and continuous-time Markov chains is that the former are defined by transition probabilities, while the latter are defined by transition rates. This leads to differences in the structure of the matrices, but there is a nice parallelism in the results.

Perturbation analysis of Markov chains has a long history (Schweitzer 1968; Meyer 1975). Most of the literature, however, is devoted to discrete-time chains, and most of that focuses on ergodic chains and the perturbation analysis of the stationary distribution; e.g. Funderlic and Meyer (1986), Golub and Meyer (1986), Hunter (2005), Cho and Meyer (2000), and Seneta (1993). Much less attention has been paid to continuous-time chains. Perturbation expansions have been developed for the stationary distribution of ergodic continuous-time chains, with application to queueing models (Altman et al. 2004), and sensitivity results and perturbation bounds presented for transient solutions (Ramesh and Trivedi 1993; Mitrophanov 2004). The operations research literature contains many studies of the sensitivity of performance measures calculated over realizations of a continuous-time ergodic Markov chain; e.g., Cao (1989), Glasserman (1992), and Cao et al. (1996). The results to be presented here complement and extend the existing literature on perturbation analysis of Markov chains, by focusing on the statistical properties of the solutions of absorbing continuous-time chains, by introducing the use of matrix calculus, and (as a consequence of that technique) extending the range of parameters whose effects can be evaluated.

1.1 Absorbing Markov Chains

I consider a finite-state, homogeneous, continuous-time Markov chain with intensity matrix Q, where q ij is the rate of transition from state j to state i. The intensity matrix satisfies q ij ≥ 0 for i ≠ j and \(q_{jj} = -\sum_{i \neq j} q_{ij}\), so that each column of Q sums to zero. Note that Q is written in column-to-row orientation, and operates on column vectors. An absorbing chain contains at least one absorbing class of states. Numbering the states so that the transient states appear before the absorbing states leads to the intensity matrix

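$$\displaystyle \begin{aligned} {\mathbf{Q}} = \left(\begin{array}{cc} {\mathbf{U}} & {\mathbf{0}} \\ {\mathbf{M}} & {\mathbf{0}} \end{array}\right). \end{aligned} $$
(12.1)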

The matrix U contains rates of transitions among the transient states, and M contains the rates of transition from transient to absorbing states.

I assume that U and M are differentiable functions of a vector θ of parameters, and that Q[θ] remains an intensity matrix for sufficiently small perturbations of θ. This includes as a special case the situation where the elements of θ are simply some or all of the q ij, i ≠ j. The goal of the perturbation analysis is to obtain the derivatives of properties of the chain with respect to θ.
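The block structure described above can be checked numerically. The following is a minimal sketch, assuming NumPy and a hypothetical chain with two transient states and one absorbing state; the helper name `intensity_matrix` and the rate values are illustrative and are not taken from the chapter.

```python
import numpy as np

def intensity_matrix(U, M):
    """Assemble Q from U (transient block) and M (transient-to-absorbing block).

    Column-to-row orientation: the columns belonging to absorbing states are zero.
    """
    s = U.shape[0]  # number of transient states
    a = M.shape[0]  # number of absorbing states
    top = np.hstack([U, np.zeros((s, a))])
    bottom = np.hstack([M, np.zeros((a, a))])
    return np.vstack([top, bottom])

# Hypothetical rates, chosen only for illustration.
U = np.array([[-0.5, 0.1],
              [0.3, -0.4]])
M = np.array([[0.2, 0.3]])
Q = intensity_matrix(U, M)

# Every column of an intensity matrix sums to zero.
column_sums = Q.sum(axis=0)
```

Checking that `column_sums` is zero is a quick way to confirm that a perturbed Q[θ] is still an intensity matrix.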

2 Occupancy Time in Transient States

Let s be the number of transient states, and ν ij be the time spent in transient state i by an individual starting in transient state j. Define \({\mathbf {N}}_k = E \left ( \nu _{ij}^k \right )\) as the matrix whose entries are the kth moments, and \({\mathbf {N}}_{\mathrm {dg}} = \left ( {\mathbf {N}}_1 \right )_{\mathrm {dg}}\). The matrix N 1 of expectations is the fundamental matrix of the chain. The first several moments of occupancy times are given by the entries of the matrices

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{N}}_1 &\displaystyle =&\displaystyle - {\mathbf{U}}^{-1} {} \end{array} \end{aligned} $$
(12.2)
$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{N}}_2 &\displaystyle =&\displaystyle 2 {\mathbf{N}}_{\mathrm{dg}} {\mathbf{N}}_1 {} \end{array} \end{aligned} $$
(12.3)
$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{N}}_3 &\displaystyle =&\displaystyle 6 {\mathbf{N}}_{\mathrm{dg}}^2 {\mathbf{N}}_1 {} \end{array} \end{aligned} $$
(12.4)
$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbf{N}}_4 &\displaystyle =&\displaystyle 24 {\mathbf{N}}_{\mathrm{dg}}^3 {\mathbf{N}}_1 {} \end{array} \end{aligned} $$
(12.5)

and, in general, by

$$\displaystyle \begin{aligned} {\mathbf{N}}_k = k {\mathbf{N}}_{\mathrm{dg}} {\mathbf{N}}_{k-1} \qquad k\ge 2 {} \end{aligned} $$
(12.6)

(Iosifescu 1980, Thm. 8.7).
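As an illustration of (12.2) and the recursion (12.6), the sketch below computes the first four moment matrices. It assumes NumPy; the function name `occupancy_moments` and the matrix U are hypothetical.

```python
import numpy as np

def occupancy_moments(U, kmax=4):
    """Return [N_1, ..., N_kmax] using N_1 = -U^{-1} and N_k = k N_dg N_{k-1}."""
    N1 = -np.linalg.inv(U)
    Ndg = np.diag(np.diag(N1))  # diagonal matrix holding the diagonal of N_1
    moments = [N1]
    for k in range(2, kmax + 1):
        moments.append(k * Ndg @ moments[-1])
    return moments

# Hypothetical transient block (column-to-row orientation).
U = np.array([[-0.5, 0.1],
              [0.3, -0.4]])
N1, N2, N3, N4 = occupancy_moments(U)
```

Unrolling the recursion reproduces the closed forms (12.3), (12.4), and (12.5), with coefficients 2, 6, and 24.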

The differentials of the moments (12.2), (12.3), (12.4), and (12.5) are

$$\displaystyle \begin{aligned} \begin{array}{rcl} d \mbox{vec} \, {\mathbf{N}}_1 &\displaystyle =&\displaystyle \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes {\mathbf{N}}_1 \right) d \mbox{vec} \, {\mathbf{U}} {} \end{array} \end{aligned} $$
(12.7)
$$\displaystyle \begin{aligned} \begin{array}{rcl} d \mbox{vec} \, {\mathbf{N}}_2 &\displaystyle =&\displaystyle 2 \left\{ \rule{0in}{2.2ex} \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes {\mathbf{I}} \right) \mathcal{D}\,(\mbox{vec} \, {\mathbf{I}}) + \left( {\mathbf{I}} \otimes {\mathbf{N}}_{\mathrm{dg}} \right) \right\} \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes {\mathbf{N}}_1 \right) d \mbox{vec} \, {\mathbf{U}} {} \end{array} \end{aligned} $$
(12.8)
$$\displaystyle \begin{aligned} \begin{array}{rcl} d \mbox{vec} \, {\mathbf{N}}_3 &\displaystyle =&\displaystyle 6 \left\{ \rule{0in}{2.2ex} 2 \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes {\mathbf{N}}_{\mathrm{dg}} \right) \mathcal{D}\,(\mbox{vec} \, {\mathbf{I}} ) + \left( {\mathbf{I}} \otimes {\mathbf{N}}_{\mathrm{dg}}^2 \right) \right\} \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes {\mathbf{N}}_1 \right) d \mbox{vec} \, {\mathbf{U}} {}\\ \end{array} \end{aligned} $$
(12.9)
$$\displaystyle \begin{aligned} \begin{array}{rcl} d \mbox{vec} \, {\mathbf{N}}_4 &\displaystyle =&\displaystyle 24\left\{ \rule{0in}{2.2ex} 3 \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes {\mathbf{N}}_{\mathrm{dg}}^2 \right) \mathcal{D}\,(\mbox{vec} \, {\mathbf{I}}) + \left( {\mathbf{I}} \otimes {\mathbf{N}}_{\mathrm{dg}}^3 \right) \right\} \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes {\mathbf{N}}_1 \right) d \mbox{vec} \, {\mathbf{U}}\\ {} \end{array} \end{aligned} $$
(12.10)

where I = I s throughout. A recursive relation for all the moments is

$$\displaystyle \begin{aligned} d \mbox{vec} \, {\mathbf{N}}_k = k \left( {\mathbf{N}}_{k-1}^{\mathsf{T}} \otimes {\mathbf{I}} \right) \mathcal{D}\,(\mbox{vec} \, {\mathbf{I}}) d \mbox{vec} \, {\mathbf{N}}_1 + k \left( {\mathbf{I}} \otimes {\mathbf{N}}_{\mathrm{dg}} \right) d \mbox{vec} \, {\mathbf{N}}_{k-1} \qquad k \ge 2. {}\end{aligned} $$
(12.11)
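One way to gain confidence in expressions such as (12.7) and (12.8) is to compare them with finite differences. The sketch below does this for a hypothetical 2 × 2 matrix U, assuming NumPy and a column-stacking vec operator; all function names are illustrative.

```python
import numpy as np

def vec(X):
    """Stack the columns of X (column-major vec operator)."""
    return X.reshape(-1, order="F")

def fundamental(U):
    return -np.linalg.inv(U)

def second_moment(U):
    N1 = fundamental(U)
    return 2 * np.diag(np.diag(N1)) @ N1

def dvecN1_dvecU(U):
    """Jacobian of vec N_1 with respect to vec U, following (12.7)."""
    N1 = fundamental(U)
    return np.kron(N1.T, N1)

def dvecN2_dvecU(U):
    """Jacobian of vec N_2 with respect to vec U, following (12.8)."""
    s = U.shape[0]
    I = np.eye(s)
    N1 = fundamental(U)
    Ndg = np.diag(np.diag(N1))
    D_vecI = np.diag(vec(I))  # picks out the diagonal entries of a vec'd matrix
    return 2 * (np.kron(N1.T, I) @ D_vecI + np.kron(I, Ndg)) @ np.kron(N1.T, N1)

def numerical_jacobian(f, U, h=1e-6):
    """Central-difference Jacobian of vec f(U) with respect to vec U."""
    s = U.shape[0]
    J = np.zeros((s * s, s * s))
    for col in range(s * s):
        E = np.zeros(s * s)
        E[col] = h
        E = E.reshape(s, s, order="F")
        J[:, col] = (vec(f(U + E)) - vec(f(U - E))) / (2 * h)
    return J

# Hypothetical transient block.
U = np.array([[-0.5, 0.1],
              [0.3, -0.4]])
err1 = np.abs(dvecN1_dvecU(U) - numerical_jacobian(fundamental, U)).max()
err2 = np.abs(dvecN2_dvecU(U) - numerical_jacobian(second_moment, U)).max()
```

Both discrepancies should be at the level of finite-difference error. The check treats every entry of U as free, which is the sense in which the derivatives with respect to vec U are defined.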

The variance, standard deviation, and coefficient of variation of the ν ij are important in applications; they are

$$\displaystyle \begin{aligned} \begin{array}{rcl} V \left( \nu_{ij} \right) &\displaystyle =&\displaystyle {\mathbf{N}}_2 - {\mathbf{N}}_1 \circ {\mathbf{N}}_1 {} \end{array} \end{aligned} $$
(12.12)
$$\displaystyle \begin{aligned} \begin{array}{rcl} SD \left( \nu_{ij} \right) &\displaystyle =&\displaystyle \sqrt{V \left( \nu_{ij} \right)} {} \end{array} \end{aligned} $$
(12.13)
$$\displaystyle \begin{aligned} \begin{array}{rcl} CV \left( \nu_{ij} \right) &\displaystyle =&\displaystyle \mathcal{D}\, \left( \mbox{vec} \, {\mathbf{N}}_1 \right)^{-1} \mbox{vec} \, SD \left( \nu_{ij} \right) {}\vspace{-3pt} \end{array} \end{aligned} $$
(12.14)

where the square root is taken elementwise. Their derivatives are

$$\displaystyle \begin{aligned} \begin{array}{rcl} d \mbox{vec} \, V &\displaystyle =&\displaystyle 2 \left[ \rule{0in}{2.1ex} \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes {\mathbf{I}} \right) \mathcal{D}\,( \mbox{vec} \, {\mathbf{I}} ) + \left( {\mathbf{I}} \otimes {\mathbf{N}}_{\mathrm{dg}} \right) - \mathcal{D}\,(\mbox{vec} \, {\mathbf{N}}_1 ) \right] d \mbox{vec} \, {\mathbf{N}}_1 {}\\ \end{array} \end{aligned} $$
(12.15)
$$\displaystyle \begin{aligned} \begin{array}{rcl} d \mbox{vec} \, SD &\displaystyle =&\displaystyle \frac{1}{2} \mathcal{D}\, \left[ \rule{0in}{2.1ex} \mbox{vec} \, SD \left( \nu_{ij} \right) \right]^{-1} d \mbox{vec} \, V {} \end{array} \end{aligned} $$
(12.16)
$$\displaystyle \begin{aligned} \begin{array}{rcl} d \mbox{vec} \, CV &\displaystyle =&\displaystyle \mathcal{D}\, \left( \mbox{vec} \, {\mathbf{N}}_1 \right)^{-1} d \mbox{vec} \, SD \\ &\displaystyle &\displaystyle - \left[ \left( \mbox{vec} \, SD \right)^{\mathsf{T}} \mathcal{D}\, \left( \mbox{vec} \, {\mathbf{N}}_1 \right)^{-1} \otimes \mathcal{D}\, \left( \mbox{vec} \, {\mathbf{N}}_1 \right)^{-1} \right] \\ &\displaystyle &\displaystyle \times \mathcal{D}\, \left( \mbox{vec} \, {\mathbf{I}}_{s^2} \right) \left( {\mathbf{1}}_{s^2} \otimes {\mathbf{I}}_{s^2} \right) d \mbox{vec} \, {\mathbf{N}}_1 {} \end{array} \end{aligned} $$
(12.17)

(suppressing the arguments of V , SD and CV ). Because N 1 usually contains zeros, \(\mathcal {D}\,(\mbox{vec} \, {\mathbf {N}}_1)^{-1}\) must be restricted to the non-zero entries; the coefficient of variation is undefined if the mean is zero.
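The restriction to non-zero means can be handled with a mask. The sketch below, assuming NumPy and a hypothetical chain in which state 2 cannot return to state 1, computes (12.12), (12.13), and (12.14) elementwise and marks the undefined coefficient of variation as NaN; the function name and tolerance are illustrative choices.

```python
import numpy as np

def occupancy_variation(U, tol=1e-12):
    """Variance, SD and CV of occupancy times, with CV undefined where the mean is zero."""
    N1 = -np.linalg.inv(U)
    N2 = 2 * np.diag(np.diag(N1)) @ N1
    V = N2 - N1 * N1                        # Hadamard product, as in (12.12)
    SD = np.sqrt(np.clip(V, 0.0, None))     # clip guards against tiny negative rounding
    CV = np.full_like(N1, np.nan)
    reachable = N1 > tol                    # entries with a non-zero mean
    CV[reachable] = SD[reachable] / N1[reachable]
    return V, SD, CV

# Hypothetical transient block: u_12 = 0, so state 1 is never visited from state 2.
U = np.array([[-0.5, 0.0],
              [0.3, -0.4]])
V, SD, CV = occupancy_variation(U)
```

For this U the diagonal entries of the coefficient of variation equal 1, because the total time spent in the starting state is exponentially distributed.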

Derivation

The fundamental matrix is \({\mathbf{N}}_1 = -{\mathbf{U}}^{-1}\). Applying (2.82) yields (12.7). The derivatives of the higher moments are obtained by differentiating N 2, N 3, and N 4 in (12.3), (12.4), and (12.5). For example, the differential of N 4 is

$$\displaystyle \begin{aligned} d {\mathbf{N}}_4 = 24 \left\{ \rule{0in}{2.1ex} 3 {\mathbf{N}}_{\mathrm{dg}}^2 \left( d {\mathbf{N}}_{\mathrm{dg}} \right) {\mathbf{N}}_1 + {\mathbf{N}}_{\mathrm{dg}}^3 \left( d {\mathbf{N}}_1 \right) \right\}, \end{aligned} $$
(12.18)

using the fact that N dg commutes with itself and d N dg. Applying the vec operator gives

$$\displaystyle \begin{aligned} d \mbox{vec} \, {\mathbf{N}}_4 = 24 \left\{ \rule{0in}{2.1ex} 3 \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes {\mathbf{N}}_{\mathrm{dg}}^2 \right) d \mbox{vec} \, {\mathbf{N}}_{\mathrm{dg}} + \left( {\mathbf{I}}_s \otimes {\mathbf{N}}_{\mathrm{dg}}^3 \right) d \mbox{vec} \, {\mathbf{N}}_1 \right\}. \end{aligned} $$
(12.19)

Substituting (11.12) for dvec N dg and (12.7) for dvec N 1 gives (12.10). Results (12.8) and (12.9) are obtained in similar fashion.

Differentiating the recurrence relationship (12.6) gives

$$\displaystyle \begin{aligned} d {\mathbf{N}}_k = k \left( d {\mathbf{N}}_{\mathrm{dg}} \right) {\mathbf{N}}_{k-1} + k {\mathbf{N}}_{\mathrm{dg}} \left( d {\mathbf{N}}_{k-1} \right). \end{aligned} $$
(12.20)

Apply the vec operator,

$$\displaystyle \begin{aligned} d \mbox{vec} \, {\mathbf{N}}_k = k \left( {\mathbf{N}}_{k-1}^{\mathsf{T}} \otimes {\mathbf{I}}_s \right) d \mbox{vec} \, {\mathbf{N}}_{\mathrm{dg}} + k \left( {\mathbf{I}}_s \otimes {\mathbf{N}}_{\mathrm{dg}} \right) d \mbox{vec} \, {\mathbf{N}}_{k-1}, \end{aligned} $$
(12.21)

and substitute (11.12) for dvec N dg to obtain (12.11).

The derivative of V  in (12.15) comes from differentiating (12.12),

$$\displaystyle \begin{aligned} d V = d {\mathbf{N}}_2 - 2 {\mathbf{N}}_1 \circ d {\mathbf{N}}_1, \end{aligned} $$
(12.22)

applying the vec operator,

$$\displaystyle \begin{aligned} d \mbox{vec} \, V = d \mbox{vec} \, {\mathbf{N}}_2 - 2 \mathcal{D}\, \left( \mbox{vec} \, {\mathbf{N}}_1 \right) d \mbox{vec} \, {\mathbf{N}}_1, \end{aligned} $$
(12.23)

and then using (12.7) and (12.8). The derivative of \(SD \left ( \nu _{ij} \right )\) in (12.16) follows from (2.83). The derivative of \(CV \left ( \nu _{ij}\right )\) in (12.17) is obtained using (2.84), with x = vec SD and y = vec N 1.

3 Longevity: Time to Absorption

Let η j be the time to absorption for an individual currently in transient state j. The vectors of the kth moments of the time to absorption, η k, satisfy

$$\displaystyle \begin{aligned} \begin{array}{rcl} \boldsymbol{\eta}_1^{\mathsf{T}} &\displaystyle =&\displaystyle {\mathbf{1}}^{\mathsf{T}} {\mathbf{N}}_1 {} \end{array} \end{aligned} $$
(12.24)
$$\displaystyle \begin{aligned} \begin{array}{rcl} \boldsymbol{\eta}_2^{\mathsf{T}} &\displaystyle =&\displaystyle (2) {\mathbf{1}}^{\mathsf{T}} {\mathbf{N}}_1^2 {} \end{array} \end{aligned} $$
(12.25)
$$\displaystyle \begin{aligned} \begin{array}{rcl} \boldsymbol{\eta}_3^{\mathsf{T}} &\displaystyle =&\displaystyle (6) {\mathbf{1}}^{\mathsf{T}} {\mathbf{N}}_1^3 {} \end{array} \end{aligned} $$
(12.26)
$$\displaystyle \begin{aligned} \begin{array}{rcl} \boldsymbol{\eta}_4^{\mathsf{T}} &\displaystyle =&\displaystyle (24) {\mathbf{1}}^{\mathsf{T}} {\mathbf{N}}_1^4 {} \end{array} \end{aligned} $$
(12.27)

and in general

$$\displaystyle \begin{aligned} \boldsymbol{\eta}_k^{\mathsf{T}} = k \boldsymbol{\eta}_{k-1}^{\mathsf{T}} {\mathbf{N}}_1 \qquad k \ge 2 {} \end{aligned} $$
(12.28)

(Iosifescu 1980, Thm. 8.6)
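The recursion (12.28) translates directly into a short loop. The sketch below assumes NumPy; the function name and the example matrix are hypothetical. The moment vectors are returned as one-dimensional arrays, so the row vector \(\boldsymbol{\eta}_k^{\mathsf{T}}\) and the column vector \(\boldsymbol{\eta}_k\) are not distinguished.

```python
import numpy as np

def absorption_time_moments(U, kmax=4):
    """Return [eta_1, ..., eta_kmax] via eta_1^T = 1^T N_1 and eta_k^T = k eta_{k-1}^T N_1."""
    s = U.shape[0]
    N1 = -np.linalg.inv(U)
    eta = [np.ones(s) @ N1]          # column sums of N_1
    for k in range(2, kmax + 1):
        eta.append(k * eta[-1] @ N1)
    return eta

# Hypothetical transient block.
U = np.array([[-0.5, 0.1],
              [0.3, -0.4]])
eta1, eta2, eta3, eta4 = absorption_time_moments(U)
```

With a single transient state and exit rate r, the time to absorption is exponential and the function returns the familiar moments k!/r^k, which gives a simple sanity check.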

The variance, standard deviation, and coefficient of variation of the time to absorption are

$$\displaystyle \begin{aligned} \begin{array}{rcl} V(\boldsymbol{\eta}) &\displaystyle =&\displaystyle \boldsymbol{\eta}_2 - \boldsymbol{\eta}_1 \circ \boldsymbol{\eta}_1 {} \end{array} \end{aligned} $$
(12.29)
$$\displaystyle \begin{aligned} \begin{array}{rcl} SD \left( \boldsymbol{\eta} \right) &\displaystyle =&\displaystyle \sqrt{V \left( \boldsymbol{\eta} \right)} {} \end{array} \end{aligned} $$
(12.30)
$$\displaystyle \begin{aligned} \begin{array}{rcl} CV \left( \boldsymbol{\eta} \right) &\displaystyle =&\displaystyle \mathcal{D}\, \left( \boldsymbol{\eta}_1 \right)^{-1} SD \left( \boldsymbol{\eta} \right) {} \end{array} \end{aligned} $$
(12.31)

with the square root taken elementwise.

The derivatives of the moments in (12.24), (12.25), (12.26), and (12.27) are given by

$$\displaystyle \begin{aligned} \begin{array}{rcl} d \boldsymbol{\eta}_1 &\displaystyle =&\displaystyle \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes \boldsymbol{\eta}_1^{\mathsf{T}} \right) d \mbox{vec} \, {\mathbf{U}} {} \end{array} \end{aligned} $$
(12.32)
$$\displaystyle \begin{aligned} \begin{array}{rcl} d \boldsymbol{\eta}_2 &\displaystyle =&\displaystyle \left\{ 2 \left[ \left( {\mathbf{N}}_1^{\mathsf{T}} \right)^2 \otimes \boldsymbol{\eta}_1^{\mathsf{T}} \right] + 2 \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes \boldsymbol{\eta}_1^{\mathsf{T}} {\mathbf{N}}_1 \right) \right\} d \mbox{vec} \, {\mathbf{U}} {} \end{array} \end{aligned} $$
(12.33)
$$\displaystyle \begin{aligned} \begin{array}{rcl} d \boldsymbol{\eta}_3 &\displaystyle =&\displaystyle \left\{ \rule{0in}{3ex} 6 \left[ \left( {\mathbf{N}}_1^{\mathsf{T}} \right)^3 \otimes \boldsymbol{\eta}_1^{\mathsf{T}} \right] + 6 \left[ \left( {\mathbf{N}}_1^{\mathsf{T}} \right)^2 \otimes \boldsymbol{\eta}_1^{\mathsf{T}} {\mathbf{N}}_1 \right] \right. \\ &\displaystyle &\displaystyle + \left. 3 \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes \boldsymbol{\eta}_2^{\mathsf{T}} {\mathbf{N}}_1 \right) \rule{0in}{3ex} \right\} d \mbox{vec} \, {\mathbf{U}} {} \end{array} \end{aligned} $$
(12.34)
$$\displaystyle \begin{aligned} \begin{array}{rcl} d \boldsymbol{\eta}_4 &\displaystyle =&\displaystyle \left\{ 24 \left[ \left( {\mathbf{N}}_1^{\mathsf{T}} \right)^4 \otimes \boldsymbol{\eta}_1^{\mathsf{T}} \right] + 24 \left[ \left( {\mathbf{N}}_1^{\mathsf{T}} \right)^3 \otimes \boldsymbol{\eta}_1^{\mathsf{T}} {\mathbf{N}}_1 \right] \right. \\ &\displaystyle &\displaystyle ~~\left. + 12 \left[ \left( {\mathbf{N}}_1^{\mathsf{T}} \right)^2 \otimes \boldsymbol{\eta}_2^{\mathsf{T}} {\mathbf{N}}_1 \right] + 4 \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes \boldsymbol{\eta}_3^{\mathsf{T}} {\mathbf{N}}_1 \right) \right\} d \mbox{vec} \, {\mathbf{U}} {} \end{array} \end{aligned} $$
(12.35)

and, recursively,

$$\displaystyle \begin{aligned} d \boldsymbol{\eta}_k = k {\mathbf{N}}_1^{\mathsf{T}} d \boldsymbol{\eta}_{k-1} + k \left( {\mathbf{I}}_s \otimes \boldsymbol{\eta}_{k-1}^{\mathsf{T}} \right) d \mbox{vec} \, {\mathbf{N}}_1. {} \end{aligned} $$
(12.36)

The derivatives of the variance, standard deviation, and coefficient of variation of the time to absorption are (suppressing the arguments)

$$\displaystyle \begin{aligned} \begin{array}{rcl} d V &\displaystyle =&\displaystyle 2 \left\{ \left[ \left( {\mathbf{N}}_1^{\mathsf{T}} \right)^2 \otimes \boldsymbol{\eta}_1^{\mathsf{T}} \right] + \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes \boldsymbol{\eta}_1^{\mathsf{T}} {\mathbf{N}}_1 \right) - \mathcal{D}\, \left( \boldsymbol{\eta}_1 \right) \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes \boldsymbol{\eta}_1^{\mathsf{T}} \right) \right\} d \mbox{vec} \, {\mathbf{U}} {}\\ \end{array} \end{aligned} $$
(12.37)
$$\displaystyle \begin{aligned} \begin{array}{rcl} d SD &\displaystyle =&\displaystyle \frac{1}{2} \mathcal{D}\, \left( SD \right)^{-1} d V {} \end{array} \end{aligned} $$
(12.38)
$$\displaystyle \begin{aligned} \begin{array}{rcl} d CV &\displaystyle =&\displaystyle \mathcal{D}\, \left( \boldsymbol{\eta}_1 \right)^{-1} d SD - \left[ SD^{\mathsf{T}} \mathcal{D}\, \left( \boldsymbol{\eta}_1 \right)^{-1} \otimes \mathcal{D}\, \left( \boldsymbol{\eta}_1 \right)^{-1} \right] \\ &\displaystyle &\displaystyle \times \mathcal{D}\,(\mbox{vec} \, {\mathbf{I}}_s) \left( {\mathbf{1}}_s \otimes {\mathbf{I}}_s \right) d \boldsymbol{\eta}_1. {} \end{array} \end{aligned} $$
(12.39)
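The expressions (12.32), (12.33), and (12.37) can be compared with finite differences in the same way as the occupancy results. The following sketch assumes NumPy, a column-stacking vec operator, and a hypothetical 2 × 2 matrix U; the function names are illustrative.

```python
import numpy as np

def eta_moments(U):
    """Return N_1, eta_1 and eta_2 (moment vectors as 1-D arrays)."""
    N1 = -np.linalg.inv(U)
    eta1 = np.ones(U.shape[0]) @ N1
    eta2 = 2 * eta1 @ N1
    return N1, eta1, eta2

def d_eta1(U):
    """Jacobian of eta_1 with respect to vec U, following (12.32)."""
    N1, eta1, _ = eta_moments(U)
    return np.kron(N1.T, eta1[None, :])

def d_eta2(U):
    """Jacobian of eta_2 with respect to vec U, following (12.33)."""
    N1, eta1, _ = eta_moments(U)
    return 2 * (np.kron(N1.T @ N1.T, eta1[None, :])
                + np.kron(N1.T, (eta1 @ N1)[None, :]))

def d_variance(U):
    """Jacobian of V(eta) with respect to vec U, following (12.37)."""
    _, eta1, _ = eta_moments(U)
    return d_eta2(U) - 2 * np.diag(eta1) @ d_eta1(U)

def numerical_jacobian(f, U, h=1e-6):
    """Central-difference Jacobian of the vector f(U) with respect to vec U."""
    s = U.shape[0]
    cols = []
    for idx in range(s * s):
        E = np.zeros(s * s)
        E[idx] = h
        E = E.reshape(s, s, order="F")
        cols.append((f(U + E) - f(U - E)) / (2 * h))
    return np.column_stack(cols)

def variance(U):
    _, eta1, eta2 = eta_moments(U)
    return eta2 - eta1 ** 2

# Hypothetical transient block.
U = np.array([[-0.5, 0.1],
              [0.3, -0.4]])
err_eta1 = np.abs(d_eta1(U) - numerical_jacobian(lambda X: eta_moments(X)[1], U)).max()
err_var = np.abs(d_variance(U) - numerical_jacobian(variance, U)).max()
```

Each Jacobian has s rows and s² columns; multiplying by the derivative of vec U with respect to θ gives sensitivities to the underlying parameters.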

Derivation

Differentiating (12.24) for the expected time to absorption gives

$$\displaystyle \begin{aligned} d \boldsymbol{\eta}_1^{\mathsf{T}} = {\mathbf{1}}_s^{\mathsf{T}} d {\mathbf{N}}_1.\end{aligned} $$
(12.40)

Applying the vec operator, substituting (12.7) for dvec N 1, and simplifying gives (12.32). The derivatives of the higher moments are obtained in the same way; e.g., for η 4,

$$\displaystyle \begin{aligned} d \boldsymbol{\eta}_4^{\mathsf{T}} = (24) {\mathbf{1}}_s^{\mathsf{T}} \left[ \rule{0in}{2.2ex} \left( d {\mathbf{N}}_1 \right) {\mathbf{N}}_1^3 + {\mathbf{N}}_1 \left( d {\mathbf{N}}_1 \right) {\mathbf{N}}_1^2 + {\mathbf{N}}_1^2 \left( d {\mathbf{N}}_1 \right) {\mathbf{N}}_1 + {\mathbf{N}}_1^3 \left( d {\mathbf{N}}_1 \right) \right].\end{aligned} $$
(12.41)

Applying the vec operator yields

$$\displaystyle \begin{aligned} \begin{array}{rcl} d \boldsymbol{\eta}_4 &\displaystyle =&\displaystyle 24 \left\{ \left[ \left( {\mathbf{N}}_1^{\mathsf{T}} \right)^3 \otimes {\mathbf{1}}_s^{\mathsf{T}} \right] + \left[ \left( {\mathbf{N}}_1^{\mathsf{T}} \right)^2 \otimes {\mathbf{1}}_s^{\mathsf{T}} {\mathbf{N}}_1 \right] + \left[ {\mathbf{N}}_1^{\mathsf{T}} \otimes {\mathbf{1}}_s^{\mathsf{T}} {\mathbf{N}}_1^2 \right]\right.\\ &\displaystyle &\displaystyle \left.+ \left[ {\mathbf{I}}_s \otimes {\mathbf{1}}_s^{\mathsf{T}} {\mathbf{N}}_1^3 \right] \right\} d \mbox{vec} \, {\mathbf{N}}_1.\vspace{-3pt} \end{array} \end{aligned} $$
(12.42)

Substituting (12.7) for dvec N 1 and simplifying using Eqs. (12.24), (12.25), and (12.26) gives (12.35). The derivatives of the second and third moments, (12.33) and (12.34), are obtained in similar fashion.

The recursive formula (12.36) is obtained by differentiating (12.28)

$$\displaystyle \begin{aligned} d \boldsymbol{\eta}_k^{\mathsf{T}} = k \left( d \boldsymbol{\eta}_{k-1}^{\mathsf{T}} \right) {\mathbf{N}}_1 + k \boldsymbol{\eta}_{k-1}^{\mathsf{T}} d {\mathbf{N}}_1.\end{aligned} $$
(12.43)

Apply the vec operator,

$$\displaystyle \begin{aligned} d \boldsymbol{\eta}_k = k {\mathbf{N}}_1^{\mathsf{T}} d \boldsymbol{\eta}_{k-1} + k \left( {\mathbf{I}}_s \otimes \boldsymbol{\eta}_{k-1}^{\mathsf{T}} \right) d \mbox{vec} \, {\mathbf{N}}_1, \end{aligned} $$
(12.44)

substitute (12.7) for dvec N 1, and simplify, to obtain (12.36).

Differentiating (12.29) for the variance yields

$$\displaystyle \begin{aligned} d V = d \boldsymbol{\eta}_2 - 2 \boldsymbol{\eta}_1 \circ d \boldsymbol{\eta}_1. \end{aligned} $$
(12.45)

Applying the vec operator gives

$$\displaystyle \begin{aligned} d V = d \boldsymbol{\eta}_2 - 2 \mathcal{D}\, \left( \boldsymbol{\eta}_1 \right) d \boldsymbol{\eta}_1. \end{aligned} $$
(12.46)

Substituting (12.32) for d η 1 and (12.33) for d η 2 gives the result (12.37). The derivatives of the standard deviation, in (12.38), and the coefficient of variation, in (12.39), are obtained by differentiating (12.30) and (12.31) and applying (2.83) and (2.84).

4 Multiple Absorbing States and Probabilities of Absorption

Consider a chain that includes a > 1 absorbing states. The entry m ij of the a × s submatrix M in (12.1) is the rate of transition from transient state j to absorbing state i. The probabilities of absorption are defined as

$$\displaystyle \begin{aligned} b_{ij} = P \left[ \mbox{absorption in }i \left| \mbox{starting in }j \right. \right]. \end{aligned} $$
(12.47)

The a × s matrix \({\mathbf {B}} = \left (\begin {array}{c} b_{ij} \end {array}\right )\) is

$$\displaystyle \begin{aligned} {\mathbf{B}} = {\mathbf{M}} {\mathbf{N}}_1 {} \end{aligned} $$
(12.48)

(Iosifescu 1980, Section 8.5.6). Column j of B is the probability distribution of the eventual absorption state for an individual starting in transient state j. Usually a few starting states are of particular interest (e.g., states corresponding to “birth”). Let B(:, j) = Be j denote column j of B, where e j is the jth unit vector of length s. Then

$$\displaystyle \begin{aligned} d {\mathbf{B}}(:,j) = \left( {\mathbf{e}}_j^{\mathsf{T}} \otimes {\mathbf{I}}_a \right) d \mbox{vec} \, {\mathbf{B}}. {} \end{aligned} $$
(12.49)

Similarly, row i of B is \({\mathbf {B}}(i,:)={\mathbf {e}}_i^{\mathsf {T}} {\mathbf {B}}\) and

$$\displaystyle \begin{aligned} d \mbox{vec} \, {\mathbf{B}}(i,:) = \left( {\mathbf{I}}_s \otimes {\mathbf{e}}_i^{\mathsf{T}} \right) d \mbox{vec} \, {\mathbf{B}} {} \end{aligned} $$
(12.50)

where e i is the ith unit vector of length a. The derivative of B in (12.49) and (12.50) is

$$\displaystyle \begin{aligned} d \mbox{vec} \, {\mathbf{B}} = \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes {\mathbf{I}} \right) d \mbox{vec} \, {\mathbf{M}} + \left( {\mathbf{N}}_1^{\mathsf{T}} \otimes {\mathbf{B}} \right) d \mbox{vec} \, {\mathbf{U}}. {} \end{aligned} $$
(12.51)
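A small numerical illustration of (12.48) and (12.51) follows. It is a sketch assuming NumPy and a hypothetical chain with two transient and two absorbing states; the perturbation matrices `dU` and `dM` are arbitrary small directions used only to compare the first-order prediction with a central difference.

```python
import numpy as np

def vec(X):
    """Stack the columns of X (column-major vec operator)."""
    return X.reshape(-1, order="F")

def absorption_probabilities(U, M):
    """B = M N_1, as in (12.48)."""
    return M @ (-np.linalg.inv(U))

def dvecB(U, M):
    """Return the Jacobians of vec B with respect to vec M and vec U, following (12.51)."""
    a = M.shape[0]
    N1 = -np.linalg.inv(U)
    B = M @ N1
    return np.kron(N1.T, np.eye(a)), np.kron(N1.T, B)

# Hypothetical rates; each column of [U; M] sums to zero.
U = np.array([[-0.6, 0.1],
              [0.3, -0.5]])
M = np.array([[0.2, 0.1],
              [0.1, 0.3]])
B = absorption_probabilities(U, M)

# Arbitrary small perturbation directions.
dU = 1e-6 * np.array([[1.0, 2.0], [3.0, 4.0]])
dM = 1e-6 * np.array([[2.0, -1.0], [1.0, 3.0]])
J_M, J_U = dvecB(U, M)
predicted = (J_M @ vec(dM) + J_U @ vec(dU)).reshape(B.shape, order="F")
actual = (absorption_probabilities(U + dU, M + dM)
          - absorption_probabilities(U - dU, M - dM)) / 2
```

Each column of `B` sums to one, since absorption is certain, and `predicted` should agree with `actual` to within rounding error.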

Derivations

Differentiating (12.48) yields

$$\displaystyle \begin{aligned} d {\mathbf{B}} = \left( d {\mathbf{M}} \right) {\mathbf{N}}_{1} + {\mathbf{M}} \left( d {\mathbf{N}}_{1} \right). \end{aligned} $$
(12.52)

Applying the vec operator and simplifying gives

$$\displaystyle \begin{aligned} d \mbox{vec} \, {\mathbf{B}} = \left( {\mathbf{N}}_{1}^{\mathsf{T}} \otimes {\mathbf{I}} \right) d \mbox{vec} \, {\mathbf{M}} + \left( {\mathbf{I}} \otimes {\mathbf{M}} \right) d \mbox{vec} \, {\mathbf{N}}_{1} \end{aligned} $$
(12.53)

Substituting (12.7) for dvec N 1 and simplifying gives (12.51).

5 The Embedded Chain: Discrete Transitions Within a Continuous Process

If a continuous-time chain is observed only at the moments when it changes state, the result is a discrete-time process called the embedded Markov chain, or the jump chain, associated with Q (Iosifescu 1980, Section 8.3.2). The transition matrix of this embedded chain can be written

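$$\displaystyle \begin{aligned} \widehat{{\mathbf{P}}} = \left(\begin{array}{cc} \widehat{{\mathbf{U}}} & {\mathbf{0}} \\ \widehat{{\mathbf{M}}} & {\mathbf{I}}_a \end{array}\right) \end{aligned} $$
(12.54)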

where

$$\displaystyle \begin{aligned} \begin{array}{rcl} \widehat{{\mathbf{U}}} &\displaystyle =&\displaystyle {\mathbf{I}}_s - {\mathbf{U}} {\mathbf{U}}_{\mathrm{dg}}^{-1} {} \end{array} \end{aligned} $$
(12.55)
$$\displaystyle \begin{aligned} \begin{array}{rcl} \widehat{{\mathbf{M}}} &\displaystyle =&\displaystyle -{\mathbf{M}} {\mathbf{U}}_{\mathrm{dg}}^{-1}. {} \end{array} \end{aligned} $$
(12.56)

The embedded chain provides information on the number of visits to each transient state, rather than the time spent in each transient state. The expected numbers of such visits are given by the fundamental matrix

$$\displaystyle \begin{aligned} \widehat{{\mathbf{N}}}_1 = \left( {\mathbf{I}} - \widehat{{\mathbf{U}}} \right)^{-1}. {} \end{aligned} $$
(12.57)
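The construction in (12.55), (12.56), and (12.57) can be sketched as follows, assuming NumPy and the same kind of hypothetical two-transient, two-absorbing chain used for illustration elsewhere; the function name is not from the chapter.

```python
import numpy as np

def embedded_chain(U, M):
    """Return (U_hat, M_hat) of the jump chain, following (12.55) and (12.56)."""
    s = U.shape[0]
    Udg_inv = np.diag(1.0 / np.diag(U))   # inverse of the diagonal part of U
    U_hat = np.eye(s) - U @ Udg_inv
    M_hat = -M @ Udg_inv
    return U_hat, M_hat

# Hypothetical rates; each column of [U; M] sums to zero.
U = np.array([[-0.6, 0.1],
              [0.3, -0.5]])
M = np.array([[0.2, 0.1],
              [0.1, 0.3]])
U_hat, M_hat = embedded_chain(U, M)

# Expected numbers of visits to each transient state, (12.57).
N1_hat = np.linalg.inv(np.eye(2) - U_hat)

# Expected occupancy times of the continuous-time chain, (12.2).
N1 = -np.linalg.inv(U)
```

Because \({\mathbf{I}} - \widehat{{\mathbf{U}}} = {\mathbf{U}} {\mathbf{U}}_{\mathrm{dg}}^{-1}\), it follows from (12.55) and (12.57) that \(\widehat{{\mathbf{N}}}_1 = -{\mathbf{U}}_{\mathrm{dg}} {\mathbf{N}}_1\): the expected number of visits to state i is the expected time spent there multiplied by the rate of leaving it.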

The sensitivity analysis of the embedded chain follows directly from the discrete-time results in previous chapters (Chaps. 4 and 5).

In particular, the differential of \(\widehat {{\mathbf {N}}}_1\) is (Caswell 2006)

$$\displaystyle \begin{aligned} d \mbox{vec} \, \widehat{{\mathbf{N}}}_1 = \left( \widehat{{\mathbf{N}}}_1^{\mathsf{T}} \otimes \widehat{{\mathbf{N}}}_1 \right) d \mbox{vec} \, \widehat{{\mathbf{U}}}. \end{aligned} $$
(12.58)

However, this derivative is unlikely to be the sensitivity we are looking for. The continuous-time chain is likely to be parameterized in terms of the rate matrices U and M, rather than the probability matrices \(\widehat {{\mathbf {U}}}\) and \(\widehat {{\mathbf {M}}}\). To express the perturbation analysis of \(\widehat {{\mathbf {P}}}\) in terms of the parameters of Q requires the derivatives of the embedded chain with respect to the continuous chain; i.e.,

$$\displaystyle \begin{aligned} {d \mbox{vec} \, \widehat{{\mathbf{U}}} \over d \mbox{vec} \,^{\mathsf{T}} {\mathbf{U}}} \quad \mbox{and} \quad {d \mbox{vec} \, \widehat{{\mathbf{M}}} \over d \mbox{vec} \,^{\mathsf{T}} {\mathbf{M}}}. \end{aligned}$$

These derivatives are

$$\displaystyle \begin{aligned} \begin{array}{rcl} d \mbox{vec} \, \widehat{{\mathbf{U}}} &\displaystyle =&\displaystyle \left[ - \left( {\mathbf{U}}_{\mathrm{dg}}^{-1} \otimes {\mathbf{I}}_s \right) + \left( {\mathbf{U}}_{\mathrm{dg}}^{-1} \otimes {\mathbf{U}} {\mathbf{U}}_{\mathrm{dg}}^{-1} \right) \mathcal{D}\,(\mbox{vec} \, {\mathbf{I}}_s) \right] d \mbox{vec} \, {\mathbf{U}} {} \end{array} \end{aligned} $$
(12.59)
$$\displaystyle \begin{aligned} \begin{array}{rcl} d \mbox{vec} \, \widehat{{\mathbf{M}}} &\displaystyle =&\displaystyle - \left( {\mathbf{U}}_{\mathrm{dg}}^{-1} \otimes {\mathbf{I}}_a \right) d \mbox{vec} \, {\mathbf{M}} \\ &\displaystyle &\displaystyle + \left({\mathbf{I}}_s \otimes {\mathbf{M}} \right) \left( {\mathbf{U}}_{\mathrm{dg}}^{-1} \otimes {\mathbf{U}}_{\mathrm{dg}}^{-1} \right) \times \mathcal{D}\, \left( \mbox{vec} \, {\mathbf{I}}_s \right) d \mbox{vec} \, {\mathbf{U}}. {} \end{array} \end{aligned} $$
(12.60)

Using (12.58) and (12.59), one can write

$$\displaystyle \begin{aligned} {d \mbox{vec} \, \widehat{{\mathbf{N}}}_1 \over d \boldsymbol{\theta}^{\mathsf{T}}} = \left( \widehat{{\mathbf{N}}}_1^{\mathsf{T}} \otimes \widehat{{\mathbf{N}}}_1 \right) {d \mbox{vec} \, \widehat{{\mathbf{U}}} \over d \mbox{vec} \,^{\mathsf{T}} {\mathbf{U}}} \; {d \mbox{vec} \, {\mathbf{U}} \over d \boldsymbol{\theta}^{\mathsf{T}}}. \end{aligned} $$
(12.61)
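The chain rule in (12.61) can be checked numerically for the step from U to \(\widehat{{\mathbf{N}}}_1\). The sketch below assumes NumPy, a column-stacking vec operator, and a hypothetical 2 × 2 matrix U; it treats all entries of U as free, so the final factor (the derivative of vec U with respect to θ) is omitted.

```python
import numpy as np

def vec(X):
    """Stack the columns of X (column-major vec operator)."""
    return X.reshape(-1, order="F")

def visits(U):
    """Fundamental matrix of the embedded chain, from (12.55) and (12.57)."""
    s = U.shape[0]
    U_hat = np.eye(s) - U @ np.diag(1.0 / np.diag(U))
    return np.linalg.inv(np.eye(s) - U_hat)

def dvecUhat_dvecU(U):
    """Jacobian of vec U_hat with respect to vec U, following (12.59)."""
    s = U.shape[0]
    I = np.eye(s)
    Udg_inv = np.diag(1.0 / np.diag(U))
    D_vecI = np.diag(vec(I))
    return -np.kron(Udg_inv, I) + np.kron(Udg_inv, U @ Udg_inv) @ D_vecI

def dvecNhat_dvecU(U):
    """Jacobian of vec N_hat_1 with respect to vec U, combining (12.58) and (12.59)."""
    N_hat = visits(U)
    return np.kron(N_hat.T, N_hat) @ dvecUhat_dvecU(U)

def numerical_jacobian(f, U, h=1e-6):
    """Central-difference Jacobian of vec f(U) with respect to vec U."""
    s = U.shape[0]
    J = np.zeros((s * s, s * s))
    for col in range(s * s):
        E = np.zeros(s * s)
        E[col] = h
        E = E.reshape(s, s, order="F")
        J[:, col] = (vec(f(U + E)) - vec(f(U - E))) / (2 * h)
    return J

# Hypothetical transient block.
U = np.array([[-0.5, 0.1],
              [0.3, -0.4]])
err = np.abs(dvecNhat_dvecU(U) - numerical_jacobian(visits, U)).max()
```

Post-multiplying `dvecNhat_dvecU(U)` by a matrix of derivatives of vec U with respect to θ would give the sensitivity of the expected numbers of visits to the parameters.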

Derivation

Differentiate \(\widehat {{\mathbf {U}}}\) in (12.55),

$$\displaystyle \begin{aligned} d \widehat{{\mathbf{U}}} = - \left( d {\mathbf{U}} \right) {\mathbf{U}}_{\mathrm{dg}}^{-1} - {\mathbf{U}} \left( d {\mathbf{U}}_{\mathrm{dg}}^{-1} \right), \end{aligned} $$
(12.62)

apply the vec operator, and use (2.82) and (11.12) for \(d \mbox{vec} \, {\mathbf {U}}_{\mathrm {dg}}^{-1}\). The result is

$$\displaystyle \begin{aligned} \begin{array}{rcl} d \mbox{vec} \, \widehat{{\mathbf{U}}} &\displaystyle =&\displaystyle - \left[ \left( {\mathbf{U}}_{\mathrm{dg}}^{-1} \right)^{\mathsf{T}} \otimes {\mathbf{I}}_s \right] d \mbox{vec} \, {\mathbf{U}} - \left( {\mathbf{I}}_s \otimes {\mathbf{U}} \right) d \mbox{vec} \, {\mathbf{U}}_{\mathrm{dg}}^{-1} \\ {} &\displaystyle =&\displaystyle - \left( {\mathbf{U}}_{\mathrm{dg}}^{-1} \otimes {\mathbf{I}}_s \right) d \mbox{vec} \, {\mathbf{U}} + \left( {\mathbf{I}}_s \otimes {\mathbf{U}} \right) \left( {\mathbf{U}}_{\mathrm{dg}}^{-1} \otimes {\mathbf{U}}_{\mathrm{dg}}^{-1} \right) \mathcal{D}\,(\mbox{vec} \, {\mathbf{I}}_s) d \mbox{vec} \, {\mathbf{U}} \end{array} \end{aligned} $$

which simplifies to give (12.59). Similarly, differentiating \(\widehat {{\mathbf {M}}}\) in (12.56) and applying the vec operator gives

$$\displaystyle \begin{aligned} d \mbox{vec} \, \widehat{{\mathbf{M}}} = - \left( {\mathbf{U}}_{\mathrm{dg}}^{-1} \otimes {\mathbf{I}}_a \right) d \mbox{vec} \, {\mathbf{M}} - \left( {\mathbf{I}}_s \otimes {\mathbf{M}} \right) d \mbox{vec} \, {\mathbf{U}}_{\mathrm{dg}}^{-1}. \end{aligned} $$
(12.63)

Using (2.82) and (11.12) for \(d \mbox{vec} \, {\mathbf {U}}_{\mathrm {dg}}^{-1}\) and simplifying gives (12.60).

6 An Example: A Model of Disease Progression

An important area of application of continuous-time Markov chains is the modelling of transitions among disease states. In this context, the time to absorption is longevity, and the time spent in various transient states has implications for the quality of life during the disease. Fix and Neyman (1951) introduced the idea and proposed a 4-state model for cancer, with two transient states (under treatment or not) and two absorbing states (death from cancer or from other causes). Kay (1986) proposed a model with k disease states and an absorbing state representing death. There is now a large literature on such models and their estimation. Recently, studies have proliferated that use Markov chain models of disease transmission to explore the cost-effectiveness of screening and treatment procedures (e.g., Kuo et al. 1999; Chen et al. 1999; Wu et al. 2006; Sonnenberg and Beck 1993).

Sensitivity analysis reveals how these demographic properties respond to changes in parameters. As an example, I consider a model for the progression of colorectal cancer (CRC) that was developed to study the cost-effectiveness of a new CRC screening technique based on DNA testing of stool samples (Wu et al. 2006). The model includes 7 transient states (normal, small and large adenoma, early and late preclinical CRC, and early and late clinical CRC) and 2 absorbing states (death from CRC and death from other causes); see Fig. 12.1. Parameters were estimated from the literature and from clinical studies in Taiwan.

Fig. 12.1 State transition diagram for an absorbing Markov chain model of colorectal cancer (CRC) progression. The model includes 7 transient states based on the stage of development of adenoma (polyps) or cancer, and two absorbing states corresponding to death from CRC and death from other causes (OCD). Transition rates are given by λ i, and mortality rate from other causes by μ. (Modified, under the terms of a Creative Commons Attribution License, from Figure 1 of Wu et al. 2006)

This model, which describes the so-called natural history of the disease, was embedded in a larger decision model to compare the cost-effectiveness of screening strategies. The intensity matrix (12.1) corresponding to Fig. 12.1 is

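$$\displaystyle \begin{aligned} {\mathbf{Q}} = \left(\begin{array}{ccccccc|cc} -(\lambda_1+\mu) & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ \lambda_1 & -(\lambda_2+\mu) & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & \lambda_2 & -(\lambda_3+\mu) & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & \lambda_3 & -(\lambda_4+\lambda_5+\mu) & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \lambda_4 & -(\lambda_6+\mu) & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \lambda_5 & 0 & -(\lambda_7+\mu) & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \lambda_6 & 0 & -(\lambda_8+\mu) & 0 & 0 \\ \hline 0 & 0 & 0 & 0 & 0 & \lambda_7 & \lambda_8 & 0 & 0 \\ \mu & \mu & \mu & \mu & \mu & \mu & \mu & 0 & 0 \end{array}\right). \end{aligned} $$
(12.64)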

The λ i are transition rates; μ is the mortality rate from other causes of death. The incidence rate of small adenoma (λ 1) and the mortality rate due to other causes of death (μ) are age-dependent. Here I analyze values for age 70, based on figures in Wu et al. (2006). This leads to the parameter vector (all rates are per year):

$$\displaystyle \begin{aligned} \boldsymbol{\theta} = \left(\begin{array}{c} \lambda_1 \\ \vdots\\ \lambda_8 \\ \mu \end{array}\right) = \left(\begin{array}{r} 1.52\times 10^{-2}\\ 3.46\times 10^{-2}\\ 2.15\times 10^{-2}\\ 3.70\times 10^{-1}\\ 2.38\times 10^{-1}\\ 4.85\times 10^{-1}\\ 3.02\times 10^{-2}\\ 2.10\times 10^{-1}\\ 2.20\times 10^{-2} \end{array}\right). \end{aligned} $$
(12.65)

6.1 Sensitivity Results

The fundamental matrix (12.2) is

$$\displaystyle \begin{aligned} {\mathbf{N}}_1 = \left(\begin{array}{rrrrrrr} 26.9& 0& 0& 0& 0& 0& 0 \\ 7.2&17.7& 0& 0& 0& 0& 0 \\ 5.7&14.0&23.0& 0& 0& 0& 0 \\ 0.2& 0.5& 0.8& 1.6& 0& 0& 0 \\ 0.1& 0.4& 0.6& 1.2& 2.0& 0& 0 \\ 0.9& 2.2& 3.6& 7.2& 0&19.2& 0 \\ 0.3& 0.7& 1.2 & 2.4& 4.1& 0.00& 4.3 \end{array}\right). \end{aligned} $$
(12.66)
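The fundamental matrix can be recomputed from the parameter vector (12.65). The sketch below assumes NumPy. The transition structure coded in `build_U` (1 to 2 to 3 to 4, then 4 to 5 and 4 to 6, then 5 to 7, with death from other causes at rate μ from every state and CRC death from states 6 and 7) is an assumption inferred from the description of the model and from the zero pattern and diagonal of (12.66); it is not transcribed from the source.

```python
import numpy as np

# Parameter vector (12.65): lambda_1..lambda_8 followed by mu, all per year.
theta = np.array([1.52e-2, 3.46e-2, 2.15e-2, 3.70e-1, 2.38e-1,
                  4.85e-1, 3.02e-2, 2.10e-1, 2.20e-2])

def build_U(theta):
    """Transient block U for the CRC model (ASSUMED structure, see lead-in)."""
    l1, l2, l3, l4, l5, l6, l7, l8, mu = theta
    off = np.zeros((7, 7))
    off[1, 0] = l1   # normal -> small adenoma
    off[2, 1] = l2   # small adenoma -> large adenoma
    off[3, 2] = l3   # large adenoma -> early preclinical CRC
    off[4, 3] = l4   # early preclinical -> late preclinical
    off[5, 3] = l5   # early preclinical -> early clinical
    off[6, 4] = l6   # late preclinical -> late clinical
    # Total rate of leaving each state, including death (mu, and l7 or l8 for CRC death).
    exit_rates = np.array([l1 + mu, l2 + mu, l3 + mu, l4 + l5 + mu,
                           l6 + mu, l7 + mu, l8 + mu])
    return off - np.diag(exit_rates)

U = build_U(theta)
N1 = -np.linalg.inv(U)
N1_rounded = np.round(N1, 1)
```

Under these assumptions `N1_rounded` agrees with the printed entries of (12.66) to within about 0.1; small differences are to be expected because the parameters in (12.65) are themselves rounded.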

Thus, given these rates, a 70-year-old individual in the normal state would expect to spend about 27 years in stage 1, and only 0.9 and 0.3 years in stages 6 and 7 (early and late clinical CRC). Individuals in more advanced stages can expect to spend progressively longer periods in stages 6 and 7 (compare across rows 6 and 7 of N 1).

The standard deviations (12.13) of the times spent in the transient states are

$$\displaystyle \begin{aligned} SD \left(\nu_{ij} \right) = \left(\begin{array}{rrrrrrr} 26.9& 0& 0& 0& 0& 0& 0 \\ 14.2&17.7& 0& 0& 0& 0& 0 \\ 15.2&21.2&23.0& 0& 0& 0& 0 \\ 0.8& 1.1& 1.4& 1.6& 0& 0& 0 \\ 0.7& 1.1& 1.4& 1.8& 2.0& 0& 0 \\ 5.8& 8.9&11.2&15.0& 0&19.2& 0 \\ 1.6& 2.4& 3.0& 3.9& 4.3& 0& 4.3 \end{array}\right). \end{aligned} $$
(12.67)

Clearly, considerable variation can be expected in the times spent in the various states; the standard deviation equals or exceeds the mean in every case.
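A sketch of how the standard deviations in (12.67) could be computed follows. Equation (12.13) is not repeated in this section, so the code assumes the usual second-moment result for occupancy times in a continuous-time absorbing chain, E(ν ij²) = 2 (N dg N) ij, where N dg is the diagonal part of N; this assumption reproduces (12.67) to the printed precision. The helper `transient_matrix` and its transition structure are likewise assumptions, inferred from the zero pattern of (12.66).

```python
import numpy as np

# Parameter vector (12.65): lambda_1..lambda_8, then mu (all per year).
theta = np.array([1.52e-2, 3.46e-2, 2.15e-2, 3.70e-1, 2.38e-1,
                  4.85e-1, 3.02e-2, 2.10e-1, 2.20e-2])

def transient_matrix(theta):
    """Transient block U (columns = origin states); ASSUMED transition structure."""
    lam, mu = theta[:8], theta[8]
    U = np.zeros((7, 7))
    # (destination, origin, index into lam) for moves among transient states
    for dest, orig, k in [(2, 1, 0), (3, 2, 1), (4, 3, 2),
                          (5, 4, 3), (6, 4, 4), (7, 5, 5)]:
        U[dest - 1, orig - 1] = lam[k]
    # exit rate of each state = its outgoing lambdas + mu
    out = np.array([lam[0], lam[1], lam[2], lam[3] + lam[4],
                    lam[5], lam[6], lam[7]])
    U[np.diag_indices(7)] = -(out + mu)
    return U

N = np.linalg.inv(-transient_matrix(theta))
N_dg = np.diag(np.diag(N))            # diagonal part of N
second_moment = 2.0 * N_dg @ N        # E(nu_ij^2), assumed form of the second moment
V = second_moment - N * N             # variance; N * N is the elementwise square
SD = np.sqrt(np.maximum(V, 0.0))      # clip tiny negative round-off before the root
print(np.round(SD, 1))
```

Since V ij − N ij² = 2 N ij (N ii − N ij) and N ij cannot exceed N ii, the standard deviation can never fall below the mean in this model, which is the pattern noted above.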

Considering the sensitivity analysis of the time spent in transient states, focus on the fate of a normal (state 1) individual. The expected times spent in each state by such an individual are given by N 1(:, 1). From (12.7) and (2.55) the sensitivity and elasticity of N 1(:, 1) are

$$\displaystyle \begin{aligned} \begin{array}{rcl} {d {\mathbf{N}}_1(:,1) \over d \boldsymbol{\theta}^{\mathsf{T}}} &=& \left(\begin{array}{rrrrrrrrr} -722.6& 0& 0 & 0 & 0 & 0 & 0 & 0& -722.6 \\ 280.9& -127.5& 0 & 0 & 0 & 0 & 0 & 0& -321.6 \\ 223.4& 64.5& -132.0 & 0 & 0 & 0 & 0 & 0& -387.8 \\ 7.6& 2.2& 4.6 & -0.3 & -0.3 & 0 & 0 & 0& -13.5 \\ 5.6& 1.6& 3.4 & 0.2 & -0.2 & -0.3 & 0 & 0& -10.2 \\ 34.8& 10.0& 21.0 & -1.4 & 2.3 & 0 & -17.1 & 0& -79.0 \\ 11.6& 3.4& 7.0 & 0.3 & -0.5 & 0 & 0 & -1.3& -22.5 \end{array}\right) \\ {} {\epsilon {\mathbf{N}}_1(:,1) \over \epsilon \boldsymbol{\theta}^{\mathsf{T}}} &=& \left(\begin{array}{rrrrrrrrr} -0.4 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -0.6 \\ 0.6 & -0.6 & 0 & 0 & 0 & 0 & 0 & 0 & -1.0 \\ 0.6 & 0.4 & -0.5 & 0 & 0 & 0 & 0 & 0 & -1.5 \\ 0.6 & 0.4 & 0.5 & -0.6 & -0.4 & 0 & 0 & 0 & -1.5\\ 0.6 & 0.4 & 0.5 & 0.4 & -0.4 & -1.0 & 0 & 0 & -1.5 \\ 0.6 & 0.4 & 0.5 & -0.6 & 0.6 & 0 & -0.6 & 0 & -1.9 \\ 0.6 & 0.4 & 0.5 & 0.4 & -0.4 & 0.0 & 0 & -0.9 & -1.7 \end{array}\right). {} \end{array} \end{aligned} $$
(12.68)

These elasticities imply that a 1% increase in λ 1 will (to first order) cause about a 0.4% decrease in the mean time spent in the normal state and a 0.6% increase in the mean time spent in each other state. A 1% increase in λ 4 (the rate of transition between early and late preclinical CRC) creates a 0.6% decrease in the time spent in stages 4 and 6 (the early CRC stages) and a 0.4% increase in the time spent in stages 5 and 7 (the late CRC stages). An increase in the mortality rate μ due to other causes of death reduces the time spent in any of the transient states.
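The matrices in (12.68) can be approximated with a few lines of code. The sketch below does not use the vec-operator form of (12.7); instead it assumes the elementary identity dN/dθ k = N (dU/dθ k) N, which follows from differentiating N = (−U)⁻¹. The helper `transient_matrix` and its transition structure are assumptions inferred from the zero pattern of (12.66). Because U is linear in θ under that structure, dU/dθ k is obtained exactly by evaluating the helper at a unit vector.

```python
import numpy as np

# Parameter vector (12.65): lambda_1..lambda_8, then mu (all per year).
theta = np.array([1.52e-2, 3.46e-2, 2.15e-2, 3.70e-1, 2.38e-1,
                  4.85e-1, 3.02e-2, 2.10e-1, 2.20e-2])

def transient_matrix(theta):
    """Transient block U (columns = origin states); ASSUMED transition structure."""
    lam, mu = theta[:8], theta[8]
    U = np.zeros((7, 7))
    # (destination, origin, index into lam) for moves among transient states
    for dest, orig, k in [(2, 1, 0), (3, 2, 1), (4, 3, 2),
                          (5, 4, 3), (6, 4, 4), (7, 5, 5)]:
        U[dest - 1, orig - 1] = lam[k]
    # exit rate of each state = its outgoing lambdas + mu
    out = np.array([lam[0], lam[1], lam[2], lam[3] + lam[4],
                    lam[5], lam[6], lam[7]])
    U[np.diag_indices(7)] = -(out + mu)
    return U

N = np.linalg.inv(-transient_matrix(theta))
S = np.empty((7, 9))                       # sensitivities of N(:,1), one column per parameter
for k in range(9):
    dU = transient_matrix(np.eye(9)[k])    # dU/dtheta_k (exact, since U is linear in theta)
    S[:, k] = (N @ dU @ N)[:, 0]           # first column of dN/dtheta_k

E = S * theta / N[:, [0]]                  # elasticities: (theta_k / N_i1) * dN_i1/dtheta_k
print(np.round(S, 1))
print(np.round(E, 1))
```

A useful check on any such calculation: multiplying all rates by a constant divides every occupancy time by that constant, so each row of the elasticity matrix should sum to −1.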

The elasticity of the variance in the time spent in the transient states by an individual in state 1 is

$$\displaystyle \begin{aligned} {\epsilon V(\nu_{i1}) \over \epsilon \boldsymbol{\theta}^{\mathsf{T}}}= \left(\begin{array}{rrrrrrrrr} -0.8& 0& 0& 0& 0& 0& 0& 0& -1.2 \\ 0.4& -1.2& 0& 0& 0& 0& 0& 0& -1.2 \\ 0.5& 0.3& -1.0& 0& 0& 0& 0& 0& -1.8 \\ 0.5& 0.4& 0.5& -1.2& -0.8& 0& 0& 0& -1.5 \\ 0.6& 0.4& 0.5& 0.4& -0.4& -1.9& 0& 0& -1.6 \\ 0.6& 0.4& 0.5& -0.6& 0.6& 0& -1.2& 0& -2.3 \\ 0.6& 0.4& 0.5& 0.4& -0.4& 0.0& 0& -1.8& -1.7 \end{array}\right). {} \end{aligned} $$
(12.69)

The sign pattern is the same as that of the elasticities of the mean times in (12.68), so we conclude that any parameter change that increases the mean time spent in a transient state will also increase the variance in that time. The elasticities of the variance are comparable to those of the mean (cf. (12.68) and (12.69)), showing that the mean and the variance respond with proportional changes of similar magnitude.

Longevity is measured by the time to absorption, and is a primary concern in analyses of screening or treatment protocols. The vectors of the mean, standard deviation, and coefficient of variation of longevity are

$$\displaystyle \begin{aligned} \boldsymbol{\eta}_1 = \left(\begin{array}{r} 41.4 \\ 35.5 \\ 29.1 \\ 12.4 \\ 6.1 \\ 19.2 \\ 4.3 \end{array}\right) \quad SD(\boldsymbol{\eta}) = \left(\begin{array}{r} 37.4 \\ 30.3 \\ 25.8 \\ 14.1 \\ 4.7 \\ 19.2 \\ 4.3 \end{array}\right) \quad CV(\boldsymbol{\eta}) = \left(\begin{array}{r} 0.9 \\ 0.9 \\ 0.9 \\ 1.1 \\ 0.8 \\ 1.0 \\ 1.0 \end{array}\right). \end{aligned} $$
(12.70)
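The vectors in (12.70) follow from the fundamental matrix. The sketch below assumes the standard moment formulas for the time to absorption, E(η)ᵀ = 1ᵀN and E(η²)ᵀ = 2 · 1ᵀN², which are not restated in this section; the helper `transient_matrix` and its transition structure are assumptions inferred from the zero pattern of (12.66).

```python
import numpy as np

# Parameter vector (12.65): lambda_1..lambda_8, then mu (all per year).
theta = np.array([1.52e-2, 3.46e-2, 2.15e-2, 3.70e-1, 2.38e-1,
                  4.85e-1, 3.02e-2, 2.10e-1, 2.20e-2])

def transient_matrix(theta):
    """Transient block U (columns = origin states); ASSUMED transition structure."""
    lam, mu = theta[:8], theta[8]
    U = np.zeros((7, 7))
    # (destination, origin, index into lam) for moves among transient states
    for dest, orig, k in [(2, 1, 0), (3, 2, 1), (4, 3, 2),
                          (5, 4, 3), (6, 4, 4), (7, 5, 5)]:
        U[dest - 1, orig - 1] = lam[k]
    # exit rate of each state = its outgoing lambdas + mu
    out = np.array([lam[0], lam[1], lam[2], lam[3] + lam[4],
                    lam[5], lam[6], lam[7]])
    U[np.diag_indices(7)] = -(out + mu)
    return U

N = np.linalg.inv(-transient_matrix(theta))
eta = N.sum(axis=0)              # mean longevity: column sums of N
eta2 = 2.0 * eta @ N             # second moment: 2 * 1'N N
sd = np.sqrt(eta2 - eta**2)      # standard deviation of longevity
cv = sd / eta                    # coefficient of variation
print(np.round(eta, 1), np.round(sd, 1), np.round(cv, 1))
```

For states 6 and 7 the only exits are to the absorbing states, so the time to absorption is exponential and the coefficient of variation is exactly 1, as in the last two entries of (12.70).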

The sensitivity and elasticity of expected longevity (life expectancy) with respect to θ are

$$\displaystyle \begin{aligned} \begin{array}{rcl} {d \boldsymbol{\eta}_1 \over d \boldsymbol{\theta}^{\mathsf{T}}} &=& \left(\begin{array}{rrrrrrrrr} -158.7& -45.8& -96.0& -1.2& 1.3& -0.2& -17.1& -1.3&-1557.2 \\ 0&-112.2&-234.9& -3.0& 3.2& -0.6& -41.9& -3.2&-1089.1 \\ 0& 0&-384.2& -5.0& 5.3& -1.0& -68.6& -5.2&-756.5 \\ 0& 0& 0&-10.0& 10.7& -2.1&-138.8&-10.4&-176.0 \\ 0& 0& 0& 0& 0& -3.5& 0&-17.8& -29.8 \\ 0& 0& 0& 0& 0& 0&-367.0& 0&-367.0 \\ 0& 0& 0& 0& 0& 0& 0&-18.6& -18.6 \end{array}\right) {} \\ {} {\epsilon \boldsymbol{\eta}_1 \over \epsilon \boldsymbol{\theta}^{\mathsf{T}}} &=& \left(\begin{array}{rrrrrrrrr} -0.06& -0.04& -0.05& -0.01 & 0.01& -0.00& -0.01& -0.01& -0.83 \\ 0& -0.11& -0.14& -0.03 & 0.02& -0.01& -0.04& -0.02& -0.68 \\ 0& 0& -0.28& -0.06 & 0.04& -0.02& -0.07& -0.04& -0.57 \\ 0& 0& 0& -0.30 & 0.21& -0.08& -0.34& -0.18& -0.31 \\ 0& 0& 0& 0 & 0& -0.28& 0& -0.61& -0.11 \\ 0& 0& 0& 0 & 0& 0& -0.58& 0& -0.42 \\ 0& 0& 0& 0 & 0& 0& 0& -0.91& -0.09 \end{array}\right). {} \end{array} \end{aligned} $$
(12.71)

Almost all the nonzero elements are negative, because increasing any of the rates leading towards clinical CRC reduces life expectancy, as does increasing the mortality rate due to other causes of death. The exceptions are the sensitivities and elasticities of η 1 to λ 5 (in column 5 of these matrices), which are positive because λ 5 delays the onset of clinical CRC (cf. Fig. 12.1).

The elasticities of E(η 1), the life expectancy of a normal individual, to a change in θ, appear in the first row of (12.71). The largest of these (except for the last column, representing mortality from other causes of death) are to changes in λ 1, λ 2, and λ 3, the rates of transition from normal to small adenoma, small to large adenoma, and large adenoma to preclinical CRC. The rates λ 2 and λ 3 have large effects on E(η 2), and λ 3 has a large effect on E(η 3). These transitions are targets of screening and early treatment; this analysis quantifies the effect that such interventions could have.
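The matrices in (12.71) can be approximated as follows. This sketch assumes the identity dηᵀ/dθ k = ηᵀ (dU/dθ k) N, obtained by differentiating ηᵀ = 1ᵀN with N = (−U)⁻¹, rather than the vec-operator expressions derived earlier in the chapter. The helper `transient_matrix` and its transition structure are assumptions inferred from the zero pattern of (12.66).

```python
import numpy as np

# Parameter vector (12.65): lambda_1..lambda_8, then mu (all per year).
theta = np.array([1.52e-2, 3.46e-2, 2.15e-2, 3.70e-1, 2.38e-1,
                  4.85e-1, 3.02e-2, 2.10e-1, 2.20e-2])

def transient_matrix(theta):
    """Transient block U (columns = origin states); ASSUMED transition structure."""
    lam, mu = theta[:8], theta[8]
    U = np.zeros((7, 7))
    # (destination, origin, index into lam) for moves among transient states
    for dest, orig, k in [(2, 1, 0), (3, 2, 1), (4, 3, 2),
                          (5, 4, 3), (6, 4, 4), (7, 5, 5)]:
        U[dest - 1, orig - 1] = lam[k]
    # exit rate of each state = its outgoing lambdas + mu
    out = np.array([lam[0], lam[1], lam[2], lam[3] + lam[4],
                    lam[5], lam[6], lam[7]])
    U[np.diag_indices(7)] = -(out + mu)
    return U

N = np.linalg.inv(-transient_matrix(theta))
eta = N.sum(axis=0)                        # life expectancy by starting state
S = np.empty((7, 9))                       # rows: starting state; columns: parameter
for k in range(9):
    dU = transient_matrix(np.eye(9)[k])    # dU/dtheta_k (exact, since U is linear in theta)
    S[:, k] = eta @ dU @ N                 # d eta_j / d theta_k for every starting state j

E = S * theta / eta[:, None]               # elasticities of life expectancy
print(np.round(S, 1))
print(np.round(E, 2))
```

The first row of `E` gives the elasticities of the life expectancy of a normal individual. As a check, each row of `E` should sum to −1, because scaling all rates by a constant rescales time by its reciprocal.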

The sensitivity and elasticity of the standard deviation of longevity are

$$\displaystyle \begin{aligned} {d SD \left(\boldsymbol{\eta} \right) \over d \boldsymbol{\theta}^{\mathsf{T}}} = \left(\begin{array}{rrrrrrrrr} -0.27&-0.07&-0.16&-0.00& 0.00&-0.00&-0.03&-0.00&-1.19 \\ 0&-0.13&-0.31&-0.00& 0.00&-0.00&-0.06&-0.00&-0.76 \\ 0& 0&-0.43&-0.00& 0.00&-0.00&-0.09&-0.00&-0.61 \\ 0& 0& 0&-0.01& 0.01& 0.00 &-0.27& 0.00&-0.27 \\ 0& 0& 0& 0& 0&-0& 0.00&-0.02&-0.02 \\ 0& 0& 0& 0& 0& 0&-0.37& 0&-0.37 \\ 0& 0& 0& 0& 0& 0& 0&-0.02&-0.02 \end{array}\right) \times 10^3 \end{aligned} $$
(12.72)

and

$$\displaystyle \begin{aligned} {\epsilon SD \left(\boldsymbol{\eta} \right) \over \epsilon \boldsymbol{\theta}^{\mathsf{T}}} = \left(\begin{array}{rrrrrrrrr} -0.11&-0.06&-0.09&-0.02& 0.01&-0.00&-0.02&-0.01&-0.70 \\ 0&-0.15&-0.22&-0.04& 0.03&-0.00&-0.06&-0.01&-0.55 \\ 0& 0&-0.36&-0.05& 0.05&-0.00&-0.11&-0.01&-0.52 \\ 0& 0& 0&-0.23& 0.23& 0.01&-0.58& 0.00&-0.43 \\ 0& 0& 0& 0& 0&-0.16& 0.00&-0.75&-0.09 \\ 0& 0& 0& 0& 0& 0&-0.58& 0&-0.42 \\ 0& 0& 0& 0& 0& 0& 0&-0.91&-0.09 \end{array}\right). \end{aligned} $$
(12.73)

These have the same sign pattern as the sensitivity of η 1, indicating that any increase in life expectancy will be accompanied by an increase in the variance of longevity. The coefficient of variation takes this joint change into account; from (12.39),

$$\displaystyle \begin{aligned} {\epsilon CV \left( \boldsymbol{\eta} \right) \over \epsilon \boldsymbol{\theta}^{\mathsf{T}}} = \left(\begin{array}{rrrrrrrrr} 0.04& 0.02& 0.03& 0.00&-0.00&-0.00& 0.01&-0.00&-0.31 \\ 0&-0.00& 0.02&-0.01& 0.00&-0.01& 0.01&-0.01&-0.38 \\ 0& 0&-0.01&-0.03& 0.01&-0.02& 0.01&-0.04&-0.21 \\ 0.00& 0.00& 0.00&-0.00&-0.07&-0.08& 0.32&-0.14& 0.19 \\ 0& 0& 0& 0.00& 0.00&-0.30& 0.00&-0.27&-0.09 \\ 0& 0& 0& 0& 0& 0.00& 0.00& 0.00& 0.00 \\ 0& 0& 0& 0& 0& 0& 0& 0.00& 0.00 \end{array}\right). \end{aligned} $$
(12.74)

Most of these elasticities are small, suggesting that the mean and standard deviation respond roughly proportionally, so that the CV does not change much.

The matrix B in (12.48), giving the ultimate probability of death from CRC (row 1) or other causes of death (row 2), is

$$\displaystyle \begin{aligned} {\mathbf{B}} = \left(\begin{array}{rrrrrrr} 0.1 & 0.2 & 0.4 & 0.7 & 0.9 & 0.6 & 0.9 \\ 0.9 & 0.8 & 0.6 & 0.3 & 0.1 & 0.4 & 0.1 \end{array}\right). \end{aligned} $$
(12.75)

Focusing on the probability of death due to CRC, the sensitivity and elasticity, from (12.50), are

$$\displaystyle \begin{aligned} \begin{array}{rcl} {d \mbox{vec} \, {\mathbf{B}}(1,:) \over d \boldsymbol{\theta}^{\mathsf{T}}} &=& \left(\begin{array}{rrrrrrrrr} 3.5& 1.0& 2.1& 0.0& -0.0& 0.0& 0.4& 0.0& -7.1 \\ 0& 2.5& 5.2& 0.1& -0.1& 0.0& 0.9& 0.1& -11.5 \\ 0& 0& 8.4& 0.1& -0.1& 0.0& 1.5& 0.1& -12.5 \\ 0& 0& 0& 0.2& -0.2& 0.1& 3.0& 0.2& -8.5 \\ 0& 0& 0& 0& 0& 0.1& 0.00& 0.4& -5.4 \\ 0& 0& 0& 0& 0& 0& 8.1& 0& -11.1 \\ 0& 0& 0& 0& 0& 0& 0& 0.4& -3.9 \end{array}\right) \\ {} {\epsilon \mbox{vec} \, {\mathbf{B}}(1,:) \over \epsilon \boldsymbol{\theta}^{\mathsf{T}}} &=& \left(\begin{array}{rrrrrrrrr} 0.6& 0.4& 0.5& 0.1& -0.1& 0.0& 0.1& 0.1& -1.7 \\ 0& 0.4& 0.5& 0.1& -0.1& 0.0& 0.1& 0.1& -1.2 \\ 0& 0& 0.5& 0.1& -0.1& 0.0& 0.1& 0.1& -0.8 \\ 0& 0& 0& 0.1& -0.1& 0.0& 0.1& 0.0& -0.3 \\ 0& 0& 0& 0& 0& 0.0& 0& 0.1& -0.1 \\ 0& 0& 0& 0& 0& 0& 0.4& 0& -0.4 \\ 0& 0& 0& 0& 0& 0& 0& 0.1& -0.1 \end{array}\right). {} \end{array} \end{aligned} $$

The probability of death from CRC could be reduced by increasing the mortality rate due to other causes (last column), although this is not an attractive treatment option. A more useful interpretation of the last column is as an indication of the increase in death from CRC that would result from reducing other causes of death.

For normal individuals, the risk of death from CRC is most elastic, among the transition rates, to changes in λ 1, λ 2, and λ 3 (row 1). Each row of the elasticity matrix, corresponding to the effect of a proportional change in all rates, sums to zero because a change of time scale does not affect the probability of absorption.
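The absorption probabilities and their elasticities can be sketched numerically. The code below assumes B = MN, where M holds the rates of transition from transient to absorbing states, and the identity dB/dθ k = (dM/dθ k + B dU/dθ k) N obtained by differentiating that product; it does not use the vec-operator form of (12.50). The helpers `transient_matrix` and `mortality_matrix`, and the transition structure they encode, are assumptions inferred from the zero patterns of (12.66) and the parameter descriptions.

```python
import numpy as np

# Parameter vector (12.65): lambda_1..lambda_8, then mu (all per year).
theta = np.array([1.52e-2, 3.46e-2, 2.15e-2, 3.70e-1, 2.38e-1,
                  4.85e-1, 3.02e-2, 2.10e-1, 2.20e-2])

def transient_matrix(theta):
    """Transient block U (columns = origin states); ASSUMED transition structure."""
    lam, mu = theta[:8], theta[8]
    U = np.zeros((7, 7))
    # (destination, origin, index into lam) for moves among transient states
    for dest, orig, k in [(2, 1, 0), (3, 2, 1), (4, 3, 2),
                          (5, 4, 3), (6, 4, 4), (7, 5, 5)]:
        U[dest - 1, orig - 1] = lam[k]
    # exit rate of each state = its outgoing lambdas + mu
    out = np.array([lam[0], lam[1], lam[2], lam[3] + lam[4],
                    lam[5], lam[6], lam[7]])
    U[np.diag_indices(7)] = -(out + mu)
    return U

def mortality_matrix(theta):
    """Rates into absorbing states (ASSUMED): row 0 = CRC death, row 1 = other causes."""
    lam, mu = theta[:8], theta[8]
    M = np.zeros((2, 7))
    M[0, 5], M[0, 6] = lam[6], lam[7]      # CRC death from clinical stages 6 and 7
    M[1, :] = mu                           # death from other causes, every state
    return M

N = np.linalg.inv(-transient_matrix(theta))
B = mortality_matrix(theta) @ N            # B[a, j]: P(absorbed in a | start in j)

S = np.empty((7, 9))                       # sensitivities of P(death from CRC)
for k in range(9):
    e_k = np.eye(9)[k]
    dU, dM = transient_matrix(e_k), mortality_matrix(e_k)   # exact: both linear in theta
    S[:, k] = ((dM + B @ dU) @ N)[0]       # row 0 of dB/dtheta_k

E = S * theta / B[0][:, None]              # elasticities of B(1,:)
print(np.round(B, 1))
print(np.round(E, 1))
print(np.round(E.sum(axis=1), 12))         # each row should sum to zero
```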

6.2 Sensitivity of the Embedded Chain

The transition matrix \(\widehat {{\mathbf {P}}}\) in (12.76) for the embedded chain is

(12.76)

The fundamental matrix \(\widehat {{\mathbf {N}}}_1\) from (12.57) is

$$\displaystyle \begin{aligned} \widehat{{\mathbf{N}}}_1 = \left(\begin{array}{rrrrrrr} 1.0 & 0 & 0 & 0& 0 & 0& 0 \\ 0.4 & 1.0 & 0 & 0& 0 & 0& 0 \\ 0.2 & 0.6 & 1.0 & 0& 0 & 0& 0 \\ 0.1 & 0.3 & 0.5 & 1.0& 0 & 0& 0 \\ 0.1 & 0.2 & 0.3 & 0.6& 1.0 & 0& 0 \\ 0.1 & 0.1 & 0.2 & 0.4& 0 & 1.0& 0 \\ 0.1 & 0.2 & 0.3 & 0.6& 1.0 & 0& 1.0 \end{array}\right). \end{aligned} $$
(12.77)

In this continuous-time chain, states cannot be re-entered (cf. Fig. 12.1). Because a state can be visited at most once, the mean number of visits is also the probability of ever entering the state. Thus the probabilities that a normal individual will ever suffer early or late clinical CRC are \(\widehat {{\mathbf {N}}}_1 (6,1)=0.1\), and \(\widehat {{\mathbf {N}}}_1(7,1) = 0.07\), respectively. These probabilities increase for individuals in successively later stages; for an individual with large adenoma the probabilities are \(\widehat {{\mathbf {N}}}_1(6,3)=0.2\) and \(\widehat {{\mathbf {N}}}_1(7,3)=0.3\), respectively.

Focusing sensitivity analysis on individuals in the normal state (state 1), the sensitivities and elasticities of the number of visits are

$$\displaystyle \begin{aligned} {d \widehat{{\mathbf{N}}}_1(:,1) \over d \boldsymbol{\theta}^{\mathsf{T}}} = \left(\begin{array}{rrrrrrrrr} 0 & 0 & 0& 0& 0 & 0 & 0 & 0& 0 \\ 15.9 & 0 & 0& 0& 0 & 0 & 0 & 0&-11.0 \\ 9.7 & 2.8 & 0& 0& 0 & 0 & 0 & 0&-11.1 \\ 4.8 & 1.4 & 2.9& 0& 0 & 0 & 0 & 0& -8.3 \\ 2.8 & 0.8 & 1.7& 0.1& -0.1 & 0 & 0 & 0& -5.0 \\ 1.8 & 0.5 & 1.1& -0.1& 0.1 & 0 & 0 & 0& -3.2 \\ 2.7 & 0.8 & 1.6& 0.1& -0.1 & 0.0 & 0 & 0& -4.9 \end{array}\right) \end{aligned} $$
(12.78)

and

$$\displaystyle \begin{aligned} {\epsilon \widehat{{\mathbf{N}}}_1(:,1) \over \epsilon \boldsymbol{\theta}^{\mathsf{T}}} = \left(\begin{array}{rrrrrrrrr} 0 & 0 & 0& 0& 0 & 0 & 0 & 0& 0 \\ 0.6 & 0 & 0& 0& 0 & 0 & 0 & 0& -0.6 \\ 0.6 & 0.4 & 0& 0& 0 & 0 & 0 & 0& -1.0 \\ 0.6 & 0.4 & 0.5& 0& 0 & 0 & 0 & 0& -1.5 \\ 0.6 & 0.4 & 0.5& 0.4& -0.4 & 0 & 0 & 0& -1.5 \\ 0.6 & 0.4 & 0.5& -0.6& 0.6 & 0 & 0 & 0& -1.5 \\ 0.6 & 0.4 & 0.5& 0.41& -0.4 & 0.04 & 0 & 0& -1.5 \end{array}\right). \end{aligned} $$
(12.79)

The sensitivities and elasticities of the probability of contracting clinical CRC are given by the last two rows. These probabilities are highly elastic to λ 1, λ 2, and λ 3. The elasticities to μ indicate that every 1% reduction in mortality due to other causes will cause about a 1.5% increase in the probability of experiencing clinical CRC.
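The embedded-chain quantities can be sketched in the same way. The code assumes that the transient block of the embedded chain is Û = I + U R⁻¹, where R is the diagonal matrix of total exit rates (the negated diagonal of U). Then I − Û = (−U) R⁻¹, so the fundamental matrix of the embedded chain is R N, and the product rule gives its derivative as (dR/dθ k) N + R N (dU/dθ k) N. The helper `transient_matrix` and its transition structure are assumptions inferred from the zero pattern of (12.66).

```python
import numpy as np

# Parameter vector (12.65): lambda_1..lambda_8, then mu (all per year).
theta = np.array([1.52e-2, 3.46e-2, 2.15e-2, 3.70e-1, 2.38e-1,
                  4.85e-1, 3.02e-2, 2.10e-1, 2.20e-2])

def transient_matrix(theta):
    """Transient block U (columns = origin states); ASSUMED transition structure."""
    lam, mu = theta[:8], theta[8]
    U = np.zeros((7, 7))
    # (destination, origin, index into lam) for moves among transient states
    for dest, orig, k in [(2, 1, 0), (3, 2, 1), (4, 3, 2),
                          (5, 4, 3), (6, 4, 4), (7, 5, 5)]:
        U[dest - 1, orig - 1] = lam[k]
    # exit rate of each state = its outgoing lambdas + mu
    out = np.array([lam[0], lam[1], lam[2], lam[3] + lam[4],
                    lam[5], lam[6], lam[7]])
    U[np.diag_indices(7)] = -(out + mu)
    return U

U = transient_matrix(theta)
N = np.linalg.inv(-U)
R = np.diag(-np.diag(U))                        # total exit rate of each transient state
U_hat = np.eye(7) + U @ np.linalg.inv(R)        # jump probabilities among transient states
N_hat = np.linalg.inv(np.eye(7) - U_hat)        # expected number of visits; equals R @ N

S = np.empty((7, 9))                            # sensitivities of N_hat(:,1)
for k in range(9):
    dU = transient_matrix(np.eye(9)[k])         # dU/dtheta_k (exact, U is linear in theta)
    dR = np.diag(-np.diag(dU))                  # dR/dtheta_k
    S[:, k] = (dR @ N + R @ N @ dU @ N)[:, 0]   # product rule applied to R N

E = S * theta / N_hat[:, [0]]                   # elasticities of N_hat(:,1)
print(np.round(N_hat, 2))
print(np.round(S, 1))
print(np.round(E, 1))
```

The first row of `S` is zero up to round-off, since an individual starting in state 1 visits it exactly once whatever the rates.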

7 Discussion

The results of this chapter have been presented in terms of differentials of, or derivatives with respect to, a general vector θ of parameters. The nature of these parameters and their relation to Q, U, or M can be very general. At its simplest, θ could consist of some subset of the elements of Q. This is the case in the CRC example (Sect. 12.6), in which the parameters are the transition rates λ i and the mortality rate μ. More generally, the transition rates might themselves be written as functions of other variables. For example, in Van Den Hout and Matthews (2009a,b) the rates are written as \(q_{ij}=\exp \left ( \boldsymbol {\beta }_{ij}^{\mathsf {T}} {\mathbf {z}} \right )\), i ≠ j, where z is a vector of covariates (e.g., age, medical care) and β ij is a vector of coefficients to be estimated. The results presented here can be applied directly to such cases, and indeed to even more complicated functional dependencies, using the chain rule. Thus, focusing on parametric dependence is not only scientifically valuable (these are, after all, the relationships of interest in applications of Markov chains) but also extremely general.
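As an illustration of the chain rule, suppose (hypothetically) that only λ 1 in the CRC example depends on covariates, with λ 1 = exp(βᵀz). Then dη/dβᵀ = (dη/dλ 1)(dλ 1/dβᵀ), and dλ 1/dβᵀ = λ 1 zᵀ. In the sketch below the covariate vector `z` and the coefficients `beta` are invented for illustration (chosen so that exp(βᵀz) equals the value of λ 1 in (12.65)); they are not taken from Van Den Hout and Matthews or from the CRC model. The helper `transient_matrix` and its transition structure are assumptions inferred from the zero pattern of (12.66).

```python
import numpy as np

# Parameter vector (12.65): lambda_1..lambda_8, then mu (all per year).
theta = np.array([1.52e-2, 3.46e-2, 2.15e-2, 3.70e-1, 2.38e-1,
                  4.85e-1, 3.02e-2, 2.10e-1, 2.20e-2])

def transient_matrix(theta):
    """Transient block U (columns = origin states); ASSUMED transition structure."""
    lam, mu = theta[:8], theta[8]
    U = np.zeros((7, 7))
    # (destination, origin, index into lam) for moves among transient states
    for dest, orig, k in [(2, 1, 0), (3, 2, 1), (4, 3, 2),
                          (5, 4, 3), (6, 4, 4), (7, 5, 5)]:
        U[dest - 1, orig - 1] = lam[k]
    # exit rate of each state = its outgoing lambdas + mu
    out = np.array([lam[0], lam[1], lam[2], lam[3] + lam[4],
                    lam[5], lam[6], lam[7]])
    U[np.diag_indices(7)] = -(out + mu)
    return U

def life_expectancy(theta):
    """Mean time to absorption from each transient state (column sums of N)."""
    return np.linalg.inv(-transient_matrix(theta)).sum(axis=0)

# HYPOTHETICAL covariates and coefficients: lambda_1 = exp(beta . z)
z = np.array([1.0, 0.7])
beta = np.array([np.log(theta[0]) - 0.35, 0.5])   # beta . z = log(lambda_1)

# Step 1: sensitivity of life expectancy to lambda_1
N = np.linalg.inv(-transient_matrix(theta))
eta = N.sum(axis=0)
deta_dlam1 = eta @ transient_matrix(np.eye(9)[0]) @ N

# Step 2: sensitivity of lambda_1 to beta
dlam1_dbeta = np.exp(beta @ z) * z

# Chain rule: a 7 x 2 matrix, rows = starting state, columns = coefficients
deta_dbeta = np.outer(deta_dlam1, dlam1_dbeta)
print(np.round(deta_dbeta, 3))
```

The same two-step pattern applies to any differentiable mapping from underlying variables to rates: compute the sensitivity of the outcome to θ once, then multiply by the derivative of θ with respect to the underlying variables.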

Epidemic models are often written as continuous-time Markov chains, specified in terms of rates of movement among infection states. Gómez-Corral and López-García (2018) extended the methods of this chapter to a model in which individuals are classified by two state variables (a level-dependent quasi-birth-death process). The model may be considered a continuous-time analog of the age×stage models of Chap. 6 (Caswell 2012; Caswell and Salguero-Gómez 2013; Caswell et al. 2018). Their approach takes advantage of the block structure of the intensity matrix for such processes. They have also applied the approach to receptor-ligand complexes within cells (López-García et al. 2018). As far removed from demography as molecules may seem, the concepts of i-state transitions, of inferring population behavior from individual trajectories, and of sensitivity analysis still apply. That’s a good thing.