# Modeling of 2D diffusion processes based on microscopy data: parameter estimation and practical identifiability analysis

## Abstract

### Background

Diffusion is a key component of many biological processes such as chemotaxis, developmental differentiation and tissue morphogenesis. Since recently, the spatial gradients caused by diffusion can be assessed in-vitro and in-vivo using microscopy based imaging techniques. The resulting time-series of two dimensional, high-resolutions images in combination with mechanistic models enable the quantitative analysis of the underlying mechanisms. However, such a model-based analysis is still challenging due to measurement noise and sparse observations, which result in uncertainties of the model parameters.

### Methods

We introduce a likelihood function for image-based measurements with log-normal distributed noise. Based upon this likelihood function we formulate the maximum likelihood estimation problem, which is solved using PDE-constrained optimization methods. To assess the uncertainty and practical identifiability of the parameters we introduce profile likelihoods for diffusion processes.

### Results and conclusion

As proof of concept, we model certain aspects of the guidance of dendritic cells towards lymphatic vessels, an example for haptotaxis. Using a realistic set of artificial measurement data, we estimate the five kinetic parameters of this model and compute profile likelihoods. Our novel approach for the estimation of model parameters from image data as well as the proposed identifiability analysis approach is widely applicable to diffusion processes. The profile likelihood based method provides more rigorous uncertainty bounds in contrast to local approximation methods.

## Keywords

Likelihood Function Maximum Likelihood Estimator Profile Likelihood Identifiability Analysis Practical Identifiability## Introduction

Diffusion is assumed to be the basis of many spatial organization processes for multi-cellular organisms. Crucial processes such as developmental pattern formation or chemotaxis rely on gradient information arising from diffusion and transport processes [1, 2]. In the last decades, diffusion processes have been of great interest not only for experimentalist but also for theoreticians. Turing [3] was the first to break ground, followed by Gierer and Meinhardt [4], who introduced models for such processes based on partial differential equations (PDEs). A prominent aspect is the diffusion of extracellular signaling molecules. Such molecules are synthesized and secreted by cells and spread through the surrounding tissue, forming a gradient. A biological prominent example is guided cell movement along such gradients. In this case, the cell senses the concentration difference between front and back, and moves along the gradient.

Despite these high-resolution imaging data, the number of quantitative models of biological diffusion processes is limited. While quantitative modeling with ordinary differential equations (ODEs) is a common method and the theory of parameter estimation and identifiability is sound, those results have yet to be transferred to the quantitative modeling with PDEs [9, 10]. In recent years the field of PDE-constrained optimization emerged, providing the theory and methods to estimate parameters of PDEs [11]. Nevertheless, specific problems occurring in biological problems, like partial observations, sparse measurements and high noise levels, have yet to be addressed. This has already been done for ODE parameter estimation techniques [12] but is an open problem in the PDE context. In particular, appropriate likelihood functions and methods for the efficient and reliable analysis of practical identifiability [9] are not available.

In this paper, we propose a likelihood function for the estimation of parameters of 2D diffusion process from image data. Furthermore, we transfer the concept of profile likelihood based identifiability analysis introduced by Raue et al. [9] from ODEs to PDEs. This allows us to go beyond the classical uncertainty analysis methods based on local approximation towards global uncertainty bounds. Finally, we evaluate the methods by studying a model for diffusion processes involved in the migration of dendritic cells towards lymphatic vessels (see schematic picture Figure 1B).

## Methods

In the following section we shortly introduce the considered class of PDEs and the available types of measurement data. Afterwards, the parameter estimation and identifiability analysis methods are presented.

### Problem description

*t*∈ (0,

*T*], $x\in \text{\Omega}\subset {\mathbb{R}}^{2}$ and $\phi \in {\mathbb{R}}_{+}^{{n}_{\phi}}$ we consider reaction-diffusion models of the form

where *u*(*t*, *x*) is a vector-valued function describing concentrations, molecule numbers or similar entities for a set of interacting substances. For non-diffusive components the corresponding entries of the diagonal diffusion matrix *D* are zero. The model is complemented with boundary conditions and initial conditions. In the following, we assume that boundary conditions, initial conditions and *f* (*t*, *x*, *u*, *φ*) are chosen such that for all *x*, *t* and *φ* a unique, integrable (with respect to *x*) solution *u*(*t*, *x*) exists in an appropriate function space *U* .

*t*

_{ k }for

*k*= 1, . . . ,

*N*, which are not necessarily equally spaced. For each image the number of pixels

*M*and their pixel area ${A}_{{p}_{t}}$ is known. A suitable function to map the state

*u*(

*t*,

*x*;

*φ*) to the observables is

for *i* = 1, . . . , *M* and *k* = 1, . . . , *N* . Here $b\in {\mathbb{R}}_{+}$ denotes a constant off-set due to background luminescence and *h* defines the observables and could for instance be a mapping onto the first component of *u*. With our assumptions made about existence, uniqueness and integrability this is a well-defined function.

With ${\u0233}_{k,i}$ we denote the intensity of pixel *i* at time point *t*_{ k }. We assume that $\epsilon ~LN\left(0,{\sigma}^{2}\right)$, hence ${\u0233}_{k,i}~LN\left({y}_{k,i},{\sigma}^{2}\right)$. In the following, we introduce the corresponding likelihood function which is used to estimate the unknown parameters *θ* = (*φ*, *b*, *σ*^{2}).

### Maximum likelihood estimation

*J*(

*θ*) =

*−*log

*L*(

*θ*), is minimized to improve the numerics. The minimization problem for parameter estimation for models of the type (1) is then given as

Optimization problems of this type belong to the class of PDE-constrained optimization problems, for which different numerical methods have been established (see [11] and references therein). Depending on the problem structure the PDE is either first optimized and then discretized or discretized and the optimized, which is often necessary and can be justified mathematically [11]. For the example considered, we used the second approach. Furthermore, we optimize the logarithm of the parameters instead of the parameters themselves. This take care of the natural positivity constraints and has for ODE models been shown to be more reliable.

While the optimization problem (5) can be solved numerically, the main problem for parameter estimation is the shape of the likelihood function. Non-identifiabilities and non-linear correlated parameters, leading to 'banana-shaped' likelihoods, render local approximation methods for the evaluation of confidence intervals often inaccurate.

### Profile likelihood based identifiability analysis

The uncertainty of the MLE is commonly analyzed by a local approximation of the objective function and the resulting asymptotic confidence intervals. This local approximation, however, is not reliable for nonlinear problems when we are interested global uncertainty bounds.

*θ*

_{ i }. For parameter

*θ*

_{ i }, PL(

*θ*

_{ i }) is computed by the re-optimization of all parameters

*θ*

_{ j }

*≠*

*θ*

_{ i }along the profile of parameter

*θ*

_{ i }[13]:

*θ*

_{ i },

*i*= 1, . . . ,

*n*

_{ θ }, and allows the evaluation of the likelihood ratio

*R*(

*θ*

_{ i }) = PL(

*θ*

_{ i })

*/L*(

*θ**) for the individual parameters. Based on the likelihood ratio

*R*(

*θ*

_{ i }) we can determine globally valid confidence intervals for the parameter

*θ*

_{ i },

with confidence level *α* and the corresponding likelihood ratio threshold *δ*_{ α } = *χ*^{2}(*α*, 1) [13]. And according to [9] a parameter is called practical non-identifiable if the likelihood ration does not fall below the threshold *δ*_{ α } for increasing and decreasing values of *θ*_{ i }. Hence a profile likelihood which is flat, i.e. remains above the threshold *δ*_{ α }, indicates a practically unidentifiable parameter. For systems of ordinary differential equations the profile likelihood calculation has been shown to be a suitable method to quantify the practical identifiability and the uncertainty of parameters.

## Results

To illustrate the proposed parameter estimation and uncertainty analysis framework, we consider the formation of gradients of signaling protein which are immobilized by the extracellular matrix. Such gradients are, for instance, the basis of the haptotaxis of dendritic cells towards the lymphatic vessels upon the detection of unknown antigenes [6, 7] (see Figure 1B). In this process, dendritic cells move towards the closest lymphatic vessel in the tissue and are subsequently transported through the lymphoid system towards the lymph nodes. The movement of the dendritic cells is guided by an immobilized gradient of the cytokine CCL21, which is released from the lymphoid vessels [6].

In the following we formulate a model for the gradient formation process. In our model, the signaling protein CCL21, denoted by *P*, is produced constantly at a spatially distributed source *Q*, the lymphoid system. The signaling protein gradient is immobilized through complex formation with a tissue bound sugar, denoted by *S*. The immobilized CCL21 protein is denoted by *C*.

*p*,

*s*and

*c*. Each variable is a function of the spatial location

*x*∈ Ω = [0,1]

^{2}, time

*t*∈ [0,

*T*] and a set of unknown parameters

*θ*. The model considered is:

*t*∈ (0,

*T*] and

*x*∈ Ω, with initial conditions

where *ν* denotes the outer normal of Ω. The binding and unbinding rates of CCL21 and tissue-bound sugar are denoted by *k*_{1} and *k*_{− 1}. The diffusion constant of CCL21, the rate of CCL21 degradation and the rate of CCL21 release from the lymph system are *D*, *γ* and *α*, respectively. In the following, we assume that *Q* : Ω *→* {0, 1} and *s*_{0} *≡* 1, due to scaling. We consider the kinetic parameters *θ* = (*D*, *α*, *k*_{1}, *k*_{− 1}, *γ*) of this model as unknowns with *θ* ∈ [10^{− 2}, 10^{1}]^{5}.

For this process image-resolved measurements of the immobilized CCL21 have been taken at one time point (see Figure 1A and [7]). And it might be assumed that the process has reached a steady state (personal communication with authors of [7]). Analytical analysis of the model (1) showed that not all parameters are identifiable from steady state data. In particular, it can be shown that the reaction rates *k*_{1} and *k*_{− 1}are structurally not identifiable. In the following, we want to analyze whether time-series data are sufficient to estimate all kinetic parameters of the model. We want to address this question with the image-based profile likelihood method introduced above and the model considering signaling protein, substrate and substrate-bound protein.

*C*. According to (2) we have

*c*. For the estimation process, data are generated via model simulation (for parameters see Table 1). Forthese simulations we chose a y-shape source term imitating a lymphoid vessel branch (see Figure 2A). We consider images taken at five time points

*t*

_{ k }∈ (0, 1],

*k*= 1, . . . , 5 with 50 pixels each ${A}_{{p}_{t}}$,

*j*= 1, . . . , 50. To account for measurement noise, log-normal noise was added (according to (3)) with σ = 10

^{−2}and

*b*= 10

^{−4}(see Figure 2B for one representative image).

Estimation results

Name | value | MLE | parameter bounds | confidence intervals | identifiability | ||
---|---|---|---|---|---|---|---|

| $\widehat{\theta}$ | | | | | ||

| 0.5 | 0.4985 | 0.01 | 10 | 0.4939 | 0.5044 | practical identifiable |

| 0.1 | 0.0995 | 0.01 | 10 | 0.0969 | 0.1059 | practical identifiable |

| 5 | 5.0375 | 0.01 | 10 | 4.6841 | 5.1761 | practical identifiable |

| 1 | 1.0281 | 0.01 | 10 | 0.9704 | 1.0687 | practical identifiable |

| 0.1 | 0.0669 | 0.01 | 10 |
| 0.3272 | |

For this artificial data set the maximum likelihood estimate *θ** is shown in Table 1. The chosen data points where sufficient to identify parameters *D*, *α*, *k*_{1} and *k*_{−1} well. They are practical identifiable on a confidence level *α* = 98% as the likelihood ration *R*(*θ*) falls below the given threshold for increasing and decreasing values of the parameters (Figure 2). For these parameters a Hessian based approximation of the likelihood at the ML estimate yields a good approximation of the profile likelihood (Figure 2). This is not the case for *γ*. For *γ*, the Hessian based approximation of the likelihood function underestimates the true uncertainty. Indeed, the profile likelihood for *γ* reveals that the parameter is practical non-identifiable as no lower bound exists in the considered regime. Thus, for this parameter, the analysis of the profile likelihood is required to assess the uncertainty of the parameter estimation.

The identifiability properties as well as the parameter confidence intervals change depending on the noise levels and the number of time points M. Simulation results show that, as expected, the confidence interval width increases then the noise levels increase. Additionally the practical non-identifiability of parameter *γ* increases drastically with the noise level. An increased number of time points results in tighter confidence intervals and improved identifiability properties. If the number of time points is large enough the degradation *γ* even becomes identifiable (results not shown). This shows that with time-resolved data all parameters can be identified.

## Discussion and conclusion

In this paper we introduced profile likelihood-based identifiability analysis for diffusion processes based on 2D image data. As proof of concept, we applied our method to a reaction diffusion system involved in the guidance of dendritic cells to the lymphatic vessels [6, 7]. Based on current knowledge this is the first paper using profile likelihood methods in this context.

Our approach facilitates the rigorous definition of uncertainty bounds compared to local approximative methods like the approximation of the Hessian matrix. This allows us to determine precisely which parameters can be identified, which we illustrate for a model describing the formation of CCL21 gradients, involved in the guidance of dendritic cells. Furthermore, profile likelihood-based uncertainty analysis also facilitates the planning of experiments [14]. If a specific parameter of the model is of particular biological interest its expected identifiability properties after performing a proposed experiment can be analyzed. Another strength of the likelihood function introduced is the straight forward extension to voxel based data, i.e. 3D image stacks. In the current setup the 2D area integral in (2) becomes a 3D volume integral. We will address this extension and it's application in future work.

In the illustrative example we used a simple finite difference method to discretize the Laplace operator. This approximation scheme sufficed as we knew the exact parameter values and could set parameter bounds for the optimization such that the discretization errors did not influence the optimization. In real applications this is impossible and an adaptive scheme has to be applied to ensure convergence of the PDE solver [11]. Otherwise the profile likelihood calculation is affected by discretization errors and no longer reliable.

Our analysis of the illustrative example showed that time-series image-data are particularly suitable to estimate the kinetic parameters of a reaction-diffusion processes. An interesting question for future work is whether dose response experiments yield similar results. Finally, the profile likelihood analysis yields a more reliable estimate of the uncertainty in the parameter estimation for such processes and is required to give rigorous global uncertainty bounds. Given the introduced likelihood function we could now approach the model selection in a statistical reasonable way.

## Notes

### Acknowledgements

We thank Michael Sixt for getting us interested in the problem of haptotaxis. Furthermore, we would like to thank the unknown reviewers, who provided excellent comments and held to improve the paper significantly. This work was supported by the Helmholtz Alliance on Systems Biology project 'CoReNe', the European Union within the ERC grant 'LatentCauses', the BMBF grant 'Virtual Liver' (grant-nr. 315752).

**Declarations**

Publication costs were financed by the Helmholtz Zentrum Muenchen GmbH.

This article has been published as part of BMC Bioinformatics Volume 14 Supplement 10, 2013: Selected articles from the 10th International Workshop on Computational Systems Biology (WCSB) 2013: Bioinformatics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/14/S10.

## References

- 1.Lander AD: Morpheus unbound: reimagining the morphogen gradient. Cell. 2007, 128: 245-56. 10.1016/j.cell.2007.01.004.CrossRefPubMedGoogle Scholar
- 2.Müller P, Schier AF: Extracellular movement of signaling molecules. Developmental cell. 2011, 21: 145-58. 10.1016/j.devcel.2011.06.001.PubMedCentralCrossRefPubMedGoogle Scholar
- 3.Turing AM: The chemical basis of morphogenesis. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences. 1952, 237: 37-72. 10.1098/rstb.1952.0012.CrossRefGoogle Scholar
- 4.Meinhardt H: Models of biological pattern formation. 1982, Academic Press London, 6:Google Scholar
- 5.Huang B, Babcock H, Zhuang X: Breaking the diffraction barrier: super-resolution imaging of cells. Cell. 2010, 143: 1047-1058. 10.1016/j.cell.2010.12.002.PubMedCentralCrossRefPubMedGoogle Scholar
- 6.Schumann K, Lämmermann T, Bruckner M, Legler D, Polleux J, Spatz J, Schuler G, Förster R, Lutz M, Sorokin L: Immobilized chemokine fields and soluble chemokine gradients cooperatively shape migration patterns of dendritic cells. Immunity. 2010, 32: 703-713. 10.1016/j.immuni.2010.04.017.CrossRefPubMedGoogle Scholar
- 7.Weber M, Hauschild R, Schwarz J, Moussion C, de Vries I, Legler DF, Luther SA, Bollenbach T, Sixt M: Interstitial dendritic cell guidance by haptotactic chemokine gradients. Science. 2013, 339: 328-332. 10.1126/science.1228456.CrossRefPubMedGoogle Scholar
- 8.Jones SA, Shim SH, He J, Zhuang X: Fast, three-dimensional super-resolution imaging of live cells. Nature Methods. 2011, 8: 499-505. 10.1038/nmeth.1605.PubMedCentralCrossRefPubMedGoogle Scholar
- 9.Raue A, Kreutz C, Maiwald T, Bachmann J, Schilling M, Klingmüller U, Timmer J: Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood. Bioinformatics. 2009, 25: 1923-1929. 10.1093/bioinformatics/btp358.CrossRefPubMedGoogle Scholar
- 10.Hug S, Raue A, Hasenauer J, Bachmann J, Klingmüller U, Timmer J, Theis FJ: High-dimensional bayesian parameter estimation: case study for a model of JAK2/STAT5 signaling. Mathatical Biosciences, accepted.Google Scholar
- 11.Hinze M, Pinnau R, Ulbrich M, Ulbrich S: Optimization with PDE constraints. 2009, New York: SpringerGoogle Scholar
- 12.Stumpf M, Balding D, Girolami M: Handbook of statistical systems biology. 2011, John Wiley and SonsCrossRefGoogle Scholar
- 13.Murphy S, Van der Vaart A: On profile likelihood. Journal of the American Statistical Association. 2000, 95: 449-465. 10.1080/01621459.2000.10474219.CrossRefGoogle Scholar
- 14.Kreutz C, Raue A, Timmer J: Likelihood based observability analysis and confidence intervals for predictions of dynamic models. BMC Systems Biology. 2012, 6: 120-10.1186/1752-0509-6-120.PubMedCentralCrossRefPubMedGoogle Scholar

## Copyright information

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.