On Observability and Reconstruction of Promoter Activity Statistics from Reporter Protein Mean and Variance Profiles

Cinquemani, Eugenio

doi:10.1007/978-3-319-47151-8_10

Conference paper
First Online: 25 September 2016

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9957)

Abstract

Reporter protein systems are widely used in biology for the indirect quantitative monitoring of gene expression activity over time. At the level of population averages, the relationship between the observed reporter concentration profile and gene promoter activity is well established, and effective methods have been introduced to reconstruct this information from the data. At the single-cell level, the relationship between population distribution time profiles and the statistics of promoter activation has not yet been fully investigated, and adequate reconstruction methods are lacking.

This paper develops new results for the reconstruction of promoter activity statistics from mean and variance profiles of a reporter protein. Based on stochastic modelling of gene expression dynamics, it discusses the observability of the mean and autocovariance function of an arbitrary random binary promoter activity process. The mathematical relationships developed are explicit and nonparametric, i.e. free of a priori assumptions on the laws governing the promoter process, thus allowing for the decoupled analysis of the switching dynamics in a subsequent step. The results of this work constitute the essential tools for the development of algorithms for the inference of promoter statistics and regulatory mechanisms.

References

1. Bowsher, C.G., Voliotis, M., Swain, P.S.: The fidelity of dynamic signaling by noisy biomolecular networks. PLoS Comput. Biol. 9(3), e1002965 (2013)
2. Cinquemani, E.: Reconstruction of promoter activity statistics from reporter protein population snapshot data. In: 2015 54th IEEE Conference on Decision and Control (CDC), pp. 1471–1476, December 2015
3. Cinquemani, E.: Reconstructing statistics of promoter switching from reporter protein population snapshot data. In: Abate, A., et al. (eds.) HSB 2015. LNCS, vol. 9271, pp. 3–19. Springer, Heidelberg (2015). doi:10.1007/978-3-319-26916-0_1
4. De Nicolao, G., Sparacino, G., Cobelli, C.: Nonparametric input estimation in physiological systems: problems, methods, and case studies. Automatica 33(5), 851–870 (1997)
5. Friedman, N., Cai, L., Xie, X.S.: Linking stochastic dynamics to population distribution: an analytical framework of gene expression. Phys. Rev. Lett. 97, 168302 (2006)
6. Hasenauer, J., Waldherr, S., Doszczak, M., Radde, N., Scheurich, P., Allgower, F.: Identification of models of heterogeneous cell populations from population snapshot data. BMC Bioinform. 12(1), 125 (2011)
7. Hasenauer, J., Wolf, V., Kazeroonian, A., Theis, F.J.: Method of conditional moments (MCM) for the chemical master equation. J. Math. Biol. 69(3), 687–735 (2014)
8. Hespanha, J.: Modelling and analysis of stochastic hybrid systems. IEE Proc. Control Theor. Appl. 153(5), 520–535 (2006)
9. de Jong, H., Ranquet, C., Ropers, D., Pinel, C., Geiselmann, J.: Experimental and computational validation of models of fluorescent and luminescent reporter genes in bacteria. BMC Syst. Biol. 4(1), 55 (2010)
10. Kaern, M., Elston, T.C., Blake, W.J., Collins, J.J.: Stochasticity in gene expression: from theories to phenotypes. Nat. Rev. Gen. 6, 451–464 (2005)
11. Koopmans, L.H.: The Spectral Analysis of Time Series. Probability and Mathematical Statistics. Academic Press, San Diego (1995)
12. Lindquist, A., Picci, G.: Linear Stochastic Systems - A Geometric Approach to Modeling, Estimation and Identification. Springer, Heidelberg (2015)
13. Munsky, B., Trinh, B., Khammash, M.: Listening to the noise: Random fluctuations reveal gene network parameters. Mol. Syst. Biol. 5 (2009). Article ID 318
14. Neuert, G., Munsky, B., Tan, R., Teytelman, L., Khammash, M., van Oudenaarden, A.: Systematic identification of signal-activated stochastic gene regulation. Science 339(6119), 584–587 (2013)
15. Papoulis, A.: Probability, Random Variables, and Stochastic Processes. McGraw-Hill Series in Electrical Engineering. McGraw-Hill, New York (1991)
16. Paulsson, J.: Models of stochastic gene expression. Phys. Life Rev. 2(2), 157–175 (2005)
17. Sanft, K.R., Wu, S., Roh, M., Fu, J., Lim, R.K., Petzold, L.R.: Stochkit2: Software for discrete stochastic simulation of biochemical systems with events. Bioinformatics 27(17), 2457–2458 (2011)
18. Stefan, D., Pinel, C., Pinhal, S., Cinquemani, E., Geiselmann, J., de Jong, H.: Inference of quantitative models of bacterial promoters from time-series reporter gene data. PLoS Comput. Biol. 11(1), e1004028 (2015)
19. Zechner, C., Ruess, J., Krenn, P., Pelet, S., Peter, M., Lygeros, J., Koeppl, H.: Moment-based inference predicts bimodality in transient gene expression. PNAS 109(21), 8340–8345 (2012)
20. Zechner, C., Unger, M., Pelet, S., Peter, M., Koeppl, H.: Scalable inference of heterogeneous reaction kinetics from pooled single-cell recordings. Nat. Methods 11, 197–202 (2014)
21. Zulkower, V., Page, M., Ropers, D., Geiselmann, J., de Jong, H.: Robust reconstruction of gene expression profiles from reporter gene data using linear inversion. Bioinformatics 31(12), i71–i79 (2015)

Author information

Authors and Affiliations

Inria Grenoble – Rhône-Alpes, Montbonnot, 38334, St. Ismier Cedex, France
Eugenio Cinquemani

Corresponding author

Correspondence to Eugenio Cinquemani.

Editor information

Editors and Affiliations

INRIA Grenoble - Rhone-Alpes, Saint-Ismier, France
Eugenio Cinquemani
Electrical Engineering/Computer Science, University of California at Berkeley, Berkeley, California, USA
Alexandre Donzé

Appendices

A Definitions and Proofs

Matrix definitions. $A_{MP}$, $A_{MP,\times }$ and $A_{MP,F}$ are given by

$$\begin{aligned} \begin{bmatrix} - d_M&0&0&0&0\\ k_P&- d_P&0&0&0\\ d_M&0&- 2\, d_M&0&0\\ k_P&d_P&0&- 2\, d_P&2\, k_P\\ 0&0&k_P&0&- d_M - d_P \end{bmatrix}, \quad \begin{bmatrix} 0&0\\ 0&0\\ 2\, k_M&0\\ 0&0\\ 0&k_M \end{bmatrix},\quad \begin{bmatrix} k_M&0\\ 0&0\\ k_M&0\\ 0&0\\ 0&0 \end{bmatrix}, \end{aligned}$$

in the same order, while

$$\begin{aligned} A_\otimes&= \begin{bmatrix} - d_M&0\\ k_P&- d_P \end{bmatrix},&A_{\times ,F}&= \begin{bmatrix} 0&k_M\\ 0&0 \end{bmatrix},&A_F&=\begin{bmatrix} - \alpha&0\\ \alpha -2\lambda _+&- 2\alpha \end{bmatrix}. \end{aligned}$$

Proof of Proposition 1. Process F is a homogeneous continuous-time binary Markov chain. Letting $p(t)=\begin{bmatrix}\text {Prob}\{F(t)=0\}&\text {Prob}\{F(t)=1\} \end{bmatrix}^T$, for any t and $\tau $ it holds that

$$\begin{aligned} p(t)&=e^{Q(t-\tau )}p(\tau ),&Q&=\begin{bmatrix} -\lambda _+&\lambda _- \\ \lambda _+&-\lambda _- \end{bmatrix}. \end{aligned}$$

Mean $\mu _F(t)=\text {Prob}\{F(t)=1\}$. Using the fact that $\dot{p}=Q p$, the differential equation for $\mu _F$, the second element of p, is $\dot{\mu }_F=\lambda _+(1-\mu _F)-\lambda _-\mu _F=-\alpha \mu _F+\lambda _+$, with $\alpha =\lambda _++\lambda _-$. Solving this equation from the initial condition $\mu _F(0)$ yields the expression in the statement. Covariance $\rho _F(t,\tau )=\text {Prob}\{F(t)=1,F(\tau )=1\}-\mu _F(t)\mu _F(\tau )$. By Bayes' law, $\text {Prob}\{F(t)=1,F(\tau )=1\}=\text {Prob}\{F(t)=1\,|\,F(\tau )=1\}\cdot \text {Prob}\{F(\tau )=1\}$. The second factor is equal to $\mu _F(\tau )$, while the first factor is given, for $t\ge \tau $, by the entry in row 2 and column 2 of $e^{Q(t-\tau )}$, since conditioning on $F(\tau )=1$ amounts to setting $p(\tau )=\begin{bmatrix}0&1\end{bmatrix}^T$. Computing the matrix exponential thus yields the result. Stationary versions of $\mu _F$ and $\rho _F$ are found simply by taking the limit of $\mu _F(t)$ as $t\rightarrow +\infty $ and substituting the result for $\mu _F(\tau )$ and $\mu _F(t)$ in the expression of $\rho _F(t,\tau )$.
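The two steps of this proof can be checked numerically. The following is a minimal sketch, not the author's code: it assumes $\alpha =\lambda _++\lambda _-$ (as implied by the differential equation above), uses the state ordering $(F=0,F=1)$ of $p$, so that column $j$ of $e^{Qs}$ is the distribution at lag $s$ when starting from state $j-1$, and evaluates the matrix exponential by a truncated Taylor series. All function names and the rate values in the usage note are illustrative choices.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm_2x2(q, s, terms=60):
    """exp(q*s) for a 2x2 matrix via a truncated Taylor series.

    Adequate only for moderate |q*s|; this is an illustration, not a
    general-purpose matrix exponential.
    """
    qs = [[q[i][j] * s for j in range(2)] for i in range(2)]
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in mat_mul(term, qs)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def generator(lam_on, lam_off):
    """Matrix Q of the proof; state order (F=0, F=1), columns sum to zero."""
    return [[-lam_on, lam_off], [lam_on, -lam_off]]

def mean_f(t, mu0, lam_on, lam_off):
    """Solution of d(mu)/dt = -alpha*mu + lam_on with mu(0) = mu0."""
    alpha = lam_on + lam_off
    mu_inf = lam_on / alpha
    return mu_inf + (mu0 - mu_inf) * math.exp(-alpha * t)

def cov_f(t, tau, mu0, lam_on, lam_off):
    """rho_F(t, tau) for t >= tau, following the Bayes factorisation."""
    # Prob{F(t)=1 | F(tau)=1}: second row, second column of exp(Q*(t-tau)).
    p_on_given_on = expm_2x2(generator(lam_on, lam_off), t - tau)[1][1]
    mu_t = mean_f(t, mu0, lam_on, lam_off)
    mu_tau = mean_f(tau, mu0, lam_on, lam_off)
    return p_on_given_on * mu_tau - mu_t * mu_tau
```

For instance, with the hypothetical rates `lam_on=0.3` and `lam_off=0.7`, starting from the stationary mean `mu0=0.3`, `cov_f(3.0, 1.0, 0.3, 0.3, 0.7)` agrees with the stationary expression $\lambda _+\lambda _-\alpha ^{-2}e^{-\alpha (t-\tau )}$ up to rounding.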

Proof of Proposition 2. Starting from the second relation in (10),

$$\begin{aligned} \mathscr {M}_P(t)= & {} \mathbb {E}\big [\mathbb {E}[P^2|F]\big ]=\mathbb {E}\Big [\mathbb {E}\big [\big ((P-\mathbb {E}[P|F])+\mathbb {E}[P|F]\big )^2|F\big ]\Big ]\\= & {} \mathbb {E}\Big [\mathbb {E}\big [(P-\mathbb {E}[P|F])^2|F\big ]\Big ]+\mathbb {E}\Big [\mathbb {E}\big [\mathbb {E}[P|F]^2|F\big ]\Big ] \\+ & {} 2\cdot \mathbb {E}\Big [\mathbb {E}\big [(P-\mathbb {E}[P|F])\cdot \mathbb {E}[P|F]|F\big ]\Big ] \\= & {} \mathbb {E}\Big [\mathbb {E}\big [(P-\mathbb {E}[P|F])^2|F\big ]\Big ]+\mathbb {E}\Big [\mathbb {E}[P|F]^2\Big ]\\+ & {} 2\cdot \mathbb {E}\Big [\mathbb {E}\big [P-\mathbb {E}[P|F]|F\big ]\cdot \mathbb {E}[P|F]\Big ], \end{aligned}$$

where the last row vanishes since $\mathbb {E}\big [P-\mathbb {E}[P|F]|F\big ]=0$. Then, using the definitions of $\mu _P^F$ and $\sigma _{PP}^F$, the chain of equalities continues with

$$\begin{aligned} =\mathbb {E}\big [\sigma _{PP}^F(t)\big ]+\mathbb {E}\big [\big (\mu _P^F(t)\big )^2\big ] =\mathbb {E}\big [L_2^tF\big ]+\mathbb {E}\big [(L_1^tF)^2]= L_2^t\mu _F+\mathbb {E}[(L_1^tF)^2]. \end{aligned}$$
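The decomposition $\mathbb {E}[P^2]=\mathbb {E}\big [\text {Var}(P|F)\big ]+\mathbb {E}\big [\mathbb {E}[P|F]^2\big ]$ used in this proof can be verified exactly on a small example. The sketch below makes a strong simplifying assumption: $F$ is reduced to a single binary variable and $P$ to a discrete variable with a made-up joint distribution, whereas in the paper $F$ is a whole promoter activity path. The table `JOINT` and the function names are hypothetical.

```python
from fractions import Fraction

# Hypothetical joint probabilities of (F, P), chosen only for illustration.
JOINT = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 8),
    (1, 3): Fraction(3, 8),
}

def second_moment(joint):
    """E[P^2] computed directly from the joint distribution."""
    return sum(prob * p ** 2 for (_, p), prob in joint.items())

def decomposed_second_moment(joint):
    """E[Var(P|F)] + E[(E[P|F])^2], computed by conditioning on F."""
    total = Fraction(0)
    for f in {f for f, _ in joint}:
        rows = [(p, prob) for (g, p), prob in joint.items() if g == f]
        prob_f = sum(prob for _, prob in rows)
        cond_mean = sum(prob * p for p, prob in rows) / prob_f
        cond_var = sum(prob * (p - cond_mean) ** 2 for p, prob in rows) / prob_f
        total += prob_f * (cond_var + cond_mean ** 2)
    return total
```

Both functions return the same exact rational number for any valid joint table, because the cross term vanishes just as in the last row of the derivation above.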

Proof of Proposition 3. The following chain of equalities holds:

$$\begin{aligned} \mathbb {E}[(L_1^tF)^2]= & {} \mathbb {E}\bigg [\bigg (L_1^t\Big (\sum _ia_i\phi _i\Big )\bigg )^2\bigg ]=\mathbb {E}\bigg [\sum _{i,j}L_1^t(a_i\phi _i)L_1^t(a_j\phi _j)\bigg ] \\= & {} \sum _{i,j}\mathbb {E}[a_ia_j](L_1^t\phi _i)(L_1^t\phi _j)=\sum _i\sigma _i^2(L_1^t\phi _i)^2, \end{aligned} \tag{21}$$

where the last equality follows from the mutual uncorrelatedness of the $a_i$, i.e. $\mathbb {E}[a_ia_j]=0$ for $i\ne j$ and $\mathbb {E}[a_i^2]=\sigma _i^2$.
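As a sanity check of this identity, the expectation can be computed by brute force for a small set of coefficients. This is only a sketch under assumptions that are not in the paper: the $a_i$ are taken to be independent and equal to $\pm s_i$ with probability 1/2 each (hence zero-mean, uncorrelated, with $\sigma _i^2=s_i^2$), and the numbers `coeffs` stand in for the deterministic quantities $L_1^t\phi _i$.

```python
from fractions import Fraction
from itertools import product

def second_moment_by_enumeration(scales, coeffs):
    """E[(sum_i a_i c_i)^2] with a_i = +/- scales[i], each sign equally likely."""
    total = Fraction(0)
    for signs in product((-1, 1), repeat=len(scales)):
        value = sum(sg * Fraction(s) * Fraction(c)
                    for sg, s, c in zip(signs, scales, coeffs))
        total += value ** 2
    return total / 2 ** len(scales)

def second_moment_by_formula(scales, coeffs):
    """sum_i sigma_i^2 c_i^2, the right-hand side of the identity."""
    return sum(Fraction(s) ** 2 * Fraction(c) ** 2
               for s, c in zip(scales, coeffs))
```

With `scales = [1, 2, 3]` and `coeffs = [2, -1, Fraction(1, 2)]`, both functions give the same exact value, since all cross terms average out over the sign combinations.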

Proof of Proposition 4. Expanding the last term of (17) one gets

$$\begin{aligned} \mathbb {E}[(L_1^tF)^2]&=\int d\mathscr {P}_F(f) (L^t_1 f)^2 \\&=\int d\mathscr {P}_F(f) \left( \int _0^t d\tau \,\ell _1(t,\tau )f(\tau )\right) \left( \int _0^t dv\,\ell _1(t,v)f(v)\right) \\&=\int _0^t d\tau \int _0^t dv\, \ell _1(t,\tau ) \ell _1(t,v) \left( \int d\mathscr {P}_F(f)f(\tau )f(v)\right) \\&= \int _0^t d\tau \int _0^t dv\, \ell _1(t,\tau ) \ell _1(t,v) \big (\rho _F(\tau ,v)+\mu _F(\tau )\mu _F(v)\big ) \end{aligned}$$

where the term in parentheses in the last line is the autocorrelation $\mathbb {E}[F(\tau )F(v)]$ of F at $\tau $ and v. Therefore

$$\begin{aligned} \sigma _{PP}(t)&=\mathscr {M}_P(t)-\mu _P^2(t) \\&=L^t_2\mu _F+\int _0^t d\tau \int _0^t dv\, \ell _1(t,\tau ) \ell _1(t,v) \big (\rho _F(\tau ,v)+\mu _F(\tau )\mu _F(v)\big ) \\&\quad - \left( \int _0^t d\tau \,\ell _1(t,\tau )\mu _F(\tau )\right) \left( \int _0^t dv\,\ell _1(t,v)\mu _F(v)\right) , \end{aligned}$$

and the result follows by collecting integrals and simplifying.
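To spell out the simplification, using only the quantities in the two displays above: the double integral of $\ell _1(t,\tau )\ell _1(t,v)\mu _F(\tau )\mu _F(v)$ factorises into the product of the two single integrals, which is exactly the subtracted term $\mu _P^2(t)$. These contributions cancel, which leaves (presumably the expression in the statement of Proposition 4, not reproduced here)

$$\begin{aligned} \sigma _{PP}(t)=L^t_2\mu _F+\int _0^t d\tau \int _0^t dv\, \ell _1(t,\tau ) \ell _1(t,v)\, \rho _F(\tau ,v). \end{aligned}$$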

B Laplace Sensitivity Method for the Analysis of Parameter Identifiability

This section reports the identifiability analysis method of [2]. Let $\mathscr {Y}_\theta (t)$ be a vector function of $t\in \mathbb {R}$ depending on parameters $\theta $. Typically $\mathscr {Y}_\theta (\cdot )$ is an observed response of a dynamical system defined in terms of $\theta $.

Definition 1

The parametric family (of functions) $\{\mathscr {Y}_\theta :~\theta \in {\varTheta }\}$, with ${\varTheta }\subseteq \mathbb {R}^N$, $N\in \mathbb {N}$, is

(a) locally identifiable at $\theta ^*$ if a neighborhood $B_{\theta ^*}\subseteq {\varTheta }$ of $\theta ^*$ exists such that the implication $\mathscr {Y}_\theta =\mathscr {Y}_{\theta ^*}\Rightarrow \theta =\theta ^*$ holds $\forall \theta \in B_{\theta ^*}$;
(b) locally identifiable if (a) holds for almost every (a.e.) $\theta ^*\in {\varTheta }$.

For any given $\theta $ let $Y(s,\theta )$ be the Laplace transform of $\mathscr {Y}_\theta (\cdot )$. Let $\nabla Y(s,\theta )=\frac{\partial Y}{\partial \theta }(s,\theta )=\left[ \frac{\partial Y}{\partial \theta _1}~\cdots ~\frac{\partial Y}{\partial \theta _N} \right] (s,\theta )$.

Proposition 5

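If, for some $L\in \mathbb {N}$, a set of points $s_1,\ldots ,s_L\in \mathbb {R}$ (or $\mathbb {C}$) exists such that the matrix

$$\begin{aligned} \begin{bmatrix} \nabla Y(s_1,\theta ^*)\\ \vdots \\ \nabla Y(s_L,\theta ^*) \end{bmatrix} \end{aligned}$$

has full column rank, then $\{\mathscr {Y}_\theta :~\theta \in {\varTheta }\}$ is locally identifiable at $\theta ^*$ (in the sense of Definition 1(a)).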

Now assume that the elements of $Y(s,\theta )$ are ratios of polynomials in the entries of $\theta $.

Corollary 1

If, for a given set of points $s_1,\ldots ,s_L$ and a given $\theta ^*$, the matrix of Proposition 5 has full column rank, then $\{\mathscr {Y}_\theta :~\theta \in {\varTheta }\}$ is locally identifiable (a.e., in the sense of Definition 1(b)).

In the present paper, the Laplace transforms that are used to discuss identifiability belong to this last class (see [2]), whence Corollary 1 applies. In practice, these conditions can be checked with the MATLAB Symbolic Math Toolbox, by evaluating the rank condition at a finite set of heuristically chosen points (see again [2]).
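The rank test itself is easy to reproduce on a toy problem. The sketch below is not the procedure of [2]: instead of a symbolic toolbox it uses exact rational arithmetic from the Python standard library, with gradients written out by hand for two hypothetical scalar transforms, $Y=k/(s+d)$ with $\theta =(k,d)$ and $Y=k_1k_2/(s+d)$ with $\theta =(k_1,k_2,d)$. The evaluation points and parameter values are arbitrary choices.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix of Fractions via exact Gaussian elimination."""
    m = [list(r) for r in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def grad_two_params(s, k, d):
    """Gradient of Y = k/(s+d) with respect to (k, d)."""
    return [1 / (s + d), -k / (s + d) ** 2]

def grad_three_params(s, k1, k2, d):
    """Gradient of Y = k1*k2/(s+d) with respect to (k1, k2, d)."""
    return [k2 / (s + d), k1 / (s + d), -k1 * k2 / (s + d) ** 2]

# Heuristically chosen evaluation points s_1, ..., s_L (here L = 3).
POINTS = [Fraction(1), Fraction(2), Fraction(3)]

def sensitivity_rank_two_params(k, d):
    """Rank of the stacked gradients for the two-parameter toy model."""
    return rank([grad_two_params(s, k, d) for s in POINTS])

def sensitivity_rank_three_params(k1, k2, d):
    """Rank of the stacked gradients for the three-parameter toy model."""
    return rank([grad_three_params(s, k1, k2, d) for s in POINTS])
```

In the first toy model the stacked matrix has rank 2, equal to the number of parameters, so the sufficient condition is met. In the second, the columns for $k_1$ and $k_2$ are proportional, the rank stays at 2 with three parameters, and the test is inconclusive, as expected since only the product $k_1k_2$ enters $Y$.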

About this paper

Cite this paper

Cinquemani, E. (2016). On Observability and Reconstruction of Promoter Activity Statistics from Reporter Protein Mean and Variance Profiles. In: Cinquemani, E., Donzé, A. (eds) Hybrid Systems Biology. HSB 2016. Lecture Notes in Computer Science, vol 9957. Springer, Cham. https://doi.org/10.1007/978-3-319-47151-8_10

DOI: https://doi.org/10.1007/978-3-319-47151-8_10
Published: 25 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47150-1
Online ISBN: 978-3-319-47151-8
eBook Packages: Computer Science, Computer Science (R0)
