Abstract
Reporter protein systems are widely used in biology for the indirect quantitative monitoring of gene expression activity over time. At the level of population averages, the relationship between the observed reporter concentration profile and gene promoter activity is well established, and effective methods have been introduced to reconstruct promoter activity from the data. At the single-cell level, the relationship between population distribution time profiles and the statistics of promoter activation has not yet been fully investigated, and adequate reconstruction methods are lacking.
This paper develops new results for the reconstruction of promoter activity statistics from mean and variance profiles of a reporter protein. Based on stochastic modelling of gene expression dynamics, it discusses the observability of the mean and autocovariance function of an arbitrary random binary promoter activity process. The mathematical relationships developed are explicit and nonparametric, i.e. free of a priori assumptions on the laws governing the promoter process, thus allowing for the decoupled analysis of the switching dynamics in a subsequent step. The results of this work constitute the essential tools for the development of promoter statistics and regulatory mechanism inference algorithms.
References
Bowsher, C.G., Voliotis, M., Swain, P.S.: The fidelity of dynamic signaling by noisy biomolecular networks. PLoS Comput. Biol. 9(3), e1002965 (2013)
Cinquemani, E.: Reconstruction of promoter activity statistics from reporter protein population snapshot data. In: 2015 54th IEEE Conference on Decision and Control (CDC), pp. 1471–1476, December 2015
Cinquemani, E.: Reconstructing statistics of promoter switching from reporter protein population snapshot data. In: Abate, A., et al. (eds.) HSB 2015. LNCS, vol. 9271, pp. 3–19. Springer, Heidelberg (2015). doi:10.1007/978-3-319-26916-0_1
De Nicolao, G., Sparacino, G., Cobelli, C.: Nonparametric input estimation in physiological systems: problems, methods, and case studies. Automatica 33(5), 851–870 (1997)
Friedman, N., Cai, L., Xie, X.S.: Linking stochastic dynamics to population distribution: an analytical framework of gene expression. Phys. Rev. Lett. 97, 168302 (2006)
Hasenauer, J., Waldherr, S., Doszczak, M., Radde, N., Scheurich, P., Allgower, F.: Identification of models of heterogeneous cell populations from population snapshot data. BMC Bioinform. 12(1), 125 (2011)
Hasenauer, J., Wolf, V., Kazeroonian, A., Theis, F.J.: Method of conditional moments (MCM) for the chemical master equation. J. Math. Biol. 69(3), 687–735 (2014)
Hespanha, J.: Modelling and analysis of stochastic hybrid systems. IEE Proc. Control Theor. Appl. 153(5), 520–535 (2006)
de Jong, H., Ranquet, C., Ropers, D., Pinel, C., Geiselmann, J.: Experimental and computational validation of models of fluorescent and luminescent reporter genes in bacteria. BMC Syst. Biol. 4(1), 55 (2010)
Kaern, M., Elston, T.C., Blake, W.J., Collins, J.J.: Stochasticity in gene expression: from theories to phenotypes. Nat. Rev. Genet. 6, 451–464 (2005)
Koopmans, L.H.: The Spectral Analysis of Time Series. Probability and Mathematical Statistics. Academic Press, San Diego (1995)
Lindquist, A., Picci, G.: Linear Stochastic Systems - A Geometric Approach to Modeling, Estimation and Identification. Springer, Heidelberg (2015)
Munsky, B., Trinh, B., Khammash, M.: Listening to the noise: Random fluctuations reveal gene network parameters. Mol. Syst. Biol. 5 (2009). Article ID 318
Neuert, G., Munsky, B., Tan, R., Teytelman, L., Khammash, M., van Oudenaarden, A.: Systematic identification of signal-activated stochastic gene regulation. Science 339(6119), 584–587 (2013)
Papoulis, A.: Probability, Random Variables, and Stochastic Processes. McGraw-Hill Series in Electrical Engineering. McGraw-Hill, New York (1991)
Paulsson, J.: Models of stochastic gene expression. Phys. Life Rev. 2(2), 157–175 (2005)
Sanft, K.R., Wu, S., Roh, M., Fu, J., Lim, R.K., Petzold, L.R.: Stochkit2: Software for discrete stochastic simulation of biochemical systems with events. Bioinformatics 27(17), 2457–2458 (2011)
Stefan, D., Pinel, C., Pinhal, S., Cinquemani, E., Geiselmann, J., de Jong, H.: Inference of quantitative models of bacterial promoters from time-series reporter gene data. PLoS Comput. Biol. 11(1), e1004028 (2015)
Zechner, C., Ruess, J., Krenn, P., Pelet, S., Peter, M., Lygeros, J., Koeppl, H.: Moment-based inference predicts bimodality in transient gene expression. PNAS 109(21), 8340–8345 (2012)
Zechner, C., Unger, M., Pelet, S., Peter, M., Koeppl, H.: Scalable inference of heterogeneous reaction kinetics from pooled single-cell recordings. Nat. Methods 11, 197–202 (2014)
Zulkower, V., Page, M., Ropers, D., Geiselmann, J., de Jong, H.: Robust reconstruction of gene expression profiles from reporter gene data using linear inversion. Bioinformatics 31(12), i71–i79 (2015)
Appendices
A Definitions and Proofs
Matrix definitions. \(A_{MP}\), \(A_{MP,\times }\) and \(A_{MP,F}\) are given by
in the same order, while
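Proof of Proposition 1. Process F is a homogeneous continuous-time binary Markov chain. Letting \(p(t)=\begin{bmatrix}\text {Prob}\{F(t)=0\}&\text {Prob}\{F(t)=1\} \end{bmatrix}^T\), for any t and \(\tau \) it holds that
\[p(t)=e^{Q(t-\tau )}p(\tau ),\qquad Q=\begin{bmatrix}-\lambda _+&\lambda _-\\ \lambda _+&-\lambda _-\end{bmatrix}.\]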
Mean \(\mu _F(t)=\text {Prob}\{F(t)=1\}\). Using the fact that \(\dot{p}=Q p\), the differential equation for \(\mu _F\), the second element of p, is \(\dot{\mu }_F=\lambda _+(1-\mu _F)-\lambda _-\mu _F=-\alpha \mu _F+\lambda _+\), with \(\alpha =\lambda _++\lambda _-\). Solving this equation from the initial condition \(\mu _F(0)\) yields the expression in the statement. Covariance \(\rho _F(t,\tau )=\text {Prob}\{F(t)=1,F(\tau )=1\}-\mu _F(t)\mu _F(\tau )\). By Bayes' law, \(\text {Prob}\{F(t)=1,F(\tau )=1\}=\text {Prob}\{F(t)=1|F(\tau )=1\}\cdot \text {Prob}\{F(\tau )=1\}\). The second factor is equal to \(\mu _F(\tau )\), while the first factor is given by the entry in row 2 and column 2 of \(e^{Q(t-\tau )}\), since the second column of the matrix exponential propagates the initial condition \(F(\tau )=1\). Computing the matrix exponential thus yields the result. Stationary versions of \(\mu _F\) and \(\rho _F\) are found by taking the limit of \(\mu _F(t)\) as \(t\rightarrow +\infty \) and substituting the result for \(\mu _F(\tau )\) and \(\mu _F(t)\) in the expression of \(\rho _F(t,\tau )\).
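The steps of this proof can be checked numerically. The following is a minimal sketch, not the paper's code: it integrates \(\dot{p}=Qp\) with a classical Runge-Kutta scheme and compares the result with the closed-form solution of \(\dot{\mu }_F=-\alpha \mu _F+\lambda _+\), where \(\alpha =\lambda _++\lambda _-\) as implied by the differential equation above. The function names, the parameter names `lam_on` and `lam_off` (standing for \(\lambda _+\) and \(\lambda _-\)) and the numerical values are illustrative assumptions.

```python
import math

def mean_closed_form(t, mu0, lam_on, lam_off):
    """Solution of d(mu)/dt = -alpha*mu + lam_on with mu(0) = mu0."""
    alpha = lam_on + lam_off
    decay = math.exp(-alpha * t)
    return decay * mu0 + (lam_on / alpha) * (1.0 - decay)

def propagate(p, duration, lam_on, lam_off, steps=2000):
    """Integrate dp/dt = Q p by RK4, with p = (Prob{F=0}, Prob{F=1})."""
    def rhs(q):
        # Net probability flow from state 0 to state 1.
        flow = lam_on * q[0] - lam_off * q[1]
        return (-flow, flow)

    h = duration / steps
    p = tuple(p)
    for _ in range(steps):
        k1 = rhs(p)
        k2 = rhs((p[0] + 0.5 * h * k1[0], p[1] + 0.5 * h * k1[1]))
        k3 = rhs((p[0] + 0.5 * h * k2[0], p[1] + 0.5 * h * k2[1]))
        k4 = rhs((p[0] + h * k3[0], p[1] + h * k3[1]))
        p = (p[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
             p[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)
    return p

def autocovariance(t, tau, mu0, lam_on, lam_off):
    """rho_F(t, tau) for t >= tau, following the factorization in the proof."""
    mu_tau = mean_closed_form(tau, mu0, lam_on, lam_off)
    mu_t = mean_closed_form(t, mu0, lam_on, lam_off)
    # Prob{F(t)=1 | F(tau)=1}: start from F(tau)=1 with certainty.
    p_cond = propagate((0.0, 1.0), t - tau, lam_on, lam_off)[1]
    return p_cond * mu_tau - mu_t * mu_tau

if __name__ == "__main__":
    lam_on, lam_off = 0.3, 0.7          # illustrative switching rates
    mu_numeric = propagate((0.8, 0.2), 2.0, lam_on, lam_off)[1]
    print(mu_numeric, mean_closed_form(2.0, 0.2, lam_on, lam_off))
    mu_stationary = lam_on / (lam_on + lam_off)
    print(autocovariance(3.0, 1.0, mu_stationary, lam_on, lam_off))
```

Starting from the stationary mean \(\lambda _+/\alpha \), the computed autocovariance should agree with the standard two-state result \((\lambda _+\lambda _-/\alpha ^2)e^{-\alpha (t-\tau )}\), which depends on \(t-\tau \) only.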
Proof of Proposition 2. Starting from the second relation in (10),
where the last row vanishes since \(\mathbb {E}\big [P-\mathbb {E}[P|F]|F\big ]=0\). Then, using the definitions of \(\mu _P^F\) and \(\sigma _{PP}^F\), the chain of equalities continues with
Proof of Proposition 3. The following chain of equalities holds:
where the latter equality follows from the mutual uncorrelation of the \(a_i\).
Proof of Proposition 4. Expanding the last term of (17) one gets
where the last integrand is of course the autocorrelation of F at \(\tau \) and v. Therefore
and the result follows by collecting integrals and simplifying.
B Laplace Sensitivity Method for the Analysis of Parameter Identifiability
This section reports the identifiability analysis method of [2]. Let \(\mathscr {Y}_\theta (t)\) be a vector function of \(t\in \mathbb {R}\) depending on parameters \(\theta \). Typically \(\mathscr {Y}_\theta (\cdot )\) is an observed response of a dynamical system defined in terms of \(\theta \).
Definition 1
The parametric family (of functions) \(\{\mathscr {Y}_\theta :~\theta \in {\varTheta }\}\), with \({\varTheta }\subseteq \mathbb {R}^N\), \(N\in \mathbb {N}\), is
(a) locally identifiable at \(\theta ^*\) if a neighborhood \(B_{\theta ^*}\subseteq {\varTheta }\) of \(\theta ^*\) exists such that the implication \(\mathscr {Y}_\theta (\cdot )=\mathscr {Y}_{\theta ^*}(\cdot )\Rightarrow \theta =\theta ^*\) holds \(\forall \theta \in B_{\theta ^*}\);
(b) locally identifiable if (a) holds for almost every (a.e.) \(\theta ^*\in {\varTheta }\).
For any given \(\theta \) let \(Y(s,\theta )\) be the Laplace transform of \(\mathscr {Y}_\theta (\cdot )\). Let \(\nabla Y(s,\theta )=\frac{\partial Y}{\partial \theta }(s,\theta )=\left[ \frac{\partial Y}{\partial \theta _1}~\cdots ~\frac{\partial Y}{\partial \theta _N} \right] (s,\theta ). \)
Proposition 5
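If, for some \(L\in \mathbb {N}\), a set of points \(s_1,\ldots ,s_L\) in \(\mathbb {R}\) (or \(\mathbb {C}\)) exists such that the matrix
\[\begin{bmatrix}\nabla Y(s_1,\theta ^*)\\ \vdots \\ \nabla Y(s_L,\theta ^*)\end{bmatrix}\]
has full column rank, then \(\{\mathscr {Y}_\theta :~\theta \in {\varTheta }\}\) is locally identifiable at \(\theta ^*\) (in the sense of Definition 1(a)).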
Now assume that the elements of \(Y(s,\theta )\) are ratios of polynomials in the entries of \(\theta \).
Corollary 1
If, for a given set of points \(s_1,\ldots ,s_L\) and a given \(\theta ^*\), the matrix of Proposition 5 has full column rank, then \(\{\mathscr {Y}_\theta :~\theta \in {\varTheta }\}\) is locally identifiable (a.e., in the sense of Definition 1(b)).
In the present paper, the Laplace transforms that are used to discuss identifiability belong to this last class (see [2]), whence Corollary 1 applies. In practice, these conditions can be easily checked by the use of the Matlab Symbolic Math Toolbox and evaluation of the rank conditions based on a finite set of heuristically chosen points (see again [2]).
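The rank condition can be illustrated on a toy problem. The sketch below is an assumption-laden example rather than the procedure of [2]: it uses plain Python with exact rational arithmetic instead of the Matlab Symbolic Math Toolbox, and the two scalar transforms, their hand-derived gradients, the evaluation points and all function names are hypothetical choices made for illustration. The first transform, \(Y(s,\theta )=\theta _1/(s+\theta _2)\), passes the test; the second, \(Y(s,\theta )=\theta _1\theta _2/(s+1)\), depends on the parameters only through their product and yields a rank-deficient matrix.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows of Fractions) by Gaussian elimination."""
    m = [list(r) for r in rows]
    rk = 0
    for col in range(len(m[0])):
        # Find a row at or below position rk with a nonzero entry in this column.
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def grad_gain_pole(s, theta):
    """Gradient of Y = theta1 / (s + theta2) with respect to (theta1, theta2)."""
    t1, t2 = theta
    return [1 / (s + t2), -t1 / (s + t2) ** 2]

def grad_product(s, theta):
    """Gradient of Y = theta1 * theta2 / (s + 1) with respect to (theta1, theta2)."""
    t1, t2 = theta
    return [t2 / (s + 1), t1 / (s + 1)]

def sensitivity_matrix(grad, points, theta):
    """Stack the gradient rows evaluated at the chosen Laplace-domain points."""
    return [grad(s, theta) for s in points]

theta_star = (Fraction(2), Fraction(3))   # nominal parameter value (illustrative)
points = [Fraction(1), Fraction(2)]       # heuristically chosen real points

rank_gain_pole = rank(sensitivity_matrix(grad_gain_pole, points, theta_star))
rank_product = rank(sensitivity_matrix(grad_product, points, theta_star))
print(rank_gain_pole, rank_product)
```

Full column rank (here, rank 2) is a sufficient condition for local identifiability at \(\theta ^*\). A rank-deficient matrix at one set of points does not by itself prove non-identifiability, although in the product example the deficiency persists for every choice of points.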
Copyright information
© 2016 Springer International Publishing AG
Cite this paper
Cinquemani, E. (2016). On Observability and Reconstruction of Promoter Activity Statistics from Reporter Protein Mean and Variance Profiles. In: Cinquemani, E., Donzé, A. (eds) Hybrid Systems Biology. HSB 2016. Lecture Notes in Computer Science, vol 9957. Springer, Cham. https://doi.org/10.1007/978-3-319-47151-8_10
DOI: https://doi.org/10.1007/978-3-319-47151-8_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47150-1
Online ISBN: 978-3-319-47151-8
eBook Packages: Computer Science (R0)