Abstract
Biological systems are often modelled at different levels of abstraction depending on the particular aims/resources of a study. Such different models often provide qualitatively concordant predictions over specific parametrisations, but it is generally unclear whether model predictions are quantitatively in agreement, and whether such agreement holds for different parametrisations. Here we present a generally applicable statistical machine learning methodology to automatically reconcile the predictions of different models across abstraction levels. Our approach is based on defining a correction map, a random function which modifies the output of a model in order to match the statistics of the output of a different model of the same system. We use two biological examples to give a proof-of-principle demonstration of the methodology, and discuss its advantages and potential further applications.
GC and GS gratefully acknowledge support from the European Research Council under grant MLCS306999. LB acknowledges partial support from the EU project QUANTICOL, 600708, and from FRA-UniTS. We thank Dimitris Milios for useful discussions and for providing us with the MATLAB code for heteroscedastic regression.
Notes
- 1.
\(\mathsf{{M}}\) could be complex to analyze either because of its structure, e.g., it might have many variables, or because of numerical hurdles, e.g., a high degree of non-linearity or parameter stiffness. For similar reasons, we do not care whether \(\mathsf{{m}}\) has been derived by means of independent domain knowledge or automatic techniques.
- 2.
In principle, even \(\mathsf{{m}}\) might have a set of free variables with respect to \(\mathsf{{M}}\). However, since we have full control over that model, we can assume a parametrisation of such variables, and all that follows would carry over unchanged.
- 3.
In this work, we use the classic Gaussian kernel fixing hyperparameters by maximising the type-II likelihood; see [12].
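A minimal sketch of this kind of hyperparameter fitting, using scikit-learn rather than the MATLAB tools employed in the paper; the training data below are synthetic placeholders.

```python
# Sketch: type-II (marginal) likelihood maximisation for a GP with a
# Gaussian (RBF) kernel. Uses scikit-learn, not the authors' MATLAB code;
# the data are synthetic placeholders.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)             # placeholder inputs
y = np.sin(6.0 * X).ravel() + 0.1 * np.random.randn(20)  # placeholder outputs

kernel = ConstantKernel(1.0) * RBF(length_scale=0.2)
gp = GaussianProcessRegressor(kernel=kernel, alpha=0.01, n_restarts_optimizer=5)
gp.fit(X, y)   # fit() maximises the log marginal (type-II) likelihood

print(gp.kernel_)                         # optimised hyperparameters
print(gp.log_marginal_likelihood_value_)
```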
References
Aitken, S., Alexander, R.D., Beggs, J.D.: A rule-based kinetic model of RNA polymerase II C-terminal domain phosphorylation. J. Roy. Soc. Interface 10(86), 20130438 (2013)
Alur, R., Feder, T., Henzinger, T.A.: The benefits of relaxing punctuality. J. ACM 43(1), 116–146 (1996)
Barber, D.: Bayesian Reasoning and Machine Learning. Cambridge University Press, Cambridge (2012)
Bortolussi, L., Milios, D., Sanguinetti, G.: Smoothed model checking for uncertain continuous-time Markov chains. Inf. Comput. 247, 235–253 (2016)
Bortolussi, L., Sanguinetti, G.: Learning and designing stochastic processes from logical constraints. In: Joshi, K., Siegle, M., Stoelinga, M., D’Argenio, P.R. (eds.) QEST 2013. LNCS, vol. 8054, pp. 89–105. Springer, Heidelberg (2013)
Caravagna, G.: Formal modeling and simulation of biological systems with delays. Ph.D. thesis, University of Pisa (2011)
Cressie, N., Wikle, C.K.: Statistics for Spatio-Temporal Data. Wiley, New York (2015)
Hoyle, D.C., Rattray, M., Jupp, R., Brass, A.: Making sense of microarray data distributions. Bioinformatics 18(4), 576–584 (2002)
Kennedy, M.C., O’Hagan, A.: Bayesian calibration of computer models. J. Roy. Stat. Soc.: Ser. B (Stat. Methodol.) 63(3), 425–464 (2001)
Lawrence, N.D., Sanguinetti, G., Rattray, M.: Modelling transcriptional regulation using Gaussian processes. In: Advances in Neural Information Processing Systems, pp. 785–792 (2006)
Noble, D.: Modeling the heart-from genes to cells to the whole organ. Science 295(5560), 1678–1682 (2002)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press, Cambridge (2006)
A Appendix
All the code to replicate these analyses is available at the corresponding author’s webpage, and is hosted on GitHub (repository GP-correction-maps).
A.1 Further Details on the Examples
The two models from Sect. 5.1 correspond to two systems of ordinary differential equations, which we solved in MATLAB with the ode45 routine, with all solver parameters (InitialStep, MaxStep, RelTol and AbsTol) set to 0.01.
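For readers working in Python, an illustrative analogue of this solver configuration with SciPy is sketched below (RK45 is the Dormand–Prince pair underlying ode45); the right-hand side f is a placeholder system, not the models of Sect. 5.1.

```python
# Sketch of the ode45-style configuration with SciPy's solve_ivp.
# f is a placeholder two-species production/degradation system, not the
# actual models of Sect. 5.1; only the solver settings mirror the text.
from scipy.integrate import solve_ivp

def f(t, y, k1=1.0, k2=0.5):
    return [k1 - k2 * y[0], k2 * y[0] - k1 * y[1]]

sol = solve_ivp(
    f, (0.0, 10.0), [0.0, 0.0],
    method="RK45",                    # Dormand-Prince pair, as in ode45
    first_step=0.01, max_step=0.01,   # analogues of InitialStep / MaxStep
    rtol=0.01, atol=0.01,             # analogues of RelTol / AbsTol
)
print(sol.t[-1], sol.y[:, -1])
```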
Concerning the Protein Translation Network (PTN) in Sect. 5.2, the set of reactions and propensity functions from which a Continuous-Time Markov Chain model of the network is derived is the following. Here \(\varvec{x}\) denotes a generic state of the system and, for instance, \(\varvec{x}_{\mathsf{mRNA}}\) the number of mRNA copies in \(\varvec{x}\).
The reduced PTN model is a special case of this reaction set in which transcription and mRNA decay are omitted. In this case we used StochPy to simulate the models and generate the input data for regression (see http://stochpy.sourceforge.net/); data sampling exploits Python parallelism to reduce execution times.
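To make the simulation step concrete, below is an independent pure-Python Gillespie sketch (the paper's experiments use StochPy) of a protein translation network with the standard transcription/translation/decay structure; the rate constants and initial state are illustrative assumptions, not the parametrisation used in the paper. The reduced model corresponds to dropping the transcription and mRNA-decay rows.

```python
# Gillespie SSA sketch for a protein translation network: transcription,
# translation, mRNA decay, protein decay. Rates and initial state are
# made-up placeholders; the paper's experiments use StochPy instead.
import numpy as np

rng = np.random.default_rng(0)

# state x = [mRNA, P]; rows of V are the state-change vectors of the reactions
V = np.array([[ 1,  0],   # transcription:  0 -> mRNA
              [ 0,  1],   # translation:    mRNA -> mRNA + P
              [-1,  0],   # mRNA decay:     mRNA -> 0
              [ 0, -1]])  # protein decay:  P -> 0

k_tr, k_tl, d_m, d_p = 1.0, 5.0, 0.2, 0.1   # placeholder rate constants

def propensities(x):
    # mass-action propensities evaluated in state x
    return np.array([k_tr, k_tl * x[0], d_m * x[0], d_p * x[1]])

def ssa(x0, t_end):
    t, x, traj = 0.0, np.array(x0, dtype=int), [(0.0, list(x0))]
    while t < t_end:
        a = propensities(x)
        a0 = a.sum()
        if a0 == 0.0:
            break
        t += rng.exponential(1.0 / a0)       # waiting time to next reaction
        j = rng.choice(len(a), p=a / a0)     # index of the firing reaction
        x = x + V[j]
        traj.append((t, x.tolist()))
    return traj

print(ssa(x0=[0, 0], t_end=100.0)[-1])
```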
For regression, we used the Gaussian Processes for Machine Learning (GPML) toolbox for fixed-variance regression (see http://www.gaussianprocess.org/gpml/code/matlab/doc/), and a custom implementation of the other forms of regression.
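As a sketch of what fixed-variance regression amounts to, using scikit-learn rather than the GPML toolbox: the per-point noise variances estimated from the simulations are supplied as a fixed diagonal term, so only the kernel hyperparameters are learned. The data below are synthetic placeholders.

```python
# Sketch of fixed-variance (known, heteroscedastic noise) GP regression in
# scikit-learn; the paper uses the GPML MATLAB toolbox. Per-point noise
# variances enter through alpha; all data here are placeholders.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

theta = np.linspace(0.0, 2.0, 30).reshape(-1, 1)             # sampled free variables
y_hat = np.exp(-theta).ravel() + 0.05 * np.random.randn(30)  # placeholder mean estimates
v_hat = 0.01 + 0.05 * theta.ravel()                          # placeholder per-point variances

gp = GaussianProcessRegressor(
    kernel=ConstantKernel(1.0) * RBF(length_scale=0.5),
    alpha=v_hat,                 # fixed observation-noise variances
    n_restarts_optimizer=5,
)
gp.fit(theta, y_hat)
mu, sd = gp.predict(np.array([[1.5]]), return_std=True)
print(mu, sd)
```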
A.2 Proofs
Proof of Theorem 1
Proof
Both the empirical and the nested estimators rely on an unbiased estimator of the mean/variance, which means that if \(k\rightarrow \infty \), i.e., we sample all possible values of the free variables, we would obtain a true model of \(\overline{y}\) and \(\sigma \). This means that, for each value sampled from \(\varTheta \), even the simplest \(\overline{\sigma }\)-estimator would be equivalent, in expectation, to the marginalisation of the free variables. Combined with the properties of Gaussian Process regression (i.e., convergence to the true model with infinitely many training points), this is enough to state that the overall approach yields an unbiased estimator of the correction map.
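For completeness, these are the standard facts the argument appeals to, stated in generic notation of ours (i.i.d. samples \(z_1,\dots ,z_n\) with mean \(\mu \) and variance \(\sigma ^2\)):
\[
\mathbb {E}\!\left[ \frac{1}{n}\sum _{i=1}^{n} z_i\right] =\mu ,\qquad \mathbb {E}\!\left[ \frac{1}{n-1}\sum _{i=1}^{n}\Bigl (z_i-\frac{1}{n}\sum _{j=1}^{n} z_j\Bigr )^{2}\right] =\sigma ^2,
\]
so the empirical mean and variance are unbiased and, by the law of large numbers, converge to the true values as the number of samples grows.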
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Caravagna, G., Bortolussi, L., Sanguinetti, G. (2016). Matching Models Across Abstraction Levels with Gaussian Processes. In: Bartocci, E., Lio, P., Paoletti, N. (eds) Computational Methods in Systems Biology. CMSB 2016. Lecture Notes in Computer Science, vol 9859. Springer, Cham. https://doi.org/10.1007/978-3-319-45177-0_4
DOI: https://doi.org/10.1007/978-3-319-45177-0_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45176-3
Online ISBN: 978-3-319-45177-0
eBook Packages: Computer Science (R0)