Abstract
Gaussian process dynamical systems (GPDS) represent Bayesian nonparametric approaches to inference of nonlinear dynamical systems, and provide a principled framework for the learning of biological networks from multiple perturbed time series measurements of gene or protein expression. Such approaches are able to capture the full richness of complex ODE models, and can be scaled for inference in moderately large systems containing hundreds of genes. Related hierarchical approaches allow for inference from multiple datasets in which the underlying generative networks are assumed to have been rewired, either by context-dependent changes in network structure, evolutionary processes, or synthetic manipulation. These approaches can also be used to leverage experimentally determined network structures from one species into another where the network structure is unknown. Collectively, these methods provide a comprehensive and flexible platform for inference from a diverse range of data, with applications in systems and synthetic biology, as well as spatiotemporal modelling of embryo development. In this chapter we provide an overview of GPDS approaches and highlight their applications in the biological sciences, with accompanying tutorials available as a Jupyter notebook from https://github.com/cap76/GPDS.
Acknowledgements
CAP is supported by the Wellcome Trust (grant 083089/Z/07/Z). IG is supported by EPSRC/BBSRC research grant EP/L016494/1. AS is supported by a 4-year Wellcome Trust PhD Scholarship and a Cambridge International Trust Scholarship. DLW acknowledges support from the Engineering and Physical Sciences Research Council (grant EP/R014337/1).
CAP, IG, and AS acknowledge support from the BBSRC-EPSRC funded OpenPlant Synthetic Biology Research Centre (BB/L014130/1) through the OpenPlant Fund scheme. CAP and AS also thank M. Azim Surani for his support.
Copyright information
© 2019 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Penfold, C.A., Gherman, I., Sybirna, A., Wild, D.L. (2019). Inferring Gene Regulatory Networks from Multiple Datasets. In: Sanguinetti, G., Huynh-Thu, V. (eds) Gene Regulatory Networks. Methods in Molecular Biology, vol 1883. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8882-2_11
DOI: https://doi.org/10.1007/978-1-4939-8882-2_11
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-8881-5
Online ISBN: 978-1-4939-8882-2
eBook Packages: Springer Protocols