Bayesian model selection for the Drosophila gap gene network
Abstract
Background
The gap gene system controls the early cascade of the segmentation pathway in Drosophila melanogaster as well as other insects. Owing to its tractability and key role in embryo patterning, this system has been the focus for both computational modelers and experimentalists. The gap gene expression dynamics can be considered strictly as a onedimensional process and modeled as a system of reactiondiffusion equations. While substantial progress has been made in modeling this phenomenon, there still remains a deficit of approaches to evaluate competing hypotheses. Most of the model development has happened in isolation and there has been little attempt to compare candidate models.
Results
The Bayesian framework offers a means of doing formal model evaluation. Here, we demonstrate how this framework can be used to compare different models of gene expression. We focus on the PapatsenkoLevine formalism, which exploits a fractional occupancy based approach to incorporate activation of the gap genes by the maternal genes and crossregulation by the gap genes themselves. The Bayesian approach provides insight about relationship between system parameters. In the regulatory pathway of segmentation, the parameters for number of binding sites and binding affinity have a negative correlation. The model selection analysis supports a stronger binding affinity for Bicoid compared to other regulatory edges, as shown by a larger posterior mean. The procedure doesn’t show support for activation of Kruppel by Bicoid.
Conclusions
We provide an efficient solver for the general representation of the PapatsenkoLevine model. We also demonstrate the utility of Bayes factor for evaluating candidate models for spatial pattering models. In addition, by using the parallel tempering sampler, the convergence of Markov chains can be remarkably improved and robust estimates of Bayes factors obtained.
Keywords
Gap genes Reactiondiffusion equations Bayesian model selection Parallel tempering Bayes factorAbbreviations
 AP
AnteriorPosterior
 Bcd
Bicoid
 Gt
Giant
 Hb
Hunchback
 HMC
Hamiltonian monte carlo
 Kni
Knirps
 Kr
Kruppel
 MCMC
Markov Chain monte carlo
 MH,
Metropolishastings
 MLE
Maximum likelihood estimate
 ODEs
Ordinary differential equations
 PDEs
Partial differential equations
 PTMCMC
Parallel tempering markov chain monte carlo
 Tll
Tailless
Background
In this paper, we explore models for the developmental process of segmentation in Drosophila, providing an efficient model solver. We use the Bayesian framework for inference and model selection. The process by which multicellular organisms develop from a single fertilized cell has been the focus of much attention. It was postulated that organisms are patterned by gradients of certain formproducing substances. Boveri [1] and Horstadius [2] used this idea to explain the patterning of the sea urchin embryo. The idea was given further impetus by the discovery of the Spemann organizer [3] which suggested that morphogenesis is the result of signals released from localized group of cells. In 1952, Turing, working on the problem of spatial patterning, coined the term morphogen to describe ‘formproducers’. He used mathematical models to show that chemical substances could selforganize into patterns starting from homogeneous distributions [4]. However, a definitive example of a morphogen was only provided in 1987 by the discovery of Bicoid function in the Drosophila embryo [5, 6] and subsequent visualization of its gradient [7, 8]. Not surprisingly, patterning in the Drosophila embryo has been the focus of both developmental and systems biologists.
The formation of several broad gap gene [9] expression patterns within the first two hours of development characterizes early Drosophila embryogenesis. Taken together, the gap genes constitute one of the four regulatory layers in the cascade of segmentation pathway in Drosophila embryo. Expression of gap genes is regulated by maternal genes [10] and they also participate in mutual repression [11]. Thus, activation by maternal gradients, combined with spatially specific gapgap cross repression helps to establish, sharpen and maintain the broad overlapping domains of the gap gene expression along the AnteriorPosterior (AP) axis. The gap gene network is one of the few examples of a developmental gene network which has been studied extensively using datadriven mathematical models [12, 13, 14] in order to reconstruct the regulatory structure of the gap gene network. However, there continues to be active discussion [15, 16] on how maternal gradients and mutual gap gene repression contribute to the formation of gap stripes.
Mathematical representation of the gap gene network through quantitative dynamical systems has helped investigate regulatory structure of this network along with specific properties of this representation such as the strength of interaction, cooperativity of regulators, etc. However, there is a deficit of a rigorous framework within which putative representations can be compared and allows one to conduct formal statistics of relative fit. In a seminal paper, Jaeger et al. [12] used a dynamical model where a genetic interconnectivity matrix described the regulatory parameters. Based on measures of model fit, they argued that dual regulatory action of Hunchback on Kruppel is not essential for to explain gap gene domain formation. While this may be valid, they do not provide a relative goodness of fit of the model against a representation that assumes dualregulation. Perkins et al. [17] did an extensive study of gap gene regulatory relationships and compared proposed networks in literature. However, their study does not provide a measure of statistical significance for model comparison. Essentially, the question we want to ask is how to chose between competing hypothesis for the network structure in a statistically rigorous manner ? In addition, real data is often contaminated with measurement noise and we need methods that can help us deal with this uncertainty.
Addressing the latter point, one way to handle error associated with experimental observations is to model it as Gaussian noise. If we know or are willing to assume a model for the error variance, then an estimate of the parameters can be sought by maximizing the likelihood in a least squares sense. This is the maximum likelihood estimate (MLE) [18] of the parameters. However, this point estimate suffers from being unrepresentative and is often intractable, especially if the likelihood is multimodal.
An alternative approach is the Bayesian framework which allows one to not only account for experimental error by propagating it to the model parameters but also a way to integrate our prior beliefs on the distribution of model parameters. In this manner, a posterior distribution of the model parameters is obtained which encapsulates our belief in the parameter values given uncertainty in measurement. Indeterminacy of model parameters and correlations between indeterminate parameters are incorporated into the marginal likelihood (evidence). Direct computations of integrals involved in Bayesian methods are difficult and so researchers tend to use Markov chain Monte Carlo (MCMC) methods like Gibbs sampling or MetropolisHastings algorithm [19]. Bayesian approaches have enjoyed great success in genetics [20] and we and others [21] expect that they will provide more satisfactory solutions to inference problems in computational systems biology.
In addition, the Bayesian approach allows us to assess which of the competing models is better supported by the data by comparing the ratios of marginal likelihood of the models. The process of comparing models is more formally known as model selection and the ratio of marginal likelihoods is also called the Bayes factor [22]. It follows, that in order to use Bayes factors, one needs to estimate the marginal likelihood of a model. However, this task becomes increasingly intractable with growing model dimensionality and a conventional MetropolisHastings sampling approach generally leads to poor mixing properties and unreliable conclusions. To overcome this difficulty, we use the parallel tempering Markov chain Monte Carlo (PTMCMC) sampling technique [23]. Briefly, this method runs parallel chains at different temperatures (or degree of smoothness of likelihood surface) and allows exchanges between the chains based on the Hastings ratio. The end result is a chain that mixes well and also doesn’t get stuck in local optima. Another benefit of this approach is that it allows one to use path integration to compute the thermodynamic estimator [24] of the marginal likelihood. This estimator has been shown to be reliable when working with Bayes factors [25] in the context of differential equations.
We currently focus on the PapatsenkoLevine formalism [26], which exploits a fractional occupancy based approach to incorporate activation of the gap genes by the maternal genes and crossregulation by the gap genes themselves. An advantage of this formalism is that it incorporates nonlinear effects between regulatory interactions and is closer to a mechanistic view of how regulation in this system occurs [27]. While in their paper, Papatsenko & Levine assumed that network structure is known a priori, our approach allows one to choose from competing network topologies reported in the literature and to vary strength of interactions between gap genes. It is worth mentioning here that although we consider models of increasing complexity, Bayes factors allows model comparison without concerns of overfitting, that is, they allow one to implicitly control for model dimensionality [28].
Methods
Expression data
Model solution
Timevarying systems can be modeled with ordinary differential equations (ODEs) which have efficient solvers available (for example, [30]). However, in pattern formation gene expression varies both in time and space and partial differential equations (PDEs) are the suitable method for characterizing this process. Closed form solutions for PDEs exist only in the most simplest of cases and numerical solutions need to be employed. Packaged solvers for PDEs do exist [31] and some like deal.II [32] have been used in systems biology applications [33, 34, 35]. However, due to the overhead of generalizability and computational tractability in structuring models, we wrote our own solver.
Here, u_{i}(x,t) represents the expression of gap gene, i, at time t and position x with Neumann boundary conditions, i.e., we assume that flux at the boundaries is zero. α represents the production rate, β is the linear decay rate and D is the diffusion constant. L denotes the length of the embryo and T corresponds to cleavage cycle 14.4 which marks the start of gastrulation. P^{A} and P^{B} are respectively combined activation and repression effects of regulators for each gap gene. These regulatory effects are a function of the gap gene expression and its binding affinity (K), cooperativity rate (C_{o}) and the number of binding sites (N_{s}). (Details in the Additional file 1 text.)
Parameter estimation

p(θY) is the posterior density of the parameters

L(θ;Y) is the likelihood of the data as elaborated above

π(θ) is the prior belief of the parameter

p(Y) is the marginal likelihood
Instead, we rely on the Markov chain Monte Carlo [39] method used for highdimensional sampling. The idea behind these methods is to draw samples from the stationary distribution of a Markov chain. When set up correctly, this distribution produces samples from the posterior distribution. The marginal likelihood itself, however, is relevant for model selection and we will return to its estimation in the “Model selection” section.
MetropolisHastings sampling
The terms are as defined previously and we note that the marginal likelihood term has conveniently canceled out in denominator. The proposal q(··) is usually taken to be a Gaussian, however, we note that in our case, the number of sites parameter, N_{s}, is discrete. Accordingly, we define the proposal density as a mixed density. With probability, p<1/10, we perturb N_{s} by either increasing or decreasing it by 1 with equal probability, while keeping the rest of the parameters unchanged. Else, we perturb each of the other parameters based on a Gaussian centered at the current value of the parameters, θ_{i} and with variance 0.1I, where I is the identity matrix. We use bounded uniform prior on all the parameters.
Parallel tempered MCMC sampling
 1
Initial start positions are assigned to each chain, Θ=(θ_{1},…,θ_{N})
 2
Associate each chain with a temperature based on a temperature ladder, (Θ,t)=(θ_{1},t_{1},…θ_{N},t_{N})
 3Repeat till convergence of all chains
 (a)
Apply local MetropolisHastings update step to each chain
 (b)
Pick two neighboring chains at different temperature. Assume states θ_{i} and θ_{j} for N pairs (i,j) with i sampled uniformly in (1,…,N) and j=i±1 with probability p_{e}(θ_{i},θ_{j}) where p_{e}(θ_{i},θ_{i+1})=p_{e}(θ_{i},θ_{i−1})=0.5 and p_{e}(θ_{1},θ_{2})=p_{e}(θ_{N},θ_{N−1})=1
 (c)
Exchange the state of the chains based on acceptance ratio
 (a)
 4
Use chain with lowest temperature for estimating posterior density
While the chain at the lowest temperature can be used for parameter inference, all the chains together can be used to estimate the marginal likelihood [25] and in turn calculate Bayes factors for Bayesian model comparison for model ranking. It is this aspect that we turn to next.
Model selection
The quantity in brackets is the ratio of the marginal likelihoods of the two models and is termed the Bayes factors. When we have no prior preference of one model over the other, we assume p(M_{1})=p(M_{2}) and then the ratio of likelihoods is exactly equal to the Bayes factor. In essence, then, the problem of model selection boils down to the problem of estimating the marginal likelihood.
However, in practice this is a poor estimator unless working with very large sample sizes. Similarly, the importance sampling based the posterior harmonic mean estimator has been shown [42, 43] to be a very poor estimator.
The expectation with respect to the posterior at each temperature on the ladder can be approximated using the Monte Carlo estimate. For all the models we used a temperature schedule with N=10 according to an exponential ladder \(t_{i} = \left (\frac {i}{N}\right)^{5}, i = 1,..., N\) as suggested in [25].
Model overfitting
The process of model selection described above helps guard against choosing overparameterized models by penalizing them implicitly for higher dimensionality. This ability of Bayes factors to prioritize simpler models over complex ones has also been discussed elsewhere [28, 41].
However, as we consider relative goodness of fit amongst models, there might still be an argument that the best chosen model does overfit the data. One way to test model overfitting is crossvalidation [44]. In such an approach, usually, we can envisage excluding some of the data (validation set) during model fitting step and then testing the accuracy of the model on this heldout data set. An overfit model would perform well on the fitted data but poorly on the heldout dataset.
 1
We fit the model to the data y_{1},⋯,y_{m}, where m is chosen such that 1,⋯,m corresponds to the first 60% of the data, drawn sequentially across the embryo axis.
 2
We use the fitted model to predict for the next 5% of observations and compute the loglikelihood.
 3
Repeat steps 1 & 2, adding 5% of the data set to training set and predict the next 5%.
 4
Finally, compute the mean loglikelihood from the predictions made above.
As our data is stratified, we ensure that the training set draws evenly from expression observation of the gap genes, i.e. we pick the initial 60% of the observations from each of the four gap genes to train the model. Similarly, predictions are made on the next 5% of the observations for each gap gene.
The models, solver and MCMC sampler were coded using the python programming language. PyMC [45] was used for certain diagnostic visualizations. The code for reproducing the analysis is available on GitHub at the repository: https://github.com/asifzubair/BayesianModelSelection.
Results
The Drosophila gap gene network has been the subject of intense study from both experimentalists and computational modelers. Despite this, efforts to compare proposed network hypothesis in a statistically rigorous manner have been few and far between. Here, we propose to use the Bayesian framework for doing parameter inference and model selection. The Bayesian framework permits one to do a fully probabilistic analysis of model system allowing one to account for uncertainty in parameter estimates and model fit. We employ an MCMC approach using the parallel tempering (PTMCMC) sampler to do Bayesian analysis. This sampler not only allows for better convergence but also helps one to compute the thermodynamic estimator for marginal likelihood. Other sampling approaches for accelerating convergence like adaptive MCMC [46] and Hamiltonian Monte Carlo (HMC) [47] exist. However, these samplers require all the parameters to be continuous whereas the PTMCMC sampler does not have such a restriction. In addition, they do not have the benefit of providing a natural way to estimate the marginal likelihood like the PTMCMC sampler does. Using estimates of the marginal likelihood, we use Bayes factor to compare between models.
Specifications for all 6 models evaluated
Models  A6  B7  B7r  C8  D7  D8 

Global parameters:  
Affinity(logKa)  K  K  K  K  K  K 
Cooperativity  C _{ o}  C _{ o}  C _{ o}  C _{ o}  C _{ o}  C _{ o} 
Binding Sites  N _{ s}  N _{ s}  N _{ s}  N _{ s}  N _{ s}  N _{ s} 
Syn./Decay  α  α  α  α  α  α 
Diffusion  D  D  D  D  D  D 
Max. conc  50  50  50  50  50  50 
Nodespecific binding affinities:  
Bcd ^{ A}  K  K  K _{3}  K _{3}  K  K _{3} 
Bcd ^{ R}  K  K  K  K  K  K 
Cad ^{ A}  K  K  K  K  K  K 
Hb ^{ A}  K  K  K  K  K  K 
Hb ^{ D}  K  K  K  K  K  K 
Hb ^{ R}  K _{1}  K _{1}  K _{1}  K _{1}  K _{1}  K _{1} 
Gt ^{ R}  K  K  K  K  K  K 
Kr ^{ R}  K _{1}  K _{2}  K _{1}  K _{2}  K _{2}  K _{2} 
Kni ^{ R}  K  K  K  K  K  K 
Tll ^{ R}  K  K  K  K  K  K 
Open Parameters:  6  7  7  8  7  8 
In their paper, Papatsenko & Levine [26] fit each of the models (A6, B7 and C8) separately by maximizing an objective function based on the correlation measured between the model and the data. They use the final correlation value to distinguish between the models. Their formulation and analysis showed that the gap gene network can be modeled using a more modular approach, involving two relatively independent network domains. In addition, they show close agreement of parameter estimates and experimentally observed values for most parameters. However, their approach to compare the models themselves is slightly problematic as it does not apply appropriate penalties for increasing model dimensionality. Bayes factors apply this penalty implicitly and so adhere to the notion of Occam’s razor of favoring simple hypothesis over complex ones. Moreover, Papatsenko & Levine do not offer a measure of statistical significance to justify model choice and rely on an adhoc notion of overfitting. We enhance their fundamentally sound approach by allowing for statistically rigorous model selection and also allow for comparing competing network hypothesis.
Efficient model solver
Convergence of MCMC runs
Time to convergence for MCMC samplers can be sensitive to initial start points. To overcome this, some approaches try to initialize the sampler from the MLE estimate of the likelihood function. This approach suffers from the same pitfalls as optimization algorithms, in that the sampler may not sample the whole likelihood space and the evidence of convergence may be misleading.
To ensure that the sampler had indeed converged, we initialized the chain from random start points drawn from a uniform prior. We used the GelmanRubin statistic [50] to monitor convergence of the chains. This diagnostic uses multiple chains to check for lack of convergence, and is based on the notion that if multiple chains have converged, by definition they should appear very similar to one another. The GelmanRubin statistic uses an analysis of variance approach to assessing convergence by calculating both the betweenchain variance and withinchain variance to assess whether chains have indeed converged. We used the gelman.plot() function from the R [51] package coda [52] to plot the GelmanRubin statistic. It calculates the GelmanRubin shrink factor (R) repeatedly, first calculating with 50 observations and then adding bins of 10 observations iteratively. For convergence, we would ideally want the shrink factor to be below 1.2.
Marginal likelihood and Bayes factors
Criteria due to Kass & Rafferty [22] for interpretation of Bayes factor as evidence support categories
2log_{e}(B)  B  Evidence against H_{0} 

0 to 2  1 to 3  Not worth more than a bare mention 
2 to 6  3 to 20  Substantial 
6 to 10  20 to 150  Strong 
>10  >150  Very strong 
Gene expression profiles
Overfitting analysis
We tested the best performing model (according to Bayes factor criteria), model B7r, for overfitting. We used a modified crossvalidation (CV) approach for testing overfitting (see methods). In each CVfold, we fit the model to the training set and then draw 100 samples from the posterior parameter distribution. The posterior samples are used to predict values for the held out set. We use the mean loglikelihood metric as prediction accuracy measure. As a Gaussian error model is used, the mean loglikelihood is proportional to the residual error in this case. The mean loglikelihood for the cross validation set is 0.314 (±0.024). The mean loglikelihood using samples from posterior parameter distribution generated using the complete data is 0.326 (±0.061). Using a Student’s ttest with Welch modification, we found the difference in means to not be statistically significant (P>0.05) indicating that the model doesn’t overfit the data.
Discussion
Recovering gene regulatory network information from expression data is a key problem in systems biology. Particularly in the study of the segmentation pathway for early Drosophila embryo, various modeling approaches have been taken [12, 17, 26]. However, most of these modeling approaches rely on the assessment of a single candidate model. This sort of approach has been previously argued against [53] as it doesn’t pay heed to competing hypotheses and hence, other plausible explanations. In addition, inference in these approaches rely on optimization techniques which do not account for uncertainty in experimental measurements. Optimization approaches try to offer measures of parameter certainty through sensitivity analysis but, barring certain studies [54], the issue of comparing models has been largely unadressed.
We do note that there have been some attempts [55] at doing model selection in the Drosophila embryo. However, the application of a structured framework in which models can be compared is still elusive. Doing the analysis in a Bayesian framework provides a more standard procedure to address both the issues of performing inference regarding different models and to assess the certainty of parameter estimates. An important issue when working with dynamical models is the issue of identifiability [49, 56, 57, 58]  the ability to uniquely estimate parameters of the model given the data. In the Bayesian context, a priori identifiability issues can be detected by examining the covariance structure of the full parameter posterior distribution. Parameters that are confounded will be tightly correlated. Identifiability issues can be surmounted by providing a more informative prior that more tightly constrains confounded parameters. In our case, however, we have chosen to work with uniform priors to indicate that our knowledge of the system is still evolving. Indeterminacy of model parameters are incorporated into the marginal likelihood, allowing one to still perform model selection. However, parameter relationships can still uncover important mechanisms. In our study, we find that the parameters for binding affinity and number of sites are negatively correlated (Additional file 1: Figure S3). Such a relationship is expected as it indicates that a transcription factor can modulate gene expression by either binding strongly to a few sites or through weak binding to multiple sites. Similar to our study, Chertkova et al. [59], show that loss of transcription factor binding sites in in silico models results in increase in binding affinity of transcription factors, supporting negative correlation between these parameters in order to maintain gene expression.
In the Bayesian framework, Bayes factors provide a means of doing model selection and have been employed to compare between ODE based models [25, 42, 60, 61]. We show here that similar approaches can be used for doing model selection in the context of PDE models for spatial patterning. An advantage of the Bayesian model selection paradigm using Bayes factors is that it doesn’t require models to be nested, i.e models need not follow a set hierarchy where all models may be derived from an extended parameterized model. This particularly advantageous when we attempt to test hypotheses involving different network topologies. Samples from the posterior of parameter distribution were generated using the parallel tempering (PTMCMC) sampler. This sampling approach can be easily combined with the numerically stable thermodynamic integration method to estimate marginal likelihood for each of the competing models. These estimates in turn can be used to compute Bayes factors. Our analysis shows that besides the global binding affinity parameter, a different nodespecific parameter is required for describing the regulatory effect of Bicoid on its target genes. This may point to the fact that the molecular mechanism of activation by Bicoid is different from other maternal/gap genes. The nodespecific Bicoid binding affinity parameter helps account for a posterior shift of Hunchback expression. A candidate hypothesis for the activation of Kruppel by Bicoid was also tested for. Our analysis offers little support for the activation of Kruppel by Bicoid.
We point out that as the computation of posterior probabilities in Bayesian analysis involves integration over highdimensional parameter spaces, sampling from higher dimensions becomes increasingly difficult. This is a particular limitation for the large parameter models that we see in systems biology. While there has been some progress in Bayesian parameter estimation in highdimensions [62], this problem is far from solved. However, there might be some justification in criticism that these highdimension models also tend to be overparameterized and thus too flexible. One approach would be do a hierarchical Bayesian analysis [63] to constrain parameter sets in order to prevent the problem of overfitting and estimation in higher dimensions.
Conclusion
This study aims to elaborate on a Bayesian framework for conducting model selection in context of the Drosophila early developmental segmentation pathway. In particular, we focus on identifying regulatory interactions for the gap gene network. Our study seeks to provide a statistical framework in which predicted experimental hypothesis can be tested. In addition, the model selection procedure also ensures that a minimal model for gap gene expression can be formulated. In order to conduct such an analysis, we provide an efficient solver for the PapatsenkoLevine formulation. In conclusion, we find that a seven parameter model with a nodespecific binding affinity to describe regulatory action of Bicoid on the gap genes explains the data adequately.
Notes
Acknowledgements
Authors gratefully acknowledge help and advice from Dmitri Papatsenko and for generously donating his time to clarify model specifications and implementation.
Funding
SVN & PM acknowledge support from NIH grants, UO1 GM103804 & RO1 MH100879. AZ acknowledges support from University of Southern California (USC) Provost fellowship. The funding bodies played no role in the design of the study and analysis and interpretation of data and in writing the manuscript.
Availability of data and material
The dataset supporting the conclusions of this article is available in the github repository, https://github.com/asifzubair/BayesianModelSelection.
Authors’ contributions
AZ, PM & SVN conceptualized the problem definition. AZ & IGR designed the model solver. AZ coded the models, solver and inference procedure. AZ wrote the initial draft of the manuscript. All authors edited, refined and approved the final manuscript.
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary material
References
 1.Bover T. Die polaritat von ovocyte, ei, und larve des strongylocentrus lividus. Zoo Jahrb Abt Anat Ont Thi. 1901; 14(384).Google Scholar
 2.Horstadius S. Uber die determination im verlaufe der eiaschse bei seeigeln. Publ Staz Zool Napoli. 1935; 14:251–479.Google Scholar
 3.Spemann H, Mangold H. Induction of embryonic primordia by implantation of organizers from a different species. Roux’s Arch Entw Mech. 1924; 100:599–638.Google Scholar
 4.Turing AM. The chemical basis of morphogenesis. Philos Trans R Soc Lond B Biol Sci. 1952; 237:757–66.Google Scholar
 5.NussleinVolhard C, Wieschaus E. Mutations affecting segment number and polarity in Drosophila. Nature. 1980; 287:795–801.PubMedCrossRefGoogle Scholar
 6.NussleinVolhard C, Fronhofer HG, Lehmann R. Determination of anteroposterior polarity in Drosophila. Science. 1987; 238:1675–81.PubMedCrossRefGoogle Scholar
 7.Driever W, NussleinVolhard C. The bicoid protein determines position in the Drosophila embryo in a concentration dependent manner. Cell. 1988; 54:95–104.PubMedCrossRefGoogle Scholar
 8.Driever W, NussleinVolhard C. A gradient of bicoid protein in Drosophila embryos. Cell. 1988; 54:83–93.PubMedCrossRefGoogle Scholar
 9.Jaeger J. The gap gene network. Cell Mol Life Sci. 2011; 68:243–74.PubMedCrossRefGoogle Scholar
 10.Hulskamp M, Pfeifle C, Tautz D. A morphogenetic gradient of hunchback protein organizes the expression of the gap genes kruppel and knirps in the early Drosophila embryo. Nature. 1990; 346:577–80.PubMedCrossRefGoogle Scholar
 11.Kraut R, Levine M. Mutually repressive interactions between the gap genes giant and kruppel define middle body regions of the Drosophila embryo. Development. 1991; 111:611–21.PubMedGoogle Scholar
 12.Jaeger J, Blagov M, Kosman D, Kozlov KN, et al M. Dynamical analysis of regulatory interactions in the gap gene system of drosophila melanogaster. Genetics. 2004; 167:1721–31.PubMedPubMedCentralCrossRefGoogle Scholar
 13.Jaeger J. Modelling the Drosophila embryo. Mol BioSys. 2009; 5:1549–68.CrossRefGoogle Scholar
 14.Jaeger J, Manu, Reinitz J. Drosophila blastoderm patterning. Curr Opin in Genet & Dev. 2012; 22:1–9.CrossRefGoogle Scholar
 15.Papatsenko D, Levine M. Dual regulation by the hunchback gradient in the Drosophila embryo. Pro NAtl Acad Sci USA. 2008; 105:2901–6.CrossRefGoogle Scholar
 16.Zinzen RP, Papatsenko D. Enhancer responses to similarly distributed antagonistic gradients in development. PLoS Comp Biol. 2007; 3:1–10.CrossRefGoogle Scholar
 17.Perkins TJ, Jaeger J, Reinitz J, Glass L. Reverse engineering the gap gene network of drosophila melanogaster. PLoS Comp Bio. 2006; 2(5):51.CrossRefGoogle Scholar
 18.Stigler SM. The epic story of maximum likelihood. Stat Sci. 2007; 22(4):598–620.CrossRefGoogle Scholar
 19.Hastings WK. Monte carlo sampling methods using markov chains and their applications. Biometrika. 1970; 57(1):97–109.CrossRefGoogle Scholar
 20.Beaumont MA, Rannala B. The bayesian revolution in genetics. Nat Rev Genetics. 2004; 5:251–61.PubMedCrossRefGoogle Scholar
 21.Wilkinson DJ. Bayesian methods in bioinformatics and computational systems biology. Briefings in Bioinformatics. 2007; 8(2):109–16.PubMedCrossRefGoogle Scholar
 22.Kass RE, Raftery AE. Bayes factors. J Amer Stat Assoc. 1995; 90(430):773–95.CrossRefGoogle Scholar
 23.Geyer CJ. Markov chain monte carlo maximum likelihood. In: Computing Science and Statics Proceedings of the 23rd Symposium on the Interface: 1991. p 156.Google Scholar
 24.Gelman A, Meng XL. Simulating normalizing constants: From importance sampling to bridge sampling to path sampling. Stat Sci. 1998; 13(2):163–85.CrossRefGoogle Scholar
 25.Calderhead B, Girolami M. Estimating bayes factors via thermodynamic integration and population mcmc. Comp Stats & Data Anal. 2009; 53(12):4028–45.CrossRefGoogle Scholar
 26.Papatsenko D, Levine M. The drosophila gap gene network is composed of two parallel toggle switches. PLoS One. 2011; 6(7):21145.CrossRefGoogle Scholar
 27.Janssens H, Hou S, Jaeger J, Kim AR, Myasnikova EEA. Quantitative and predictive model of transcriptional control of the drosophila melanogaster even skipped gene. Nat Genet. 2006; 38(10):1159–65.PubMedCrossRefGoogle Scholar
 28.Jefferys W, Berger J. Ockham’s razor and bayesian analysis. Amr Sci. 1992; 80:64–72.Google Scholar
 29.Pisarev A, Poustelnikova E, Samsonova M, Reinitz J. Flyex, the quantitative atlas on segmentation gene expression at cellular resolution. Nuclec Acids Res. 2009; 37:560–6.CrossRefGoogle Scholar
 30.Hindmarsh AC, Brown PN, Grant KE, Lee SL, Serban R, Shumaker DE, Woodward CS. SUNDIALS: Suite of nonlinear and differential/algebraic equation solvers. ACM Trans Math Softw (TOMS). 2005; 31(3):363–96.CrossRefGoogle Scholar
 31.Li Q, Ito K, Wu Z, Lowry CS, Loheide II SP. Comsol multiphysics: A novel approach to ground water modeling. Ground Water. 2009; 47(4):480–7. https://doi.org/10.1111/j.17456584.2009.00584.x.CrossRefGoogle Scholar
 32.Bangerth W, Hartmann R, Kanschat G. deal.II – a General Purpose Object Oriented Finite Element Library. ACM Trans Math Softw. 2007; 33(4):24/1–24/27.CrossRefGoogle Scholar
 33.Garikipati K. Perspectives on the mathematics of biological patterning and morphogenesis. J Mech Phys Solids. 2017; 99:192–210.CrossRefGoogle Scholar
 34.Murphy L, Venkatraman C, Madzvamuse A. Parameter identification through mode isolation for reaction–diffusion systems on arbitrary geometries. Int J Biomath. 2018; 11(13):1850053–83.CrossRefGoogle Scholar
 35.Albert PJ, Schwarz US. Dynamics of Cell Shape and Forces on Micropatterned Substrates Predicted by a Cellular Potts Model. Biophys J. 2014; 106:2340–52.PubMedPubMedCentralCrossRefGoogle Scholar
 36.Lions JL. Optimal Control of Systems Governed by Partial Differential Equations. New York: Springer; 1971.CrossRefGoogle Scholar
 37.Pazy A. Semigroups of Linear Operators and Applications to Partial Differential Equations. New York: Springer; 1983.CrossRefGoogle Scholar
 38.Bayes T, Price R. An essay towards solving a problem in the doctrine of chances. by the late rev. mr. bayes, communicated by mr. price, in a letter to john canton, ma and frs. Philo Trans Royal S Lon. 1763; 53(0):370–418.Google Scholar
 39.Gilks WR, Richardson S, Spiegelhalter D. Markov Chain Monte Carlo in Practice. London: Chapman and Hall/CRC; 1995.CrossRefGoogle Scholar
 40.Mohamed L, Calderhead B, Filippone M, Christie M, Girolami M. Population mcmc methods for history matching and uncertainty quantification. Comput Geosci. 2012; 16:423–36.CrossRefGoogle Scholar
 41.Girolami M. Bayesian inference for differential equations. Theor Comp Sci. 2008; 4(16):4–16.CrossRefGoogle Scholar
 42.Vyshemirsky V, Girolami MA. Bayesian ranking of biochemical system models. Bioinformatics. 2008; 24(6):833–9.PubMedCrossRefGoogle Scholar
 43.Meng XL, Wong WH. Simulating ratios of normalization constants via a simple identity: A theoretical exploration. Stat Sinica. 1996; 6:831–60.Google Scholar
 44.Kohavi R. A study of crossvalidation and bootstrap for accuracy estimation and model selection. Morgan Kaufmann: 1995. p. 1137–43.Google Scholar
 45.Patil A, Huard D, Fonnesbeck CJ. Pymc: Bayesian stochastic modelling in python. J Stat Softw Artic. 2010; 35(4):1–81. https://doi.org/10.18637/jss.v035.i04. https://www.jstatsoft.org/v035/i04.Google Scholar
 46.Andrieu C, Thoms J. A tutorial on adaptive mcmc. Stat Comput. 2008; 18(4):343–73. https://doi.org/10.1007/s112220089110y.CrossRefGoogle Scholar
 47.Betancourt M. A conceptual introduction to hamiltonian monte carlo. 2017. arXiv.Google Scholar
 48.Knipple DC, Seifert E, Rosenberg UB, Preiss A, Jäckle H. Spatial and temporal patterns of krüppel gene expression in early drosophila embryos. Nature. 1985; 317:40–4.PubMedCrossRefGoogle Scholar
 49.Becker K, BalsaCanto E, CicinSain D, Hoermann A, Janssens H, Banga JR, Jaeger J. Reverseengineering posttranscriptional regulation of gap genes in drosophila melanogaster. PLOS Comput Biol. 2013; 9(10):1–16. https://doi.org/10.1371/journal.pcbi.1003281.CrossRefGoogle Scholar
 50.Brooks SP, Gelman A. General methods for monitoring convergence of iterative simulations. J Comput Graph Stat. 1997; 7:434–55.Google Scholar
 51.R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2008. http://www.Rproject.org. ISBN 3900051070.Google Scholar
 52.Plummer M, Best N, Cowles K, Vines K. Coda: Convergence diagnosis and output analysis for mcmc. R News. 2006; 6(1):7–11.Google Scholar
 53.Chamberlin TC. The method of multiple working hypotheses. Science. 1890; 15:92–6.Google Scholar
 54.RodriguezFernandez M, Rehberg M, Kremling A, Banga JR. Simultaneous model discrimination and parameter estimation in dynamic models of cellular systems. BMC Syst Biol. 2013; 7(76):10–118617520509776.Google Scholar
 55.van Dassow G, Meir E, Munro EM, Odell GM. The segment polarity network is a robust developmental module. Nature. 2000; 406:188–92.CrossRefGoogle Scholar
 56.Raue A, Karlsson J, Saccomani MP, Jirstrand M, Timmer J. Comparison of approaches for parameter identifiability analysis of biological systems. Bioinformatics. 2014; 30(10):1440–8. https://doi.org/10.1093/bioinformatics/btu006.PubMedCrossRefGoogle Scholar
 57.Chis OT, Banga JR, BalsaCanto E. Structural identifiability of systems biology models: A critical comparison of methods. PLOS ONE. 2011; 6(11):1–16. https://doi.org/10.1371/journal.pone.0027755.CrossRefGoogle Scholar
 58.Villaverde AF, Barreiro A, Papachristodoulou A. Structural identifiability of dynamic systems biology models. PLOS Comput Biol. 2016; 12(10):1–22. https://doi.org/10.1371/journal.pcbi.1005153.CrossRefGoogle Scholar
 59.Chertkova AA, Schiffman JS, Nuzhdin SV, Kozlov KN, Samsonova MG, Gursky VV. In silico evolution of the Drosophila gap gene regulatory sequence under elevated mutational pressure. BMC Evol Biol. 2017; 17(1):4.PubMedPubMedCentralCrossRefGoogle Scholar
 60.Schmidl D, Hug S, Li WB, Greiter MB, Theis FJ. Bayesian model selection validates a biokinetic model for zirconium processing in humans. BMC Sys Bio. 2012; 6(95):95.CrossRefGoogle Scholar
 61.MiliasArgeitis A, Porreca R, Summers S, Lygeros J. Bayesian model selection for the yeast gatafactor network: a comparison of computational approaches. In: 49th IEEE Conference on Decision and Control: 1517 Dec. 2010; Atlanta. IEEE: 2010. p. 3379–84.Google Scholar
 62.Hug S, Raue A, Hasenauer J, Bachmann J, Klingmuller U, Timmer J, Theis FJ. Highdimensional bayesian parameter estimation: Case study for a model of jak2/stat5 signaling. Math Biosci. 2013; 246:293–304.PubMedCrossRefGoogle Scholar
 63.Carlin B, Louis T. Empirical bayes: Past, present and future. J Am Stat Assoc. 2000; 95(452):1286–9. https://doi.org/10.1080/01621459.2000.10474331.CrossRefGoogle Scholar
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.