Background

Understanding the dynamical behavior of living cells from their complex genomic regulatory networks is a challenge posed in systems biology; yet it is one of critical importance (i.e., morphogenesis). Gene expression data can be used to infer, or reverse-engineer, the underlying genomic network to analyze the interactions between the molecules. Unfortunately, most of the existing work on reverse-engineering genomic regulatory networks estimates one single static network from all available data, which is often collected during different cellular functions or developmental epochs. The idea that molecular networks are remodeled as a function of time and stage is well understood; this conclusion is supported by the developmental networks of sea urchin embryos [1]. Throughout a cellular process, such as cancer progression or anticancer therapy, there may exist multiple underlying “themes” that determine the functionalities of each molecule and their relationships to others, and such themes are dynamic. In signal processing terms, summarizing gene expression data, that comes from different cellular stages, into one network would be similar to characterizing a non-stationary signal by its Fourier spectrum. Biologically, static networks cannot reveal regime-specific or key transient interactions that lead to biological changes.

One of the challenges of inferring a time-varying network is that there are only a few observations available at each time point. This small sample size is amplified by the high dimension of every sample, leading to a small n large p problem (i.e., more variables than observations). In particular, the system is under-determined. However, exploiting the fact that molecular networks are sparse, one can use compressive sensing to find a solution. Compressive sensing is concerned with the optimal reconstruction of a sparse signal from an under-determined linear system [2, 3]. Under-determined systems are quite common in computational biology/ecology, and the application of compressive sensing to solve these under-determined systems in nature has been a popular solution [46]. Compressive sensing theory states that, under the restricted isometry property (RIP), the optimal sparsest solution of a linear system is equivalent to the minimum l1-norm solution [2, 3]. Unfortunately, it is almost impossible to check whether a linear system satisfies the RIP condition. In general, the minimum l1-norm solution can be far from the optimal sparse solution.

In our previous work [4], we addressed the problem of under-sampled sparse systems by proposing a new energy-weighted likelihood function that ensures the convergence of the likelihood function for under-determined systems with unknown covariance. The approach was coined Small sample MUltivariate Regression with Covariance estimation (SMURC) and was applied to infer the wing-muscle gene regulatory networks of the Drosophila Melanogaster during the four phases of its development [4]. However, the estimated networks at every epoch used only the data in the corresponding epoch. In particular, the larval network ignored all the measurements in the previous embryonic phase, and so was the case for the subsequent stages. Other research efforts have been proposed to address the problem of recovering time-varying gene regulatory networks by using dynamic Bayesian models [7], non-parametric Bayesian regression [8], and random graph models [9].

In this contribution, we introduce a new approach to modeling sparse time-varying networks and their applications to gene regulatory networks that are based on our recent work [10, 11]. We start by projecting the Kalman solution onto an “approximately sparse” space by using l1-regularization. We further expand upon our previous work by using a Kalman smoother. We then explore the l1-neighborhood for sparser solutions by leveraging our recent compressive sensing technique known as Kernel RecONstruction (KRON) [12]. KRON recovers the optimal sparsest solution whether the RIP condition is satisfied or not. However, KRON’s computational complexity is still exponential in the number of parameters p. We, therefore, advance Approximate KRON (AKRON) [11], which builds growing neighborhoods of the l1 solution that moves towards the optimal sparsest solution and eventually reaches it. The size of the neighborhood is tunable depending on the computational resources available. We derive theoretical bounds of optimality. The AKRON Kalman filter is validated on synthetic and real-world data sets.

The state-space model and Kalman filter

Following the works in [4, 13], we model the network dynamics using a state space model. The system equation is given by a random walk model, which reflects a lack of prior knowledge of the network topological changes. The observation equation is given by a first-order differential equation, whose parameters reflect the strength and sign of interactions (positive and negative are activating and repressing, respectively) [14]. The state space model of the incoming edges, \(\mathbf {a}_{i} \in \mathbb {R}^{p}\), for gene i can be shown to be [13]

$$\begin{array}{*{20}l} \mathbf{a}_{i}(k+1) =& \mathbf{a}_{i}(k)+\boldsymbol{w}_{i}(k). \end{array} $$
(1)
$$\begin{array}{*{20}l} \mathbf{y}_{i}(k) =& \mathbf{X}^{\mathrm{T}}(k)~ \mathbf{a}_{i}(k)+\boldsymbol{v}_{i}(k), \end{array} $$
(2)

where i=1,⋯,p and p is the number of genes. \(\boldsymbol {X}(k) \in \mathbb {R}^{p \times n}\) is the gene expression matrix at time k. yi(k) is the rate of change of expression of gene i at time k. wi(k) and vi(k) are the process and observation noise, respectively. These noise processes are assumed to be zero mean Gaussian noise processes with the known covariances Qk and Rk, respectively, and uncorrelated to the state vector ai(k). The full connectivity matrix, A(k), can be recovered by simultaneous parallel recovery of its rows \(\boldsymbol {a}_{i}^{t}(k)\) at every time instant k. Thus, we can process each gene in parallel. The Kalman filter can be used to track a(k) [13, 15]; however, this is only if the system is observable. The problem with using a Kalman filter in our setting is that the system is under-determined (i.e., more variables than equations, p>n). This problem, however, can be circumvented by taking into account the sparsity of the vector ai(k). Since each gene in the genomic regulatory network is governed by only a small number of other genes, these networks are known to be sparse. Furthermore, we have also experimented with a Kalman Smoother that is applied after the Kalman filter. Note that the Kalman Smoother is optional. A Kalman smoother can reduce the covariance of the optimal estimate. We implemented Rauch et al.’s Kalman smoothing algorithm [16] and we compared AKRON-Kalman both with and without smoothing.

The pseudo code for the proposed Kalman filtering approach is shown in Fig. 1.

Fig. 1
figure 1

AKRON-Kalman Tracker+Smoother

Constrained Kalman filtering

It is known that the connectivity of the edges in gene regulatory networks [13] is sparse. Unfortunately, the output of the Kalman filter is likely not going to be sparse. We first start by projecting the Kalman solution at time k, ak|k, onto the set of “approximately sparse” vectors by solving the following Lasso problem [17]:

$$ \boldsymbol{a}_{k|k}^{*} = \arg\min_{\boldsymbol{a}\in \mathbb{R}^{p}} \left\{(1 - \alpha) \left\|\boldsymbol{a}_{k|k} - \boldsymbol{a}\right\|_{2}^{2} + \alpha \|\boldsymbol{a}\|_{1}\right\}, $$
(3)

where α∈[0,1] controls the tradeoff between the Kalman estimate and sparsity. An α close to zero will result in a solution that is close to the Kalman estimate, but that may not be sparse. The opposite happens when α is close 1, which will produce a sparser solution, but may be far from the Kalman estimate. This is achieved by minimizing the reconstruction error (i.e., the first term in (3)).

AKRON: a search for a sparser solution

Consider the following l0-optimization problem, which finds the optimal sparsest solution in a linear under-determined system.

$$\begin{array}{*{20}l} \boldsymbol{x}^{*} = &\arg \min_{\boldsymbol{x} \in \mathbb{R}^{p}} \|\boldsymbol{x}\|_{0} \\ & \text{s.t.} \; \boldsymbol{\Phi}\boldsymbol{x} = \boldsymbol{y} \end{array} $$
(4)

where ∥·∥0 is the l0-norm, which is defined as the support of the vector, \(\boldsymbol {x} \in \mathbb {R}^{p}\), \(\boldsymbol {y} \in \mathbb {R}^{n}\), and \(\boldsymbol {\Phi } \in \mathbb {R}^{n \times p}\). We consider the scenario where pn and denote in the sequel s=pn. Without loss of generality, Φ is assumed to be full-rank. Compressive sensing theory [3] shows that, under the Restricted Isometry Property (RIP) condition on the matrix Φ, the l1-norm solution is equivalent to the l0-norm solution. Unfortunately, it is impossible to check if the RIP condition is satisfied for a given matrix. Despite this strict condition, l1 has been routinely used to find a sparse solution in systems of the form (4).

The proposed Approximate Kernel RecONstruction (AKRON) is an approximation to computationally complex Kernel RecONstruction (KRON) problems [12]. KRON is able to achieve an exact solution to (4), but the algorithm becomes computationally expensive for typically p>15. AKRON, detailed below, is introduced to balance the trade-off between the computational resources that are available and the accuracy of the reconstruction.

The Kalman filter estimate is first sparsified by incorporating l1 regularization in (3) (line 5 of Fig. 1). However, the l1 projection is not guaranteed to be the optimal sparsest solution. AKRON-Kalman filter (AKRON-KF) starts off from the l1-regularized Kalman estimate in (3). Then, the s=pn smallest elements of the l1 projection in (3) are set to zero. The logic behind this strategy is to use the l1-projection to guess the position of the zeros in the optimal solution. Given that the kernel of the system matrix Φ in (3) has dimension s, we know that if s zero locations are correctly set, then the optimal sparsest solution can be exactly found by solving the linear system in (4) [12].

Following this reasoning, AKRON finds a sparser solution by exploring δ-neighborhoods of the l1-projection. The central idea behind AKRON’s δ-neighborhoods is as follows: (i) find the indices with the (s+δ) smallest magnitudes of the l1 solution, (ii) set exactly s of these indices to zero, (iii) re-solve the system Φx=y. All the possible \(s+\delta \choose s\) combinations of the smallest elements in the solution of (3) are evaluated. This idea can also be viewed as a “perturbation” of the l1 approximation to make it closer to the l0-norm. The size of the neighborhood δ is tunable depending on the computational power available, and vary from 0 (l1-approximation) to n (KRON, i.e., perfect reconstruction).

Example: To understand the AKRON algorithm and illustrate the importance of the δ-neighborhoods, we present a simple numerical example. Consider the following randomly generated noiseless system as in (4):

$$\begin{array}{*{20}l} \boldsymbol{\Phi} =& \left(\begin{array}{ccccc} -0.4588 & 1.5977 & -0.8724 & -0.1121 & -1.3068 \\ 0.2942 & 3.0954 & -1.0530 & 0.3454 & 1.5257 \\ -0.1948 & -0.7558 & -0.9756 & 0.1549 & 0.9586 \\ \end{array} \right); \\ \boldsymbol{y} =& \left(\begin{array}{ccc} -1.2316 & 1.1739 & 0.8135 \\ \end{array} \right)^{T} \end{array} $$

The optimal 0-norm sparsest solution is given by

$$ \boldsymbol{x}^{*} = \left(\begin{array}{ccccc} 0 & 0 & 0 & -1.2372 & 1.04858 \\ \end{array} \right)^{T} $$
(5)

The 1 solution, which solves (4), is given by

$$ \widehat{\boldsymbol{x}}_{1} = \left(\begin{array}{ccccc} 0.0 & -0.034 & 0.047 & 0.0 & 0.870 \\ \end{array} \right)^{T} $$

Clearly, the l1-solution is not as sparse as the optimal solution and has incorrect zero locations. We have n=3,p=5 and thus s=2. If we choose δ=1 the AKRON considers the s+δ=3-smallest magnitudes of \(\widehat {\boldsymbol {x}}_{1}\), which are located at indices 1, 2 and 4. We set s=2 locations to zero among these 3 indices. We consider all \({s+\delta \choose s} = {3 \choose 2} = 3\) combinations of two zeros in indices 1, 2 and 4 of \(\widehat {\boldsymbol {x}}_{1}\). The combination of indices 1 and 2 set to zero leads to the sparsest optimal solution x in (5). Thus, in this case, the 1-norm solution is sub-optimal; but by considering a δ=1-neighborhood of this approximation, AKRON is able to exactly recover the sparsest optimal l0-solution.

In the noisy case, where the constraint in (4) is replaced by ∥Φxy∥≤ε, with ε being a given noise threshold level, the neighborhood δ is chosen adaptively as follows: set the s smallest magnitudes of the l1 solution to zero; compute the observation error ∥Φxy∥. If this error is smaller than the energy of the noise, we adopt this solution. Otherwise, the next smallest element is set to zero and the error is recalculated.

In the following propositions, we investigate under which assumptions on the entries of the l1 solution and its closeness to the l0 solution, will AKRON yield the optimal l0 solution.

Proposition 1

Consider the system in (4) with the optimal l0-solution x having k>s zeros. Consider the l1-solution, x1, and assume that ∥x1x2ε. Let J denote the number of indices j such that

$$\begin{array}{*{20}l} |(\boldsymbol{x}_{1})_{j}| \leq \frac{\epsilon}{\sqrt{k-s+1}}. \end{array} $$
(6)

Then, by choosing δJs, AKRON yields the optimal l0-solution.

Proof

Let Θ and \(\overline {\Theta }\) be the index sets of zero and non-zero entries in the l0-solution x, respectively. We have |Θ|=k. We need to show that at least s indices in Θ are where (6) holds. To prove this fact, assume the opposite, i.e., at least (ks+1) indices in Θ are such that

$$|(\boldsymbol{x}_{1})_{j}| > \frac{\epsilon}{\sqrt{k-s+1}}. $$

However, in this case, we have

$$\|\boldsymbol{x}_{1} - \boldsymbol{x}^{*}\|_{2} > \sqrt{k-s+1}\frac{\epsilon}{\sqrt{k-s+1}} = \epsilon, $$

which contradicts the assumption ∥x1x2ε. □

The following proposition derives an upper bound for \(\delta \in \mathbb {N}_{+}\) when the non-zero elements of the optimal l0-solution are bounded from below. We first need the following Lemma.

Lemma 1

Consider the system in (4) with the optimal l0-solution x and approximate l1-solution x1. Denote by Θ and \(\overline {\Theta }\) the index sets of zero and non-zero entries in x, respectively. Assume that ∥x1x2ε. Let R be the number of indices \(j \in \overline {\Theta }\) such that |(x1)j|≤ε. If δ=R, then AKRON yields the optimal solution.

Proof

To obtain the sparsest l0-solution, it is sufficient to choose s zeros in “correct places”, i.e., with indices in Θ. Recall that AKRON sets s out of the smallest-magnitude (s+δ) entries in x1 to zero. Therefore, AKRON will yield the optimal solution if out of these (s+δ) smallest-magnitude entries, there are at least s entries from Θ. But all entries from Θ have |(x1)j|≤ε, for otherwise the assumption ∥x1x2ε would be violated. This means that only those entries from \(\overline {\Theta }\), which also satisfy |(x1)j|≤ε, could be chosen by AKRON, and there are only R of them. Hence δ=R will yield the optimal l0-solution.

Lemma 1 provides a sufficient condition on δ for optimality of AKRON, namely, if δ=R, then we are guaranteed the optimal l0-solution. Since this condition is not necessary, we could reach optimality with δR. □

Proposition 2

Consider the system in (4) with the optimal l0-solution x and approximate l1-solution x1. Let \(\overline {\Theta }\) be the set of indices of non-zero elements in x. Assume that the non-zero entries of x are bounded from below, i.e.,

$$\left|\left(\boldsymbol{x}^{*}\right)_{j}\right| \geq \eta,~ \text{for all}~ j \in \overline{\Theta}~ \text{and for some}~ \eta > 0. $$

Assume further that ∥x1x2ε. If ε<η, then \(\delta \leq \left (\frac {\epsilon }{\eta - \epsilon }\right)^{2}\) yields the optimal l0-solution. In particular, if \(\epsilon < \frac {\eta }{2}\) then δ=0 suffices.

Proof

Since |(x)j|≥η for \(j \in \overline {\Theta }\), we have

$$\left\|\boldsymbol{x}_{1}-\boldsymbol{x}^{*} \right\|_{2} \geq (\eta-\epsilon)\sqrt{R}\geq(\eta-\epsilon)\sqrt{\delta}, $$

where we used the fact that δR from Lemma 1. Thus, \(\epsilon \geq \left \| \boldsymbol {x}_{1}-\boldsymbol {x}^{*} \right \|_{2} \geq (\eta -\epsilon)\sqrt {\delta }\), which completes the proof. □

Although Propositions 1 and 2 derive theoretical bounds for the choice of the neighborhood radius δ to recover the optimal sparsest solution, we found in our experiments below that relatively small values of δ are sufficient to achieve a balance between desired accuracy and computational complexity.

Results

In this section, we present an empirical analysis of the AKRON-KF and its smoother, including comparisons to other approaches proposed for detecting the relationships between different genes in a molecular network. The experiments include a number of carefully designed synthetic data sets, as well as a real-world data set, namely the fruit fly.

Overview of experimental protocols

Our experiments are conducted on real-world and synthetic data. The advantage of the synthetic data are that the ground truth networks are known; therefore, we can calculate different statistics about the reconstruction error of the network. Unfortunately, we do not have a clear view of the “ground truth” for real-world data. Therefore, we use findings from the life sciences that have studied these networks and were able to infer gene-to-gene relationships that are well established [18].

Our experiments make use of the following algorithms for a sparse reconstruction of a time-varying network:

  • l1-KF(S): This algorithm is the output of the Kalman filter with the l1 projection applied to the state vector. The (S) indicates whether the smoother was implemented.

  • AKRON-KF(S): This is the proposed approach using the output of l1-KF(S) to seed AKRON. It is also implemented with and without the smoother.

Both of the above algorithms can reconstruct a network that represents the interactions between genes. We compute the true positive (TP), true negative (TN), false positive (FP) and false negative (FN) rates. These rates are summarized through accuracy (acc), sensitivity (sen), specificity (spe), and Matthew’s correlation coefficient (mcc), which are defined below.

$$\begin{array}{*{20}l} \text{acc} &= \frac{TP + TN}{TP+TN+FP+FN}, \\ \text{sen} &= \frac{TP}{TP+FN}, \\ \text{spe} &= \frac{TN}{TN+FP}, \\ \text{mcc} &= \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP+FP)\cdot(TP+FN)\cdot(TN+FP)\cdot(TN+FN)}}. \end{array} $$

Matthew’s correlation coefficient provides a more balanced statistic for examining the overall trade-offs between the different rates (i.e., TP, TN, FP, and FN).

Results on synthetic data

Synthetic time-varying networks are simulated to evaluate the efficacy of the proposed AKRON-KF(S) on data that we have complete control over. All results in this section are presented as the average over 25 monte carlo simulations. Averaging is performed because there could be a large degree of variation in the time-varying networks that are randomly generated.

First, we evaluate the impact of α in (3) on the estimation of ak|k and the reconstruction errors of l1-KF and AKRON-KF. The experiment is configured as follows: a 25-gene network evolves over four-time points by simulating a random walk; all networks being 85% sparse. At each time step, there are nine observations that are available. Figure 2 shows the effect of α on the reconstruction errors of AKRON-KF and l1-KF. Clearly, AKRON-KF is the better performer across the statistics that we collected. Thus, AKRON significantly improves the previous implementation of sparse Kalman filters for time-varying networks. Furthermore, AKRON-KF detects the location of the edges in the simulated networks (see Fig. 2b). Simply using the solution from l1-KF for a small α is not enough to find the location of the zeros. In fact, α needs to be close to one to achieve a high accuracy at edge detection (i.e., (3) will place a large weight on the l1 penalty and a small weigh on the error). Given these results, we choose α=0.2 for the remainder of the experiments since this value provides a reasonable trade-off between the different statistics that were assessed. Figure 2e shows that AKRON-KF is also superior to Lasso-KF as assessed by Matthew’s correlation coefficient, which provides a balanced measure.

Fig. 2
figure 2

Comparison of AKRON-KF and l1-KF on a synthetic data with 25 genes over four time points and with 85% sparsity. All results are averaged over 25 monte carlo simulations. a Error. b Accuracy. c Sensitivity. d Specificity. e MCC

Second, we expand the synthetic experiments to evaluate the impact of δ in AKRON and the dimensionality of the kernel of X, and we also evaluate the impact that the smoothing has on the reconstruction of the networks. We simulated three different kernel dimensions and values for δ. Table 1 shows the outcome of these experiments. The entries in the table are presented as A/B, where A and B are the results from AKRON-KF and l1-KF, respectively. The table is also divided in half to separate the results for the Kalman filter and Kalman smoother. Similar to the first experiment, we observe that AKRON-KF typically outperforms l1-KF in nearly every statistic and the results can be quite significant. Furthermore, systems with a large kernel dimension (i.e., large p small n) benefit significantly from a larger value of δ. For example, consider a network that is 9×50×4. δ=1 provides a little reduction in error; however, increasing δ to 3, significantly reduces the error and increases the other statistics. Finally, we observe that smoothing improves the error of the system; however, we should note that this improvement comes at a computational cost.

Table 1 Results on simulated networks

Results on Flybase

The application of interest is the inference of the time-varying wing-muscle genomic network of the Drosophila Melanogaster (fruit fly). The Drosophila’s microarray dataset originally consists of 4028 genes taken over 66 different time points [18]. The data includes 4 stages of the Drosophila’s life: embryonic (samples 1 through 30), larval (samples 31 through 40), pupal (samples 41 through 58), and adulthood (samples 59 through 66). Flybase hosts a list of undirected gene interactions [19]. We set α=0.2 based on the experiments in the previous section for l1-KF and AKRON-KF.

In this application, we considered a list of 11 genes that are responsible for the wing muscle development, which has been considered by many researchers before [79, 20]. The embryonic, pupal, and larval stages are undersampled to 9 observations in each stage that were used in the reconstruction of the 11-gene network in each developmental epoch. All 8 time points were used in the adulthood period. To summarize, the reconstruction of the connectivity matrix uses 9 samples in the embryonic, pupal, and larval developmental stages and 8 samples in the adulthood developmental stage. The 11 gene network was reconstructed throughout each of the four developmental stages using AKRON-KF and AKRON-KFS.

The networks reconstruction using the l1-KF and AKRON-KF are shown in Figs. 3a-d and 4a-d, respectively. For clarity, the displayed networks do not show the strength of the interaction, only that there is an interaction detected by one of the algorithms. The AKRON-KF tracker results in clearly sparser networks than the l1-KF. AKRON-KF was able to find all the connections reported in Flybase: (Actn,prm) appears in the embryonic, larval, and pupal stages, (Actn,up) appears in all four stages, (up,mhc) appears in the embryonic, larval, and pupal stages, (up,sls) appears in the embryonic, larval, and pupal stages and (sls,mhc) appears in the embryonic and larval stages. The other two connections appear through one medium gene (Actn,sls) appears in the embryonic, pupal and adulthood phases through one additional gene and (twin, eve) appears through one or more additional genes only in the embryonic and larval phases.

Fig. 3
figure 3

Reconstructed networks for the l1-KF across the four time stages. Edges in the network represent either the suppression of a gene or excitation of a gene, or one gene excites while the other suppresses. a Embryonic. b Larval. c Pupal. d Adult

Fig. 4
figure 4

Reconstructed networks for the AKRON-KF across the four time stages. Edges in the network represent either the suppression of a gene or excitation of a gene, or one gene excites while the other suppresses. a Embryonic. b Larval. c Pupal. d Adult

The networks reconstruction using the l1-KF and AKRON-KF with smoothers are shown in Figs. 5a-d and 6a-d, respectively. The smoothing provides very similar network topologies to the ones without the smoother; however, we did observe that the networks were sparser in the larval stage (see Figs. 4b and 6b). Note that while Fig. 7 is considered the ground truth, the could exist relationships that have not yet been discovered. These statistics are shown in Table 2, which again shows the benefit of using the AKRON-KF over the l1-KF.

Fig. 5
figure 5

Reconstructed networks for the l1-KF with a smoother across the four time stages. Edges in the network represent either the suppression of a gene or excitation of a gene, or one gene excites while the other suppresses. a Embryonic. b Larval. c Pupal. d Adult

Fig. 6
figure 6

Reconstructed networks for the AKRON-KF with a smoother across the four time stages. Edges in the network represent either the suppression of a gene or excitation of a gene, or one gene excites while the other suppresses. a Embryonic. b Larval. c Pupal. d Adult

Fig. 7
figure 7

The known undirected gene interactions in the Drosophila’s 11-gene wing muscle network as determined from Flybase [18]

Table 2 Results of the AKRON-KF and l1-KF on the real-world Drosophila Melanogaster data set

Table 3 lists all previous algorithms that were applied to this genetic network. Only the AKRON-KF, l1-KF [13], SMURC [4] and Dynamic Bayesian networks [7] considered time-varying networks; and, hence, were able to distinguish the different phases in the network. The other algorithms (minimum description length [20], random graph model [9], and nonparametric Bayesian regression [8]) assumed a stationary network, and hence it is not clear at which stage the detected connections develop. The AKRON-KF along with the l1-KF are the only algorithms able to recover all known interactions and specify the developmental stage where these interactions occur. Although the l1-KF also finds all reported interactions, the networks are denser (less sparse) than the AKRON-KF. With regard to the MDL results in Table 2, we have not reported the interations. The MDL authors in [20] inferred a single network, using all 66 time points, that characterizes the entire Drosophila’s life cycle. In particular, the MDL approach is stationary and does not differentiate between the phases or time-varying epochs of the data. We wanted to report the MDL findings as they are in the literature. Notice that AKRON used less points and found more correct interactions.

Table 3 Detection of the known gene interactions in Flybase (E: embryonic, L: larval, P: pupal and A: adulthood)

Conclusion

In this work, we addressed the problem of inferring time-varying molecular networks as a tracking problem that can be solved using the Kalman filter. The major difficulty, however, is that there is not a sufficient number of observations at each time point, which makes the state-space model unobservable and the tracking senseless. Fortunately, molecular networks are known to be sparse because the dynamics of every gene are governed by only a small number of genes. By incorporating the sparsity condition, we show that the tracking problem becomes feasible.

We presented the AKRON Kalman filter, which builds on our previous work on the Lasso-Kalman filter (l1-KF). Our proposed approach leverages the AKRON algorithm to find a sparser solution that is more representative of the ground truth. The proposed tracker/smoother first computes the output of l1-KF; then explores growing neighborhoods of the l1-projection to look for sparser solutions, eventually reaching the optimal sparsest estimate. The size of these neighborhoods is a tunable parameter that depends on the computational power available. AKRON-KF was benchmarked on synthetic and real-world data against l1-KF. The results demonstrate that the proposed approach is better at recovering sparse time-varying networks than l1-KF. Not only was the reconstruction error of the proposed approach lower than l1-KF, but it was also better at detecting whether an edge exists in a network. AKRON-KF tracker was applied to infer the wing muscle gene regulatory network of the Drosophila Melanogaster during four developmental phases of its life cycle, and successfully identified all seven known interactions reported in Flybase. We should also note that our proposed approach will work for time-series networks that have more than four times steps and sparsity levels

Our future work includes applying the AKRON-KF to other types of data, particularly data related to different types of cancers to create a predictive network biomarker for clinical outcome. These ideas have applicability in translational clinical cancer research, basic cancer research, and in network-based drug discovery.