# Identifying interactions in the time and frequency domains in local and global networks - A Granger Causality Approach

- 6k Downloads
- 9 Citations

## Abstract

### Background

Reverse-engineering approaches such as Bayesian network inference, ordinary differential equations (ODEs) and information theory are widely applied to deriving causal relationships among different elements such as genes, proteins, metabolites, neurons, brain areas and so on, based upon multi-dimensional spatial and temporal data. There are several well-established reverse-engineering approaches to explore causal relationships in a dynamic network, such as ordinary differential equations (ODE), Bayesian networks, information theory and Granger Causality.

### Results

Here we focused on Granger causality both in the time and frequency domain and in local and global networks, and applied our approach to experimental data (genes and proteins). For a small gene network, Granger causality outperformed all the other three approaches mentioned above. A global protein network of 812 proteins was reconstructed, using a novel approach. The obtained results fitted well with known experimental findings and predicted many experimentally testable results. In addition to interactions in the time domain, interactions in the frequency domain were also recovered.

### Conclusions

The results on the proteomic data and gene data confirm that Granger causality is a simple and accurate approach to recover the network structure. Our approach is general and can be easily applied to other types of temporal data.

## Keywords

Bayesian Network Granger Causality ARMA Model Ordinary Differential Equation Causal Influence## Background

One of the most fundamental issues in computational biology is to reliably and accurately uncover the network structure of elements (genes, proteins, metabolites, neurons and brain areas etc.), based upon high throughput data [1, 2]. There are several well-established reverse-engineering approaches to explore causal relationships in a dynamic network, such as ordinary differential equations (ODE), Bayesian networks, information theory and Granger Causality.

The notion of Granger causality, which was first introduced by Wiener and Granger [3, 4, 5], proposed that there is a causal influence from one time series to another if the prediction of one time series is improved with the knowledge of the second one. The prediction is made in terms of an auto-regressive model. In addition, Granger causality has the advantage of having a corresponding frequency domain decomposition so that one can clearly find at which frequencies two elements interact with each other. Granger's conception of causality has been widely and successfully applied in the econometrics literature and recently in the biological literature [6, 7, 8, 9, 10, 11].

Considering the four different approaches to the same problem, a natural question is to investigate which should be preferred. In a previous paper [12], we presented a comparison study of Granger causality and dynamic Bayesian network inference approaches. The result showed that Granger causality outperformed the dynamic Bayesian network inference when the time series were long enough because the Granger causality was then able to detect weak interactions. In a recent *Cell* paper [13, 14], the authors carried out a systematic comparison between the ODE, Bayesian and information theoretic approaches for a small synthesized gene network in the yeast (Saccharomyces cerevisiae). The authors concluded that the ODE was the best approaches amongst those three. We have applied our conventional Granger causality approach on the same recorded time-series and found that the results derived by it were better than all the other three approaches' in the original paper. A small network of seven previously investigated proteins [15] was also re-constructed. Interestingly, the two important proteins DDX5 and RFC1 found in experiments were at the top of the re-constructed network. Frequency domain results were analyzed and they indicated that DDX5 and BAG2 interacted at a frequency of one cycle per three hours.

Due to the complexity of biological processes, in order to capture the dynamics of complex systems and investigate the functions of genes and neurons in detail, it is much better to treat the network as a whole instead of analyzing a very limited portion of it. Until now, most of the analysis tools currently used for the whole network are based on clustering algorithms. These algorithms attempt to locate groups of genes that have similar expression patterns over a set of experiments. Such analysis has proven to be useful in discovering genes that are co-regulated and/or have similar function. A more ambitious goal for analysis is revealing the structure of the transcriptional regulation process, for example, for a given transcription factor, could we find all its upstream and downstream transcription factors? This is clearly a challenging and fascinating problem.

Most popular approaches, such as Granger causality, are powerful in cases where the length of the time series is much larger than the number of variables, which is exactly the reverse of the situation commonly found in microarray experiments, for which relatively short time series are measured over tens of thousands of genes or proteins. The real difficulty comes from the fact that when the dimension is larger than the length of time series, the design matrix of predictors is rectangular, having more columns than rows; in such case, the model is under-determined and cannot be uniquely fitted. Bayesian network is a graph-based model of joint multivariate probability distributions that captures properties of conditional independence between variables, but as it requires a large number of parameters and assumptions upon the variable distribution, it also quickly becomes intractable for large networks. Keeping these limitations in mind, it is still an important task to developing methodologies that are both statistically sound and computationally tractable to make a full use of the wealth of data now at our disposal. In order to tackle this problem, we propose a new framework: Global Granger Causality (GGC) This framework builds on the use of partial Granger causality which was illustrated in our previous paper [16]. We first construct an initial sparse network by considering all possible links by computing bivariate pair-wise Granger causality. Once we identify such a network structure, there is uncertainty about the true causal structure; we need to check whether the links appearing in pairwise causality are direct or indirect. We do so by computing GGC step by step. If a link is found to be an indirect relationship in the sense of GGC, we delete such a link from the initial network. Theoretically, iterating the procedure will remove all indirect links and only direct connections will remain. The advantage of such an approach is obvious. By explicitly taking more sources into account, it provides a less biased structure of the network due to latent variables than in a small network as described above. It also provides information on the ancestors and descendents of key elements such as DDX and RFC1 in our network. The results can then guide experimentalists to investigate the properties of a small subset of specific proteins.

The rest of the article is divided in two sections. First, in the method sections, we introduce Granger causality in details, as well as its formulation in the frequency domain. We also describe global Granger causality, the new procedure for applying Granger causality to large networks. Next, in the result section, we apply our method on small (local) and large (global) networks. In both cases, simulations and actual biological data (gene and protein time-series) are used and results discussed. And we also provide a theoretical proof of its reliability

## Method

A measurement of Causal influence for time series was first proposed by Wiener-Granger. We define the causal influence of one time series on another by quantifying the improvement made on the prediction of a time series when we incorporate the knowledge of a second one. Granger implemented this notion in the context of linear vector auto-regression (VAR) model of stochastic processes. In the AR model, the variance of the prediction error is used to test the prediction improvement. For instance, consider two time series; if the variance of the autoregressive prediction error of the first time series at the present time is reduced by the inclusion of past measurements from the second time series, then one can conclude that the second time series have a causal influence on the first one. Geweke [17, 18] decomposed the VAR process into the frequency domain and converted the causality measurement into a spectral representation which made the interpretation more appealing.

_{ t }, an ARIMA(p,d,q) model is given by:

Where L is the lag operator, the error term *ε*_{ t }has normal distribution with 0 mean.

### Conditional Granger Causality in the time domain

**X**

_{ t }and

**Z**

_{ t }and their k

^{th}and m

^{th}order differences Δ

^{ k }

**X**

_{ t }and Δ

^{ m }

**Z**

_{ t }(without loss of generality, we assume that m = k from now on), the joint autoregressive representation for Δ

^{ k }

**X**

_{ t }and Δ

^{ k }

**Z**

_{ t }by using the knowledge of their past measurement can be expressed as

^{ k }

**X**

_{ t }, Δ

^{ k }

**Y**

_{ t }and Δ

^{ k }

**Z**

_{ t }can be represented as

**ε**

_{ it },

*i*= 1,2,⋯,5 are the prediction errors, which are uncorrelated over time. If we rewrite equation (2) and equation (4) in terms of X, Y and Z themselves, we see that whether a coefficient vanishes or not is almost unchanged. Hence it is safe to say that the conditional Granger causality form

**Y**to

**X**conditional on

**Z**can be defined as (see [19] for the classical definition)

When the causal influence from **Y** to **X** is entirely mediated by **Z**, the coefficient *b*_{2i}is uniformly zero, and the two auto-regressive models for two or three time series will be exactly same, thus we can get var(ε_{1t}) = var(ε_{3t}). We then have *F*_{Y→X|Z} = 0, which means **Y** can not further improve the prediction of **X** including past measurements of **Y** conditional on **Z**. In other words, **Y** doesn't have an influence on **X**. For var(**ε**_{1t}) > var(**ε**_{3t}), *F*_{Y→X|Z} > 0 and therefore there is a direct influence from **Y** to **X**, conditional on the past measurements of **Z**.

### Conditional Granger Causality in the frequency domain

*I*is identity matrix. The normalized equations are represented as:

where Δ(λ, k) is the Fourier transform of the difference operator Δ^{k}. Therefore, for ARIMA and ARMA model in the frequency domain, their causality is identical. This is in agreement with our conclusions in the time domain causality and in general the Kolmogorov identity holds true, that is: integrating the frequency-domain Granger causality over all frequencies yields the time-domain Granger causality.

### Global Granger Causality

Partial Granger causality (PGC) provides an accurate description of the internal dynamics of the system when the number of nodes is much smaller than the length of recorded time series. However, when the number of nodes increases, especially when they are larger than the length of time series, a 'curse of dimension' immediately arises, it is a situation for which usual methods break down.

_{1},...X

_{n}} has only a single pathway to target T, and each nodes from {Y

_{1},...,Y

_{n}} has two distinct pathways to target T. From Fig. 1A, we can find the parents of target T are T

_{1}, T

_{2}, T

_{3}.

The detailed algorithm is illustrated as follows:

First, apply the bivariate pair-wise Granger causality to find all of the ancestors of the target T. This set is denoted A_{0}(T). In theory, we can detect all possible Granger-causal links in this procedure, both direct and indirect. In Fig. 1A, A_{0}(T) = {T_{1}, T_{2}, T_{3}, X_{1},...X_{n}, Y_{1},...,Y_{n}}.

- (I)
For each node in A

_{0}(T), compute the partial Granger causalities conditioned on all other single nodes in the A_{0}(T). If the relationship vanishes, delete this node from the initial network and obtain the 1-stage network. After this procedure, all indirect links conditioned on one single node have been removed. In Fig. 1A, {X_{1},...X_{n}} are deleted from A_{0}(T), denoting the remaining set as A_{1}(T) ={T_{1}, T_{2}, T_{3}, Y_{1},...,Y_{n}}. This is proved in**Lemma 1**of Discussion section. - (II)
For each node in A

_{1}(T), compute the partial Granger causalities conditioned on all possible pairs in A_{1}(T). We obtain the 2-stage network in where all indirect links conditioned on a pair of nodes have been removed. In Fig. 1B, {Y_{1},...,Y_{n}} is further deleted from A_{1}(T), denoting the remaining set as A_{2}(T) ={T_{1}, T_{2}, T_{3}}. - (III)
Continue the procedure above until we can not remove any nodes from A

_{k}(T).

The rationale is as follows: if the usual Granger causality from Y → X is large but significantly decreases when conditioned on a third signal Z (F_{Y→X|Z} ), then the connection Y→X is only indirect and should be discarded. We use this principle to find the direct ancestors (signals acting on a target X with no intermediate) of each nodes. At step 0, we search for all signals Y such that F_{Y→X} is large. We call A_{0} this collection of candidate ancestors. At step 1, we filter this set further with keeping the signals Y∈A_{0} such that F_{Y→X|Z} is still large for all Z∈A_{0}. We call A_{1} this new set and carry on the procedure by conditioning on groups of 2, then 3 etc. signals from the previous set until such an operation is not possible (the size of A_{i} decreases or stabilizes at each iteration). The result is a list of direct ancestors for each node, which we aggregate to produce the global network.

## Result and Discussions

### Local Network: Synthesized Data

To illustrate the conditional Granger causality approach in both time and frequency domains, a simple multivariate model with fixed coefficients which has been discussed in previous ([9, 16]) papers is tested first.

*n*is the time, and [

*ε*

_{1},

*ε*

_{2},

*ε*

_{3},

*ε*

_{4},

*ε*

_{5}] are independent Gaussian white noise processes with zero means and unit variances. From the equations, we see that

*X*

_{1}(

*n*) is a cause of

*X*

_{2}(

*n*),

*X*

_{3}(

*n*) and

*X*

_{4}(

*n*), and

*X*

_{4}(

*n*) and

*X*

_{5}(

*n*) share a feedback loop with each other. Fig. 2A shows an example of the time trace of 5 time series. We obtain 95% confidence intervals by bootstrapping: we simulated the fitted vector autoregressive model to generate a data set of 1000 realizations of 1000 time points with a sampling rate 200 Hz and used their statistics for estimating the confidence intervals [20]. (Fig. 2D). An ARMA (Auto-Regressive Moving Average) model was used to fit the data, and samples were drawn from this fitted ARMA model. To depict all causal relationships in a single figure, we enumerated them in a table as shown in Fig. 2C. According to the confidence intervals, one can derive the network's structure as shown in Fig. 2B. From the result, the Granger causality approach correctly recovered the pattern of the connectivity in this toy model. Furthermore, we applied the conditional Granger causality approach on frequency domain as shown in Fig. 2E. The causal relationships from

*X*

_{1}to

*X*

_{2},

*X*

_{3}and

*X*

_{4}show strong interactions at around 25 Hz.

### Local Network: A yeast synthetic network of five genes

A recent *Cell* paper by Irene Cantone *et al*. [14] assessed systems biology approaches for reverse-engineering and modeling (see also [13]). To recover a regulatory interaction network, the authors used three well-established reverse-engineering approaches: ordinary differential equations (ODEs), Bayesian networks and information theory. A gene synthetic network in the yeast consisting of 5 genes with 8 known interactions was investigated. From the results, the authors found ODEs and Bayesian networks could correctly infer most regulatory interactions from the experimental data with best values of PPV = 0.75 [Positive Predictive Value] and Se = 0.5 [Sensitivity]. In order to validate our approach, we applied conditional Granger causality [12] to the same experimental data. From our results, we found that the conditional Granger causality approach could also correctly infer most regulatory interactions and outperformed all the other three approaches reported in [14] with the best values of PPV = 0.83 and Se = 0.83. Hence the Granger causality approach, although simple, can be successfully applied to recover the network structure from temporal data and it could play a significant role in systems biology [21].

Initially, we applied conditional Granger causality to the switch-off time series which contained more time points than switch-on time series. The switch-off experiment data consisted of 4 replicates. Since a shift from galactose-raffinose- to glucose-containing medium caused a large initial decay, we simply removed the first two time points for 2 replicates. The time series were not stationary. The gene expression level decreased with time because of the inhibition effect of galactose-raffinose-containing medium. In order to apply conditional Granger causality, we were required to use ARIMA (Auto-Regressive Integrated Moving Average) rather than ARMA model to fit the data. The level of difference for ARIMA was chosen to be 1.

We then grouped Gal4 and Gal80 as a single node as they form a complex [14], and then applied conditional Granger causality approach. Fig. 3D. shows 95% confidence intervals for the causality. From this figure, we can then recover a simplified causal network as shown in Fig. 3B. It shows the 6 most significant edges. There are 4 true-positive edges and 1 false-positive edge. By comparing our PPV (0.83) and Se (0.83) values with the original paper (PPV = 0.75, Se = 0.5), it is further confirmed that the performance of our algorithm is much better. The reason why the Granger causality outperforms the other approaches is clear from the detailed analysis in [12] where we have reported that the Granger causality is better than the Bayesian approach provided the data set is long enough. The Bayesian approach is similar to the ODE approach as claimed in [12]. Hence we could reasonably expect that the Granger approach is the best among the four approaches.

### Local Network: A Local Circuit of Seven Proteins

*p*< 10

^{-13}). It has been further proved that it plays a functional role in the response to the drug: a doubling in the death rate was observed during the first 40 hours when DDX5 was knocked-down [15]. Protein RFC1 also showed a significant correlation with cell fate (with a p-value

*p*< 10

^{-6}). Our derived network is in good agreement with these two biological characteristics. Protein DDX5, which is the most significantly correlated with the cell fate, is on the top level of the network. Protein RFC1 is in a lower level comparing to DDX5, since the causal relation is from DDX5 to RFC1. Therefore, the results on the proteomic data and gene data confirm that the Granger causality is a simple and accurate algorithm to recover the network structure.

### Global Network: Synthesized Data

To measure the performance of the Global Granger Causality algorithm introduced in this paper, we first consider some toy models. The first toy model is a high-dimensional time series. We also compare the result of GGC with that of PGC.

#### Example 1

*ω*

_{1},⋯,

*ω*

_{12}are zero-mean uncorrelated process with identical variance. We generated time series of 80 points. The true network structure is depicted in Fig. 6A, there are 21 actual links. We first applied PGC to the data directly and used a bootstrap method to construct the network structure. More specifically, we simulated the fitted VAR model to generate a dataset of 1000 realizations of 80 time point, and used 3

*σ*as the confidence interval. If the lower limit of the confidence interval was greater than zero, we considered there was a relationship between two units. The network structure is depicted in Fig. 6B. The network structure we obtained from PGC was misleading. The reason is that since the order of the model is 4, the number of total parameters we should estimate in this model is 12 × 12 × 4, the estimator is unreliable with such little data.

Secondly, we used GGC to investigate the network structure. Fig. 6C shows the results we obtained after applying pairwise Granger causality. There are 33 links in total. We computed partial Granger causality conditioned on any intermediate node to identify whether the links appearing in Fig. 6C are direct or indirect. If the lower limit of the confidence interval of partial Granger causality is less than zero, then the link is regarded to be indirect and is deleted from Fig. 6C (dashed arrows). Fig. 6D is the final structure we get from GGC; it is consistent with the actual structure Fig. 6A.

#### Example 2: Random network

Next we present a validation of our method with a series of experiments on random networks for which the true structure is known. We built an Erdös-Rényi random graph with N = 200 nodes and M = N × log(N) = 1060 edges. From the network structure, we generated N time series with an auto-regressive model of order 1 whose transition matrix is the transpose of the adjacency matrix of the network, with its largest eigenvalue normalized to 0.99 to obtain a stable system. Each time series was 200 time-points long and normal noise of unit variance was added throughout. The algorithm was applied to each single node to get a list of their guessed ancestors. We then compared the true and computed lists of ancestors. One should expect that the connection between two nodes to be difficult to uncover if the corresponding coefficient in the linear model is small. To factor this out, we first considered the case where the non-zero coefficients of the transition matrix were all equal and maximized. We then applied the method on the case where the non-zero coefficients were randomly distributed.

#### Example 3: Constant coefficients

The data were generated by an auto-regressive model with transition matrix A. A is a scaled version of the transpose of the true adjacency matrix. The scaling factor was chosen so as to be maximal while leading to a largest eigenvalue for A of less than 1 (or the model degenerates). In this particular case, it was found to be 0.1736. The procedure has one parameter τ, the threshold at which a Granger-causality is deemed significant. By varying this parameter from 0 to 0.1, we obtained different large networks which we compared to the truth. The accuracy of each network was summarized by its true positive and false positive rates. Fig. 6E shows the resulting receiver operating characteristic (ROC) curve that is the graph obtained by plotting the false positive rate against the true positive rate. The performance of the method was extremely good, with an area under the curve close to 1. Crucially for biological applications, the false positive rate is always very small.

#### Example 4: Random coefficients

In this setup, the non-zero coefficients of the transition matrix were randomly distributed (normally distributed with mean 3 and multiplied by -1 with probability 1/2). The matrix was then scaled in the same manner as before. Fig. 6F shows the performance of the method on this harder problem. The method is not as accurate as before, with a maximum true positive rate just over 0.5. However, the false positive rate is still very low: the method doesn't guess as many ancestors as before but its guesses are rarely wrong. The fact that more connections are now missed out is not surprising: the non-zeros coefficients are randomly distributed and can be very small. Fig. 6G shows how the true positive rate varies with the magnitude of the coefficients; the true positive rate goes to zero with small coefficients.

### Global Network: A Global Circuit of 812 Proteins

Setting the same threshold as the one used to obtain the small circuit, a large, sparse network is obtained: 768 nodes remain (discarding those with no connections) and 2972 edges were found, which represented only 0.5% of all the possible edges. The complete structure can be found in the **Additional File**1. Fig. 7A shows the distributions of in-, out- and total degree of the nodes. All three distributions are exponential, precluding the possibility of a scale-free network. Each node has an average in-degree and out-degree of 3.8, indicating a well-connected network. This is confirmed by the characteristic path length (average of the shortest path between all pairs of nodes). Considered undirected, the graph has a characteristic path length of 3.8, in line with those of previously reported biological networks (see [23] and references within), including protein-protein interaction networks, although it should be noted that the present study is concerned with the dynamics of the proteins (as in [24] for example) and not their physical interactions (in which case the network is undirected by construction). The directed graph also has a small characteristic path length of 5.7 nodes and a small diameter (largest shortest path) of 12 nodes. Such connectedness indicates that the network is a small world [25, 26]. However, it is not particularly modular: while its mean clustering coefficient is an order of magnitude (17 times) higher than one of a random network, the clustering coefficient is almost constant with respect to the node's degree. In other words, the same level of clustering is found everywhere regardless of the node's degree.

The previous small network in Fig. 4 was obtained by using the conditional granger causality. Conditional Granger causality can be misled by common influence: if both nodes are subjected to an unknown common source, it can have an effect on their connections. Partial Granger causality - another extension of Granger causality [16, 27, 28] - can address this issue by considering an unseen external input in the linear model and working out its effect on the connection. For example, the partial Granger causality between DDX5 and RFC1 is very small, even though the conditional Granger causality between them is high (Fig. 4) and there exists a short path (1 intermediate) from DDX5 to RFC1 in the large network. This suggests the connection is affected by a common unseen source.

In order to identify which proteins have an influence on the connections between the 7 proteins of interest (AKAP8L, PSMB6, BAG2, DDX5, DKFZP434, SPCS1, and RFC1), we first extracted them as well as the proteins belonging to the shortest paths between them, resulting in a subset of 118 proteins. We then applied a filtering process on each of the 12 connections uncovered in the previous section. The rationale of the algorithm is that if removing the (explicit) influence of a protein makes the connection between two nodes change, then this protein should be kept as a potential influence on the connections - if z is independent of x and y, then z does not affect the Granger causality and F_{x→y|z} = F_{x→y}. After filtering for those that have an influence, we then considered their pairs and build a new subset, then triplets etc.. The end-result is a set of proteins which have a substantial influence on the connection.

Fig. 7D shows the small network of 7 proteins with the now-identified external influences. Note that those proteins do not necessarily belong to the path from one node to the other, but rather they have some substantial influence on the connection as a whole, for example on some members of the path.

### How reliable is Global Granger Causality?

In theory, we can recover all possible links from the pairwise Granger causality procedure and have to Granger-condition on all combinations of the nodes in the system to remove an indirect connection. However, it is an NP-hard problem and we will stop at a stage k, i.e., we only need to Granger-condition on the combinations of up to k nodes. Therefore, the analysis on how to choose k and the probability of correctly uncovering the true relationship of the whole network when we stop at stage k is of vital importance. In this section, we will provide some simulation and theoretic results on these questions.

Consider a network with N nodes {X_{1},⋯,X_{N}} with a connection probability p. There are N × (N - 1) × p direct links on average in the whole system. We intend to estimate how many indirect connections are left when we stop at stage k. Here we focus on a pair X to T, where X, T are in {X_{1},⋯,X_{N}}. If there exists only one single path from X to T, this link can be discarded by Granger-conditioning on a single intermediate node in the path. If there are more than one paths from X to T, in theory, this link should be discarded by Granger-conditioning on all the other nodes.

**Definition 1 (bottleneck)***. Assume that there are m distinctive directed paths from S* ∈ {X_{1},⋯,X_{ n }} *to T and p(S, T) be the set of all distinctive directed paths from S to T. A set of nodes* {Z_{1},⋯,Z_{m}} *is called a section from S to T if there is no directed path from S to T in the graph* {X_{1},⋯,X_{N}}-{Z_{1},⋯,Z_{m}}. *A section which minimizes its total number of elements of the section is called a bottleneck*.

For example, in Fig. 1C both {B_{1}, B_{2}} and {B_{3}} are sections from S to T, but {B_{3}} is the bottleneck..

**Lemma 1***. Assume that the set*{B

_{1},⋯,B

_{m}}

*is the bottleneck from S to T, we have*

**Proof**. We only check two cases here. The first case is that there is a single serial connection from S to T. For example, we have

*S*→

*B*

_{1}→

*B*

_{2}→⋯

*B*

_{ n }→

*T*where every single node {B

_{i}} is a bottleneck of the path. If we condition on one of the single node B

_{i}in the path, we need to show

*ξ*is the noise term. From the assumption of the path structure, we conclude

where E, F, G are polynomials and *ε* is the noise (could be different). From the equation above, we see that for any node B_{i,} we have Open image in new window . Intuitively, in a serial path *S* → *B*_{1} → *B*_{2} →⋯*B*_{ n }→ *T*, the information cannot be transmitted from S to T if B_{i} is removed. In conclusion, for a single path, the Granger causality is zero whenever we condition on one of its nodes in the path. It is not necessary to condition on the whole path to remove the causality.

_{1}and B

_{2}converge to a common bottleneck B

_{3}. It is easy to see that information can not be transmitted from S to T if B

_{3}is removed, then we can easily see that

Combining the above two cases completes the proof of the lemma.

Lemma 1 tells us that if there are m distinctive paths from S to T, i.e., the number of the bottleneck is m, then the causality between S and T will vanish when we take into account the partial Granger causality on {X_{1}, ...,X_{m}}. There may exist other common drives to the observed nodes S and T such as Fig. 1D. We assume the partial Granger causality can delete the influence of such drive and exclude such case in our analysis.

The exact formula of the number of bottlenecks seems to be fairly complicated but we can have a first look at the empirical distribution of it. For a variety of connection probability p, we generate 500 random networks when N = 100. For each network, we randomly select two nodes and compute the number of the bottleneck between them. Fig. 1E shows the histograms when p = 0.015, 0.02, 0.03 and 0.05, respectively. From these figures, it can be easily seen that the sparser the network is, the quicker we can detect the true structure from global Granger causality. When p = 0.015, it is very likely for any two nodes to be unconnected or directly connected, then almost all the true relationships can be uncovered at stage 1. When p = 0.02, all the true relationships can be uncovered at stage 2. When p = 0.03, the probability of uncovering the true relationship is 90.8% at stage 2 and 98.6% at stage 3. When p = 0.05, the probability of uncovering the true relationship is 82.2% at stage 4 and 97.8% at stage 6. It is not until stage 9 that all indirect links can be discarded.

## Conclusion

In this paper, we focused on the Granger causality approach in both the time and frequency domains in local and global networks. For a local gene circuit, a recent *Cell* paper by Irene Cantone *et al*. [14] assessed systems biology approaches for reverse-engineering and modeling by investigating a gene synthetic network in the yeast consisting of 5 genes with 8 interactions (also see highlight [13]). From our results, we found that our conditional Granger approach could also correctly infer most regulatory interactions and outperform the three approaches reported in the [14]. For a local protein-interaction network, our derived network is in good agreement with biological characteristics. Therefore, the results on the proteomic data and gene data confirm that the Granger causality is a simple and accurate approach to recover the network structure. For a global network, our novel approach was successfully used to build a large network from all the recorded 812 proteins.

## Notes

### Acknowledgements

CZ, CL and JF are funded by the EPSRC Project CARMEN (EP/E002331/1). SG is funded by NSFC (#10901049) and Hunan Provincial Education Department (#09C636). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

## Supplementary material

## References

- 1.Alon U:
*An introduction to Systems Biology: Design Principles of Biological Circuits*. Chapman and Hall/CRC; 2007.Google Scholar - 2.Klipp E, Herwig R, Kowald A, Wierling C, Lehrach H:
*Systems biology in practice: Concepts, Implementation and Application*. WILEY-VCH Verlag GmbH CO. KGaA; 2005.CrossRefGoogle Scholar - 3.Wiener N: The theory of prediction. In
*Modern Mathematics for Engineers*.*Volume 1*. Edited by: Beckenbach EF. New York: McGraw-Hill; 1956.Google Scholar - 4.Granger C: Investigating causal relations by econometric models and cross-spectral methods.
*Econometrica*1969, 37: 424–438. 10.2307/1912791CrossRefGoogle Scholar - 5.Granger C: Testing for Causality: A Personal Viewpoint.
*Journal of Economic Dynamics and Control*1980, 2: 329–352. 10.1016/0165-1889(80)90069-XCrossRefGoogle Scholar - 6.Chen YH, Rangarajan G, Feng J, Ding M: Analyzing multiple nonlinear time series with extended Granger causality.
*Physics Letters A*2004, (324):26–35. 10.1016/j.physleta.2004.02.032Google Scholar - 7.Chen YH, Bressler SL, Ding MZ: Frequency Decomposition of Conditional Granger Causality and Application to Multivariate Neural Field Potential Data.
*J Neurosci Methods*2006, 150: 228–237. 10.1016/j.jneumeth.2005.06.011CrossRefPubMedGoogle Scholar - 8.Feng JF, Yi DY, Krishna R, Guo SX, Buchanan-Wollaston V: Listen to genes: dealing with microarray data in the frequency domain.
*PLoS One*2009, 4(4):e5098. 10.1371/journal.pone.0005098CrossRefPubMedPubMedCentralGoogle Scholar - 9.Guo SX, Wu JH, Ding MZ, Feng JF: Uncovering Interactions in Frequency Domains.
*PLoS Computational Biology*2008, e1000087: 1–10.Google Scholar - 10.Ge T, Kendrick K, Feng JF: A Unified Dynamic and Granger Causal Model Approach Demonstrates Brain Hemispheric Differences During Face Recognition Learning.
*PLoS Comp Biol*5(11):e1000570. 10.1371/journal.pcbi.1000570Google Scholar - 11.Marinazzo D, Pellicoro M, Stramaglia S: Kernel Granger causality and the analysis of dynamical networks.
*Physical Review E*2008, 77: 056215. 10.1103/PhysRevE.77.056215CrossRefGoogle Scholar - 12.Zou CL, Kendrick KM, Feng JF: Granger causality vs. Dynamic Bayesian network inference: A Comparative Study.
*BMC Bioinformatics*2009, 10: 122. 10.1186/1471-2105-10-122CrossRefPubMedPubMedCentralGoogle Scholar - 13.Camacho DM, Collins JJ: Systems biology strikes gold.
*Cell*2009, 137: 24–26. 10.1016/j.cell.2009.03.032CrossRefPubMedGoogle Scholar - 14.Cantone I, Marucci L, Iorio F, Ricci M, Belcastro V, Bansal M, Santini S, di Bernardo M, di Bernardo D, Cosma MP: A Yeast Synthetic Network for In Vivo Assessment of Reverse-Engineering and Modeling Approaches.
*Cell*2009, 137: 172–181. 10.1016/j.cell.2009.01.055CrossRefPubMedGoogle Scholar - 15.Cohen AA, Geva-Zatorsky N, Eden E, Frenkel-Morgenstern M, Issaeva I, Sigal A, Milo R, Cohen-Saidon C, Liron Y, Kam Z, Cohen L, Danon T, Perzov N, Alon U: Dynamic Proteomics of Individual Cancer Cells in Response to a Drug.
*Science*2008, 322(5907):1511–6. 10.1126/science.1160165CrossRefPubMedGoogle Scholar - 16.Guo SX, Seth AK, Kendrick KM, Zhou C, Feng JF: Partial granger causality-eliminating exogenous inputs and latent variables.
*Journal of Neuroscience Methods*2008, 172: 79–93. 10.1016/j.jneumeth.2008.04.011CrossRefPubMedGoogle Scholar - 17.Geweke J: Measurement of Linear Dependence and Feedback Between Multiple Time Series.
*Journal of the American Statistical Association*1982, 77: 304–313. 10.2307/2287238CrossRefGoogle Scholar - 18.Geweke J: Measures of Conditional Linear Dependence and Feedback Between Time Series.
*Journal of the American Statistical Association*1984, 79: 907–915. 10.2307/2288723CrossRefGoogle Scholar - 19.Ding M, Chen Y, Bressler SL: Granger causality: basic theory and application to neuroscience. 2006. Arxiv preprint q-bio/0608035Google Scholar
- 20.Wehrens R, Putter H, Buydens LM: The bootstrap: a tutorial.
*Chemometrics and Intelligent Laboratory Systems*2000, 54: 35–52. 10.1016/S0169-7439(00)00102-7CrossRefGoogle Scholar - 21.Zou CL, Kendrick KM, Feng JF: The Fourth Way: Granger Causality is better than the three other Reverse-engineering Approaches. 2009. COMMENTS ON CellGoogle Scholar
- 22.Sigal A, Milo R, Cohen A, Geva-Zatorsky N, Klein Y, Liron Y, Rosenfeld N, Danon T, Perzov N, Alon U: Variability and memory of protein levels in human cells.
*Nature*2006, 444: 643–646. 10.1038/nature05316CrossRefPubMedGoogle Scholar - 23.Mason O, Verwoerd M: Graph theory and networks in biology.
*IET Systems Biology*2007, 1: 89–119. 10.1049/iet-syb:20060038CrossRefPubMedGoogle Scholar - 24.Sachs K, Perez O, Pe'er D, Lauffenburger DA, Nolan GP: Causal protein-signaling networks derived from multiparameter single-cell data.
*Science*2005, 308: 523–529. 10.1126/science.1105809CrossRefPubMedGoogle Scholar - 25.Kleinberg JM: Navigation in a small world.
*Nature*2000, 406: 845. 10.1038/35022643CrossRefPubMedGoogle Scholar - 26.Wang XF, Chen G: Complex networks: small-world, scale-free and beyond.
*Circuits and Systems Magazine, IEEE*2003, 3: 6–20. 10.1109/MCAS.2003.1228503CrossRefGoogle Scholar - 27.Ladroue C, Guo SX, Kendrick KM, Feng JF: Beyond element-wise interactions: Defining group-to-group interactions for biological processes.
*PLoS One*2009, 4(9):e6899. 10.1371/journal.pone.0006899CrossRefPubMedPubMedCentralGoogle Scholar - 28.The Fourth Way: Granger Causality is better than the three other Reverse-engineering Approaches[http://www.dcs.warwick.ac.uk/~feng/cell_09.html] Additional materials including a Matlab program and more references on causality analysis are available at

## Copyright information

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.