1 Introduction

When the interplanetary magnetic field (IMF) is characterized by a nearly-southward orientation for a sufficiently long time, the near-Earth electromagnetic environment, i.e., the plasma circulation and the magnetospheric/ionospheric current systems, undergoes some dynamical changes yielding dissipation of the excess of energy, momentum and mass coming from the surrounding interplanetary medium (Akasofu and Chapman 1961; Akasofu et al. 1974). As a consequence, magnetic storms and substorms develop, being the macroscopic manifestation of these dynamical processes of dissipation (Gonzalez et al. 1994).

In recent years, understanding the physical mechanisms behind the development of these events has become increasingly important, since magnetic disturbances may be highly hazardous for telecommunications and satellite preservation and may expose astronauts to abnormal radiation levels (Malandraki and Crosby 2018). The fingerprint of a magnetic storm is the depression of the horizontal component of the magnetic field caused by the enhancement of the ring current, usually monitored by the Disturbance Storm Time (\(D_{st}\)) index (1 h resolution by definition) or the SYM-H index, which is its 1-min resolution equivalent. It is now recognized that the efficiency of energization is, in turn, primarily controlled by a long-lasting southward component of the IMF. However, whether the southward IMF is the unique driver of both particle energization and injection, or whether other internal magnetospheric-ionospheric processes are responsible for large-scale injection into the storm-time ring current, is still an open problem (Borovsky 2021).

Since the pioneering work of Akasofu and Chapman (1961), magnetospheric substorms (i.e., violent electrojet activity) have been identified as a possible source of particle injection, mainly because of their statistical association with the occurrence of the magnetic storm main phase (Kamide et al. 1998). In particular, investigations of particle injection at geostationary orbit led to the conclusion that substorms drive particle acceleration (Akasofu et al. 1974). Later, it was shown that the constituents of the storm-time ring current are more energetic than those energized directly by substorms (Williams 1987). On the other hand, predictions of the \(D_{st}\) index obtained by using the auroral index AL alone are in good agreement with observations (Burton et al. 1975; Kamide and Fukushima 1971; Gonzalez et al. 1994). Furthermore, the AMPTE and CRRES missions (Krimigis et al. 1982; Wilken et al. 1992) also reported the direct observation of ionospheric-origin ions dominating the population of the ring current during the main phase of a storm (Hamilton et al. 1988; Daglis 1997), suggesting a direct causal link with substorms. Indeed, the upward acceleration of these ionospheric ions along the magnetic field lines may be associated with the successive occurrence of intense substorms.

Despite such important observations, it remains extremely difficult to disentangle the role of substorms in the energization of the storm-time ring current directly from geomagnetic index data. First attempts were made by using prediction filters and, more recently, by using sophisticated techniques based on information theory (De Michelis et al. 2011; Stumpo et al. 2020; Manshour et al. 2021). All these works share the common idea that causation is associated with the notion of predictability, instead of correlation. If the knowledge of the time series Y reduces the error in predicting the time series X, it is said that an information flow (IF; or, equivalently, predictive causality) exists from Y to X. This is the reason why predictions are associated with evidence of asymmetric couplings between the components of a physical system (in this case the magnetosphere-ionosphere system).

The main drawback in using data-driven techniques is conceptual. Can geomagnetic indices such as \(D_{st}\) (or SYM-H) and AL capture the processes of injection and energization (McPherron 1997)? Other issues arise when conditional statistics are used. Indeed, Runge et al. (2018) and Manshour et al. (2021) have recently shown, using conditional transfer entropy, that if one removes the influence of the southward IMF, the IF from AL/AE to SYM-H (and vice-versa) previously found by De Michelis et al. (2011) and Stumpo et al. (2020) becomes negligible. These studies were carried out by using 20 min time-averaged SYM-H and 5 min resampled SYM-H, respectively, although McPherron (1997) suggested that high-resolution indices are required for unveiling the role of substorms, because otherwise fast dynamics may be lost in the averaging. On the other hand, other authors suggested that a magnetic storm is not a trivial superposition of intense substorms, but that the outflow of ionospheric ions is controlled by an efficiency function \(\alpha =\alpha (B_z, t)\) which depends on the southward IMF. In this framework, \(\alpha \) is an empirical function depending on \(B_z\) and on the cumulative time in which the condition \(B_z <0\) persists, although an exact mathematical form cannot be derived from first principles (see, e.g., Gonzalez et al. (1994) for more details). In particular, according to Kamide (1992) and Gonzalez et al. (1994), the energy balance equation for \(D_{st}\) (justified on physical grounds by the Dessler-Parker-Sckopke relation; Dessler and Parker 1959; Sckopke 1966) can be written as

$$\begin{aligned} \frac{d}{dt}D_{st}(t) = \alpha (B_z, t) \text {AL}(t) - \frac{D_{st}(t)}{T}, \end{aligned}$$
(1)

where T is the relaxation time of \(D_{st}\), i.e., the duration of the disturbance, which in general is not constant but depends on the specific concentrations of ion species in the ring current. Now, if the southward \(B_z\) controls the efficiency, i.e., the energy supply, the averaged conditional IF used by Manshour et al. (2021) and by Runge et al. (2018) may be zero because the periods in which \(\alpha \ne 0\) are only transient. Equation (1) also suggests that the IF from AL to \(D_{st}\) is non-negligible only during southward \(B_z\), in agreement with the observations of particle injection discussed above. This fact implies that the IF is strongly non-stationary and, as such, differs between storm and non-storm times. Thus, unraveling the IF by using long time series to estimate the average transfer entropy may be misleading, since \(\alpha (B_z, t)\) is appreciably different from zero only for short times. As a result, the use of long time series most likely unbalances the contributions of quiet and disturbed geomagnetic periods, biasing the IF towards values mainly due to non-storm time intervals. This argument may explain the vanishing IF observed by Runge et al. (2018) and Manshour et al. (2021).
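To illustrate how Eq. (1) gates the contribution of AL through \(B_z\), the following minimal sketch integrates the balance equation with a forward-Euler step on synthetic input. The function name, the values of `alpha0`, `T` and `t_sat`, and the linear ramp used for \(\alpha (B_z, t)\) are all hypothetical choices made for illustration only; as noted above, no exact form of \(\alpha \) can be derived from first principles.

```python
import numpy as np

def integrate_dst(al, bz, dt=1.0 / 60.0, T=8.0, alpha0=0.05, t_sat=2.0):
    """Forward-Euler integration of dDst/dt = alpha(Bz, t) * AL - Dst / T.

    Times are in hours, AL and Dst in nT. alpha0 (1/h), T and t_sat are
    hypothetical values, and the ramp used for alpha is an assumed form.
    """
    al = np.asarray(al, dtype=float)
    bz = np.asarray(bz, dtype=float)
    dst = np.zeros(len(al))
    t_south = 0.0  # cumulative time with Bz < 0, reset when Bz turns northward
    for i in range(1, len(al)):
        t_south = t_south + dt if bz[i - 1] < 0 else 0.0
        alpha = alpha0 * min(1.0, t_south / t_sat)  # zero for northward Bz
        dst[i] = dst[i - 1] + dt * (alpha * al[i - 1] - dst[i - 1] / T)
    return dst

# Synthetic driver at 1-min cadence: 12 h northward, 12 h southward, 24 h northward.
n = 48 * 60
idx = np.arange(n)
bz = np.where((idx >= 720) & (idx < 1440), -10.0, 5.0)
al = np.where(bz < 0, -500.0, -50.0)  # stronger electrojet activity when Bz < 0
dst = integrate_dst(al, bz)
print(dst[:720].max(), dst.min(), dst[-1])
```

In this toy setting the modelled \(D_{st}\) responds to AL only while \(B_z<0\) and relaxes with time constant T afterwards, which is the transient behaviour that a long-term average would dilute.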

In this framework, the aim of this paper is to shed some light on the debate about the influence of magnetospheric substorms on the storm-time ring current development. The main task is to adapt the powerful information-theoretic approach to the study of this issue in a way consistent with physical modeling and in-situ observations. To achieve this, we consider a dataset of storms and substorms extracted from the SuperMAG database (Gjerloev 2009) and we estimate the transfer entropy between high-latitude and low-latitude geomagnetic indices and the IMF \(B_z\) component by using a sliding window technique. Since small time windows contain non-stationary data, we consider a set of storm events, i.e. an ensemble of independent time series, which allows us to investigate the dynamics of the IF between different physical quantities during the storm events. In fact, the main advantage of this approach is that quiet and disturbed periods are not averaged together in the analysis, so that the ensemble makes it possible to quantify how the IF varies during the different phases of the storm.

The paper is organized as follows: in Sect. 2 we will review the concept of IF and the definition as well as the derivation of the transfer entropy. Next we will discuss the motivation behind the use of this tool as well as advantages and disadvantages of the ensemble approach for geomagnetic studies. In Sect. 3 we present the dataset and we discuss the synchronization of signals involved in the ensemble. Finally in Sect. 4 we show the results, while in Sect. 5 we discuss them in terms of interpretation and contextualization in literature.

2 Methods

Generally, the study of causation is dominated by the notion of predictability (Sugihara et al. 2012). In this context, if we have two time series X and Y, we say that Y drives the dynamics of X if the information about the state of X can be recovered from the past states of Y and not vice-versa. In the framework of non-parametric statistics, the formalization of this concept can be retrieved in the notion of IF.

The step forward of the IF with respect to the popular concept of correlation is that the former removes any redundant or shared information between current X and its own past. For example, if we have two processes, X and Y, such that Y drives X and not vice-versa, we need to consider that the process \(Y_t\) incorporates intrinsically the information about \(X_t\)’s past, otherwise the cross correlation (or, equivalently, the delayed mutual information) in the direction from X to Y would not be zero even if the IF is absent. For further details see Bossomaier et al. (2016) and Schreiber (2000).

Therefore, if we want to compute the IF from Y to X, the idea is to remove the redundancy introduced by the past history of X. To account for this, the simplest formulation of transfer entropy is given in terms of conditional mutual information (CMI), i.e.

$$\begin{aligned} T_{Y\rightarrow X}^{(k,l)}(\tau ) = I\left( X_t; \textbf{Y}_{t-\tau }^{(l)} \vert \textbf{X}_{t-1}^{(k)}\right) , \end{aligned}$$
(2)

where in general \(\textbf{Y}_{t-\tau }^{(l)} = \left( Y_{t-\tau }, Y_{t-\tau -1},\ldots , Y_{t-\tau -l+1} \right) \) and \(\textbf{X}_{t}^{(k)} = \left( X_t, X_{t-1},\ldots , X_{t-k+1} \right) \) are the multivariate reconstructions of the past histories of Y and X, respectively, under the assumption that they are l-th and k-th order Markov processes. From a probabilistic point of view, the transfer entropy is essentially the distance in probability from the validity of the generalized Markov condition, i.e.

$$\begin{aligned} p\left( X_{t}\vert \textbf{X}_{t-1}^{(k)};\textbf{Y}_{t-\tau }^{(l)}\right) =p\left( X_{t}\vert \textbf{X}_{t-1}^{(k)}\right) , \end{aligned}$$
(3)

which is fulfilled if and only if \(X_t\) depends (conditionally) only on its own history. Using the conditional Kullback-Leibler divergence (cKLD), we can compute the distance between the l.h.s. and r.h.s. of Eq. (3) and obtain the explicit formula for the transfer entropy (Schreiber 2000)

$$\begin{aligned} T_{Y \rightarrow X}^{(k,l)}(\tau ) = \sum _{X_{t},\textbf{X}_{t-1}^{(k)},\textbf{Y}_{t-\tau }^{(l)}} p(X_{t},\textbf{X}_{t-1}^{(k)},\textbf{Y}_{t-\tau }^{(l)})\log \frac{p\left( X_{t}\vert \textbf{X}_{t-1}^{(k)},\textbf{Y}_{t-\tau }^{(l)}\right) }{p\left( X_{t}\vert \textbf{X}_{t-1}^{(k)}\right) }. \end{aligned}$$
(4)

To compute Eq. (4), the Kraskov–Stögbauer–Grassberger estimator is used for its optimality in terms of systematic errors and biases due to finite sample effects (Kraskov 2004; Kraskov et al. 2004; Wibral 2014).
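As a concrete reading of Eq. (4), the sum can be evaluated directly once the signals are discretized. The sketch below is a simple plug-in (histogram) estimator for \(k=l=1\); it is not the Kraskov–Stögbauer–Grassberger estimator used in this work, and the function name, the quantile binning and the number of bins are assumptions made for illustration.

```python
import numpy as np

def transfer_entropy_binned(x, y, tau=1, bins=4):
    """Plug-in estimate (in nats) of T_{Y->X}(tau) for k = l = 1."""
    def discretize(s):
        # equal-occupancy bins from the empirical quantiles of the series
        edges = np.quantile(s, np.linspace(0, 1, bins + 1)[1:-1])
        return np.searchsorted(edges, s)

    xd, yd = discretize(np.asarray(x)), discretize(np.asarray(y))
    t0 = max(tau, 1)
    xt = xd[t0:]                      # X_t
    xp = xd[t0 - 1:-1]                # X_{t-1}
    yp = yd[t0 - tau:len(yd) - tau]   # Y_{t-tau}

    joint = np.zeros((bins, bins, bins))
    np.add.at(joint, (xt, xp, yp), 1.0)  # counts of (X_t, X_{t-1}, Y_{t-tau})
    joint /= joint.sum()
    p_xt_xp = joint.sum(axis=2)       # p(X_t, X_{t-1})
    p_xp_yp = joint.sum(axis=0)       # p(X_{t-1}, Y_{t-tau})
    p_xp = joint.sum(axis=(0, 2))     # p(X_{t-1})

    # p(xt | xp, yp) / p(xt | xp) = p(xt, xp, yp) p(xp) / (p(xt, xp) p(xp, yp))
    i, j, m = np.nonzero(joint)
    ratio = joint[i, j, m] * p_xp[j] / (p_xt_xp[i, j] * p_xp_yp[j, m])
    return float(np.sum(joint[i, j, m] * np.log(ratio)))

# Toy test: Y drives X with a one-step delay, but not vice-versa.
rng = np.random.default_rng(0)
n = 20000
x, y = np.zeros(n), np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + rng.normal()
    x[t] = 0.5 * x[t - 1] + 0.8 * y[t - 1] + 0.5 * rng.normal()
te_yx = transfer_entropy_binned(x, y, tau=1)
te_xy = transfer_entropy_binned(y, x, tau=1)
print(te_yx, te_xy)
```

The coarse binning leaves a small positive value in the uncoupled direction, which is one reason why nearest-neighbour estimators and a significance test are preferable for real data.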

Note that whereas in general, for non-Markov processes, both \((k, l) \rightarrow (\infty , \infty )\), for purely Markov processes the past history of e.g. \(X_t\) is completely embedded into \(X_{t-1}\) alone, so that the redundancy of X’s own past can be eliminated by simply conditioning on \(X_{t-1}\), and the multivariate vector \(\textbf{X}_{t-1}^{(k)}\) collapses to the univariate signal, \(\textbf{X}_{t-1}^{(k)} \rightarrow X_{t-1}\). At this point it is also worth noticing that \(\textbf{X}_{t-1}^{(k)}\) and \(\textbf{Y}_{t-\tau }^{(l)}\) can be thought of as an embedding reconstruction of the phase space according to Takens’ theorem (Takens 1981), although the equivalence with the real phase space is not guaranteed for stochastic systems (Kantz and Schreiber 2003). Furthermore, when the noise level is high, or when the underlying dynamics is stochastic at coarse-grained scales, the parameters of the reconstruction, k and l, cannot be recovered by using methods such as the False Nearest Neighbours technique (Kennel et al. 1992), because they are suited for deterministic dynamics (Ragwitz and Kantz 2002; Kantz and Schreiber 2003).

Our definitions in Eqs. (2) and (4) directly incorporate the time lag \(\tau \) between X and Y. This is because the interaction may be delayed by more than the single step \(\tau =1\) assumed in the original definition by Schreiber (2000). In this case, only the source Y is shifted in time, while the past of X remains untouched. This choice, as demonstrated explicitly by Wibral et al. (2013), is the only one, among the most popular definitions of information transfer (e.g. Pompe and Runge (2011), Paluš et al. (2001)), that preserves Wiener’s principle of causality and recovers the correct transfer delays (Bossomaier et al. 2016; Wibral et al. 2013).

Practically, the interpretation of the transfer entropy is straightforwardly related to the difference in the uncertainties of X’s future before and after the knowledge of Y’s past, recovering again the notion of predictability in the statistical inference of IF. Following Bossomaier et al. (2016), this interpretation can be put in a formal way by decomposing the total uncertainty, i.e. the Shannon entropy, \(H(X_t)\) as

$$\begin{aligned} H(X_t) = I(\textbf{X}_{t-1}^{(k)}; X_{t}) + T_{Y \rightarrow X}^{(k,l)}(\tau ) + H(X_t \vert \textbf{X}_{t-1}^{(k)}; \textbf{Y}_{t-\tau }^{(l)}). \end{aligned}$$
(5)

Hence, we can extract three contributions to the uncertainty on the future state of X: the first accounts for the information contained in the past \(\textbf{X}_{t-1}^{(k)}\), the second is the IF from Y’s past, and the third is the residual uncertainty after the knowledge of both X’s and Y’s histories. From Eq. (5) it is clear that the use of \(\textbf{X}_{t-1}^{(k)}\) instead of \(X_{t-1}\) controls the balance between stored and transferred information. An inadequate reconstruction of the k-th order Markov process may lead to confusion between stored and transferred information (Bossomaier et al. 2016).
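For completeness, the decomposition in Eq. (5) can be recovered from two standard identities of information theory, namely the definition of mutual information as a reduction of entropy and the chain rule for mutual information:

$$\begin{aligned} H(X_t)&= I\left( X_t; \textbf{X}_{t-1}^{(k)}, \textbf{Y}_{t-\tau }^{(l)}\right) + H\left( X_t \vert \textbf{X}_{t-1}^{(k)}, \textbf{Y}_{t-\tau }^{(l)}\right) , \\ I\left( X_t; \textbf{X}_{t-1}^{(k)}, \textbf{Y}_{t-\tau }^{(l)}\right)&= I\left( X_t; \textbf{X}_{t-1}^{(k)}\right) + I\left( X_t; \textbf{Y}_{t-\tau }^{(l)} \vert \textbf{X}_{t-1}^{(k)}\right) . \end{aligned}$$

Substituting the second identity into the first, and recognizing the conditional mutual information as the transfer entropy of Eq. (2), gives Eq. (5).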

In general, when the information-theoretic approach is applied to real data, we have single realizations of X and Y as time series. In this case, the PDFs in Eq. (4) are estimated assuming stationarity and, naturally, the IF is then assumed to be stationary as well. The consequence is that transient dynamics and local non-stationarity are completely neglected. This point is crucial when the aim is to study the role of intense substorms in the energization of the storm-time ring current. For this reason, as mentioned in the Introduction, we need a time-resolved estimation of the transfer entropy over a sliding window as presented below.

Let \(W=\{\hat{t}-\delta , \hat{t}-\delta +1,..., \hat{t}+\delta \}\) be the time window centered around \(\hat{t}\). Then, we can restrict the time series \(X_t\), \(\textbf{X}_{t-1}^{(k)}\) and \(\textbf{Y}_{t-\tau }^{(l)}\) to W and compute Equation (4) to get the time windowed transfer entropy, i.e.

$$\begin{aligned} \mathcal {T}_{Y\rightarrow X}^{(k, l)}(\hat{t}, \tau ) = T_{Y_W \rightarrow X_W}^{(k,l)}(\tau ). \end{aligned}$$
(6)

In principle, in the limit \(N\rightarrow \infty \) and \(\delta \rightarrow 0\), the average \(\langle \mathcal {T}_{Y\rightarrow X}^{(k, l)}(\hat{t}, \tau ) \rangle _{\hat{t}}\) converges to Eq. (4). As can be seen from this definition, one of the main limitations of the method when dealing with empirical observations is the need for sufficient statistics, which cannot always be guaranteed. Indeed, geomagnetic indices are sampled at a maximum cadence of 1 min, and the time window must be large enough to include a sufficient number of data points. On the other hand, the window cannot be too large either, since we want to resolve transient dynamics, which could be suppressed by averaging over wide time windows.

In order to overcome this limitation, we first find a suitable trade-off for the size of the time window used in the analysis, and then we introduce the estimation of the transfer entropy over an ensemble of independent realizations of magnetic storms, in a way similar to the method proposed by Gómez-Herrero et al. (2015). This enables us to study how the IF varies during the evolution of the magnetic storm. We remark that the ensemble approach differs from computing the transfer entropy for each individual trial and averaging the single transfer entropies a posteriori, which would not be an ensemble approach. In contrast, we merge together all the time series in each specific time window and directly compute the total transfer entropy.
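The pooling step can be sketched as follows: for each window W, the lagged samples of all trials are merged and a single estimate is computed from the merged sample. This is only an illustrative sketch for \(k=l=1\). To keep it short, a Gaussian (linear-regression) transfer entropy is used as a stand-in for the nearest-neighbour estimator, and all function names and the synthetic ensemble are hypothetical.

```python
import numpy as np

def lagged_triples(x, y, tau=1):
    """Aligned samples (X_t, X_{t-1}, Y_{t-tau}) for k = l = 1."""
    t0 = max(tau, 1)
    return x[t0:], x[t0 - 1:-1], y[t0 - tau:len(y) - tau]

def gaussian_te(xt, xp, yp):
    """TE in nats under a joint-Gaussian assumption (a stand-in estimator)."""
    def resid_var(*regressors):
        A = np.column_stack((np.ones_like(xt),) + regressors)
        coef, *_ = np.linalg.lstsq(A, xt, rcond=None)
        return np.var(xt - A @ coef)
    # half the log-ratio of residual variances without / with the source
    return 0.5 * np.log(resid_var(xp) / resid_var(xp, yp))

def ensemble_windowed_te(X, Y, center, delta, tau=1):
    """Pool the samples of all trials inside W = {center-delta, ..., center+delta}
    and estimate the transfer entropy once from the merged sample."""
    lo, hi = center - delta, center + delta + 1
    parts = [lagged_triples(x[lo:hi], y[lo:hi], tau) for x, y in zip(X, Y)]
    xt, xp, yp = (np.concatenate(p) for p in zip(*parts))
    return gaussian_te(xt, xp, yp)

# Synthetic ensemble: Y drives X only during a transient interval.
rng = np.random.default_rng(1)
n_trials, n_time = 30, 600
idx = np.arange(n_time)
coupling = np.where((idx >= 250) & (idx < 350), 0.8, 0.0)
X, Y = np.zeros((n_trials, n_time)), np.zeros((n_trials, n_time))
for r in range(n_trials):
    for t in range(1, n_time):
        Y[r, t] = 0.5 * Y[r, t - 1] + rng.normal()
        X[r, t] = 0.5 * X[r, t - 1] + coupling[t] * Y[r, t - 1] + rng.normal()
te = {c: ensemble_windowed_te(X, Y, c, delta=50) for c in (100, 300, 500)}
print(te)
```

With 30 trials, each window of 101 points contributes about 3000 pooled samples, so the transient coupling is resolved even though a single trial would provide too few points per window.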

Note that the need for sufficient statistics is also strongly affected by the so-called curse of dimensionality. Indeed, in general the unbiased estimation of the transfer entropy requires the k-th and l-th order reconstruction of the Markov processes as explained above, but the number of data points needed for the correct sampling of the PDFs scales non-linearly with the dimension, i.e. with both k and l.

Another crucial point is that the transfer entropy in Eq. (2), when the IF is absent, is equal to zero only theoretically. When the sample size is finite and the transfer entropy is empirically measured, a bias is always present, regardless of the estimator used. In this framework, a key question is whether or not the values found for the transfer entropy are statistically significant, especially if we do not know a priori the underlying PDFs. To perform such a test, we need to formulate the null hypothesis \(H_0\) that the IF is zero and to determine the distribution that the transfer entropy would have if \(H_0\) were true. Practically, this means that we need surrogate time series \(\textbf{Y}_{t-\tau }^{(l)}\) such that \(p(X_t\vert \textbf{X}_{t-1}^{(k)};\textbf{Y}_{t-\tau }^{(l)})=p(X_t\vert \textbf{X}_{t-1}^{(k)})\). In order to achieve this, we create surrogate trials by shuffling only the source time series \(\textbf{Y}_{t-\tau }^{(l)}\) and leaving X untouched. Indeed, if X were shuffled, any correlation would be destroyed and the Markov condition in Eq. (3), i.e. our null hypothesis, might no longer be fulfilled. Finally, a confidence threshold is fixed at 0.95 and the corresponding critical value of the transfer entropy \(\hat{T}_{Y \rightarrow X}(\hat{t}, \tau )\) is computed in each window, so that if our measurements of the IF are greater than \(\hat{T}_{Y \rightarrow X}(\hat{t}, \tau )\) we can claim statistical significance. For this preliminary study we use only two surrogates to fix the background of transfer entropy values.
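The surrogate test described above can be sketched in a few lines: the source samples are shuffled, the target and its past are left untouched, and the critical value is read from the resulting null distribution. The sketch is illustrative only. It assumes \(k=l=1\), uses a Gaussian (linear-regression) transfer entropy as a stand-in estimator, and draws 200 surrogates, a hypothetical number chosen so that the 0.95 quantile is defined, unlike the two surrogates used in this preliminary study.

```python
import numpy as np

def gaussian_te(xt, xp, yp):
    """TE in nats under a joint-Gaussian assumption (a stand-in estimator)."""
    def resid_var(*regressors):
        A = np.column_stack((np.ones_like(xt),) + regressors)
        coef, *_ = np.linalg.lstsq(A, xt, rcond=None)
        return np.var(xt - A @ coef)
    return 0.5 * np.log(resid_var(xp) / resid_var(xp, yp))

def surrogate_threshold(xt, xp, yp, n_surr=200, level=0.95, seed=0):
    """Critical TE value under H0, obtained by shuffling only the source."""
    rng = np.random.default_rng(seed)
    # X_t and X_{t-1} keep their pairing; only Y_{t-tau} is permuted
    null = [gaussian_te(xt, xp, rng.permutation(yp)) for _ in range(n_surr)]
    return float(np.quantile(null, level))

# Toy data in which Y drives X with a one-step delay.
rng = np.random.default_rng(2)
n = 2000
x, y = np.zeros(n), np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + rng.normal()
    x[t] = 0.5 * x[t - 1] + 0.8 * y[t - 1] + rng.normal()
xt, xp, yp = x[1:], x[:-1], y[:-1]   # tau = 1

te_value = gaussian_te(xt, xp, yp)
critical = surrogate_threshold(xt, xp, yp)
print(te_value, critical, te_value > critical)
```

Because the pairs \((X_t, X_{t-1})\) are preserved, the surrogates retain the stored information of X and differ from the data only in the link to the source.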

3 Data preparation

In order to investigate the IF between external driving, auroral electrojet activity and ring current dynamics during magnetic storm events, we use the SuperMAG high-latitude index SML and low-latitude index SMR, which are generalizations of the traditional AL and SYM-H, respectively. They are, as usual, derived from deviations with respect to the average value of the horizontal (H) component of the geomagnetic field measured by a network of near-auroral/equatorial ground-based magnetometers (Gjerloev 2009). The choice of SuperMAG indices is motivated by the fact that, since the auroral oval moves towards lower latitudes during severe magnetic storms, classical high-latitude geomagnetic indices have some limitations in estimating the correct value of the auroral electrojet current intensity. In this framework the SuperMAG collaboration introduced the generalized AE-indices, i.e. SML, SMU and SME, computed using more than 300 different stations. Furthermore, the SuperMAG collaboration has 98 magnetometers in the range of latitudes currently used for constructing the SYM-H and \(D_{st}\) indices. This sub-network is used to build the SuperMAG equivalent of the ring current proxies, namely the SMR index (Newell and Gjerloev 2012).

In this framework, the typical fingerprint of a magnetic storm is monitored through the SMR index, which exhibits a sudden depression towards negative values. On the other hand, the polar substorm activity, i.e. the magnetic disturbance caused by the auroral electrojet current flowing in the auroral region, is investigated by means of the SML index, which is mainly representative of the geomagnetic tail dynamics (Gjerloev et al. 2004; Davis and Sugiura 1966; Kamide and Rostoker 2004). Finally, we use the z-component of the IMF, \(B_z\), collected from the OMNI database, to infer the IF from the solar wind to the internal magnetosphere-ionosphere system.

The aforementioned ensemble of magnetic storms is now introduced. In detail, we started with a 23-year dataset (from 1995 to 2018) of \(B_z\), SML and SMR, from which we selected a set of magnetic storm periods for which \(\text {SMR}\le -150\) nT by considering a period of 10 days before and after the minimum of SMR during each storm event. In order to complete the ensemble, we use the same periods of time for SML and \(B_z\). As a last step, the double-peaked storms were removed from the ensemble by visual inspection, since such complex events could introduce spurious effects in our ensemble-based analysis. The final dataset consists of \(N_r=30\) independent storms, which are reported in Fig. 1.
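The selection and synchronization step can be sketched as a superposed-epoch extraction: each SMR minimum below the threshold defines \(t=0\), and the same window is cut from the other signals. The function below is a hypothetical helper, not the procedure actually used to build the dataset; in particular, the rule of masking each selected window before searching for the next minimum is an assumption, and the removal of double-peaked storms by visual inspection is not reproduced.

```python
import numpy as np

def extract_storm_epochs(smr, others, threshold=-150.0, half_width=10 * 1440):
    """Cut windows of +/- half_width samples centred on each SMR minimum <= threshold.

    Returns a list of (index_of_minimum, smr_segment, [segments of the other series]).
    """
    smr = np.asarray(smr, dtype=float)
    work = smr.copy()
    epochs = []
    while True:
        i0 = int(np.nanargmin(work))
        if not work[i0] <= threshold:
            break
        lo, hi = i0 - half_width, i0 + half_width + 1
        if lo >= 0 and hi <= len(smr):  # keep only windows fully inside the record
            epochs.append((i0, smr[lo:hi], [np.asarray(o)[lo:hi] for o in others]))
        work[max(lo, 0):hi] = np.inf  # mask the window so each storm is selected once
    return sorted(epochs, key=lambda e: e[0])

# Synthetic 60-day record at 1-min cadence with two storms and one weaker event.
t = np.arange(60 * 1440)

def dip(depth, day, width=720.0):
    return -depth * np.exp(-np.abs(t - day * 1440) / width)

smr = dip(200, 15) + dip(100, 28) + dip(180, 40)
sml = 3.0 * smr            # placeholder series, not a physical relation
bz = np.zeros_like(smr)    # placeholder series
epochs = extract_storm_epochs(smr, [sml, bz])
print([i0 for i0, _, _ in epochs])
```

Each returned segment has its SMR minimum at the central index, so stacking the segments row by row gives arrays of shape (number of storms, window length) that are already synchronized.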

Fig. 1 From top to bottom, SMR, SML and \(B_z\). The time series are collected at 1 min resolution

4 Results

Fig. 2 Top: Contour plot of the transfer entropy from \(B_z\) to SML with respect to the time window and to the time delay \(\tau \). In order to compare the IF in terms of storm phases, the averaged track of SMR is depicted in black solid-line. Bottom: Contour plot of the transfer entropy from \(B_z\) to SMR with respect to the time window and to the time delay \(\tau \). In order to compare the IF in terms of storm phases, the averaged track of SMR is depicted in black solid-line. In both panels the TE is obtained with non-overlapping windows of 1-day width

The main objective of this work is to provide a novel approach to inspect the IF within impulsive and strongly non-stationary processes, such as magnetic storms and magnetospheric substorms. As a first step, we aim to evaluate the IF from the z-IMF component, which is representative of the driver, to both high-latitude and low-latitude geomagnetic activity by means of the SML and SMR indices, respectively. The top panel of Fig. 2 shows the contour plot of the ensemble-based TE \(T(B_z\rightarrow \text {SML})\) as a function of time and time delay \(\tau \). We report the ensemble-averaged trend of the SMR index in the figures, since this is the index we used in the event synchronization and, moreover, it serves as a guide for the eye in identifying the different phases of the magnetic storm. Hence, the time is reported in days before/after the SMR minimum, which corresponds to \(t=0\). As is clear from the top panel of Fig. 2, there are several enhancements in the IF from \(B_z\) to SML that are not related to the occurrence of the magnetic storm. Furthermore, the highest values of the TE are reached in proximity of the storm, i.e. during the pre-storm period and within the recovery phase, whereas a sudden decrease in the IF is observed during the storm main phase. Conversely, if we consider \(T(B_z\rightarrow \text {SMR})\), reported in the bottom panel of Fig. 2, a significant enhancement of the IF is only present during the storm main phase. These results emphasize the different role that the driver (\(B_z\) in this case) plays in contributing to the dynamics of storms and substorms.

Fig. 3 Top: Contour plot of the transfer entropy from SML to SMR with respect to the time window and to the time delay \(\tau \). In order to compare the IF in terms of storm phases, the averaged track of SMR is depicted in black solid-line. Bottom: Contour plot of the transfer entropy from SMR to SML with respect to the time window and to the time delay \(\tau \). In order to compare the IF in terms of storm phases, the averaged track of SMR is depicted in black solid-line. In both panels the TE is obtained with non-overlapping windows of 1-day width

In terms of internal dynamics, the characterization of the IF within the magnetosphere-ionosphere system is in general much more complex, since such flow is not unidirectional and feedback processes may be present as well. In this case it is crucial to elucidate the dynamics of the IF. In previous works, the investigation of the IF within the magnetosphere-ionosphere system has been carried out by using a one-year dataset, and the TE has been estimated over the whole signals, neglecting possible time variations of the IF, e.g. in terms of intensity or direction. The ensemble TE from SML to SMR and vice-versa are reported, respectively, in the top and bottom panels of Fig. 3. By looking at \(T(\text {SML}\rightarrow \text {SMR})\) it is clear that the maximum transfer of information from SML towards SMR is strongly localized around the minimum of \(\langle \text {SMR}\rangle \), i.e. during the storm main phase. Moreover, the time lag \(\tau \) at which the maximum is located is \(\sim 0\). By looking at \(T(\text {SMR}\rightarrow \text {SML})\), which is representative of the IF from the ring current to the westward auroral electrojet current system, we observe a quite different scenario. The first enhancement of the TE approaching the storm is located just before the depression of SMR, whereas the maximum TE values are reached in the recovery phase. Contrary to what is observed for \(\text {SML} \rightarrow \text {SMR}\), a sudden decrease of the IF from SMR to SML is observed at the onset of the mean main phase.

5 Discussion and conclusion

In this study we provided a first attempt to characterize the dynamics of the IF within the magnetosphere-ionosphere system using a database of magnetic storms instead of considering a long time series of geomagnetic indices. This allows us to avoid mixing the statistics of quiet and disturbed periods and, thanks to our moving-window approach, to follow the transition from quiet to disturbed conditions. However, one of the main limitations in considering the IF as an intrinsically non-stationary measure during transient periods is the need for sufficient statistics, which clearly cannot be guaranteed. In order to overcome this problem, we introduced the analysis of transfer entropy over an ensemble of independent realizations of magnetic storms, in a way similar to the method proposed by Gómez-Herrero et al. (2015). We emphasize again that this approach differs from computing the transfer entropy between individual trials and then averaging the single results a posteriori.

We presented our approach by analyzing an ensemble of 30 independent magnetic storms. Firstly, we studied the dynamics of the IF from the solar wind to the magnetosphere-ionosphere system and we found a delayed information transfer, in agreement with previous findings (De Michelis et al. 2011; Alberti et al. 2017; Stumpo et al. 2020; Runge et al. 2018; Manshour et al. 2021). However, whereas the IF from \(B_z\) to SMR, i.e., from the solar wind to the low-latitude magnetosphere (ring current), is enhanced only during the onset of the main phase of a magnetic storm, the IF from \(B_z\) to SML, i.e., from the solar wind to the polar ionosphere, is also enhanced outside storm times. This fact may be explained by observing that auroral disturbances, i.e. magnetospheric substorms, can also occur outside a magnetic storm (Kamide 1992). Indeed, whereas the development of a main phase requires a southward-oriented \(B_z\) for a sufficiently long time, the injection of solar wind particles into the polar ionosphere, i.e., the onset of geomagnetic tail reconnection and the subsequent impulsive energy dissipation through magnetospheric substorms, occurs whenever \(B_z\) is southward-oriented.

The study of the internal dynamics, i.e., the magnetosphere-ionosphere coupling, is in general much more complex to interpret because a large number of current systems are involved. In this framework we found an important IF from SML to SMR at the beginning of the depression of SMR, which can be interpreted as the contribution of the outflow from the ionosphere. The current systems which act as mediators between the magnetosphere and the ionosphere in this case are the Field-Aligned Currents (FACs). When the minimum of the disturbance is reached, the IF drops abruptly and is enhanced again during the recovery phase. The coupling induced by the FACs is not unidirectional: indeed, they form a closed system with a reverse IF from SMR to SML, especially during the recovery phase. A possible explanation is that the excess of energetic particles is re-injected into the ionosphere from the ring current, where dissipation occurs via secondary substorms. On the other hand, this effect may be due to the non-Markovian nature of the SML index at time-scales larger than 60 min, as recently demonstrated by Benella et al. (2022). Without an appropriate reconstruction of the l-th order Markov process (see Sect. 2), limited essentially by the need for sufficient statistics, the effects of non-Markovianity may not be negligible, so that the IF may be overestimated due to SML itself in this case.

Fig. 4 Top: Contour plot of the transfer entropy from SML to SMR with respect to the time window and to the time delay \(\tau \). In order to compare the IF in terms of storm phases, the averaged track of SMR is depicted in black solid-line. Bottom: Contour plot of the transfer entropy from SMR to SML with respect to the time window and to the time delay \(\tau \). In order to compare the IF in terms of storm phases, the averaged track of SMR is depicted in black solid-line. In both panels the TE is evaluated with non-overlapping windows of 2-days width

The picture in which, during the early stages of a magnetic storm, a direct driving due to the solar wind activity acts concurrently with high-latitude processes is in agreement with the energy balance equation of the \(D_{st}\) index written in the form of Eq. (1), which enabled very good predictions of \(D_{st}\) (Kamide and Fukushima 1971; Gonzalez et al. 1994; Kamide et al. 1998). In particular, Eq. (1) represents the interplay between the direct external driver (i.e., efficiency of magnetic reconnection and efficiency of energy supply) and those internal processes triggered by the energy input given by the external driver. From a statistical point of view, our findings are also in good agreement with a very recent study by Alberti et al. (2022). By using a novel approach based on dynamical systems theory, they computed the dimension of the reconstructed phase space by first considering AL and SYM-H alone and then by considering the joint process (AL, SYM-H). Interestingly, this analysis revealed an independent contribution of AL to the dynamics of SYM-H during the development of the main phase, in agreement with the IF between SML and SMR.

At this stage, it is also important to mention that the window width used for computing the transfer entropy in Eq. (4) naturally influences the behaviour of the IF. This is not surprising, since the window width defines the time-scales on which the IF is measured. For example, if we compute the transfer entropy using a window width of 2 days, we obtain the results shown in Fig. 4 for the IF from SML to SMR (top panel) and vice-versa (bottom panel). In this case we can see only the contribution of SML to the outflow, localized during the development of the main phase. In the reverse direction, i.e. from SMR to SML, we find a feedback process localized at the start of the recovery phase. Therefore, this behaviour highlights again the dependence of the IF on the time-scales on which it is measured. The time-scale dependence of the coupling between external and internal processes has been highlighted by Alberti et al. (2017) by using the delayed mutual information on the filtered signals.

The works by Runge et al. (2018) and Manshour et al. (2021), in contrast to previous findings by De Michelis et al. (2011) and Stumpo et al. (2020), found that the IF from high latitudes to low latitudes (and vice-versa) is completely explained by the IMF, which might be the common driver. However, these results must be carefully interpreted, since they provide an average view of the SMI system. This is related to the use of long time series to infer the IF, despite the fact that magnetic storms and substorms represent transient dynamics of the magnetosphere-ionosphere system. Furthermore, the conditional transfer entropy used by Manshour et al. (2021) is implicitly averaged over both positive and negative values of \(B_z\), i.e., without any discrimination between open and closed conditions of the magnetosphere. Moreover, this approach does not take into account preconditioning features of the magnetosphere-ionosphere system. From a phenomenological point of view, and again with the help of Eq. (1), this means that periods when the coupling function of the magnetosphere-ionosphere system is virtually set to zero (closed magnetosphere) are averaged together with those periods in which the coupling function is considerably different from zero (open magnetosphere). The relative importance of these two contributions depends on the total time during which each condition is satisfied, so that non-storm time coupling dominates the time average of the IF. This argument may explain the absence of the IF found by Manshour et al. (2021) and Runge et al. (2018) and, of course, the reason why we performed the analysis without removing the past history of \(B_z\). A more comprehensive analysis including the difference between southward and northward periods will be presented in a forthcoming paper.

In conclusion, our method provides a framework to study the time variations of the IF at a fixed time-scale. It is particularly suitable for the study of the relation between magnetic storms and substorms and, of course, of the magnetosphere-ionosphere coupling. Nevertheless, in this preliminary study some technical problems, such as the reconstruction of the l-th and k-th order Markov processes and the accurate computation of the statistical threshold, have been considered only qualitatively. From the physics side, it is of interest to discriminate the IF during northward (closed magnetosphere) and southward (open magnetosphere) IMF periods separately. This may reveal some interesting features of the injection/energization processes, as well as the importance of the solar wind dynamic pressure, solar wind velocity and convection electric field.