1 Introduction

Several astronomical observations indicate that our Universe is currently in a stage of accelerated expansion [1,2,3,4,5]. A cosmological scenario with cold dark matter (CDM) and dark energy (DE) mimicked by a positive cosmological constant, the so-called \(\Lambda \)CDM model, is considered the standard cosmological model, and it fits the observational data with great precision. However, the cosmological constant suffers from some theoretical problems [6,7,8], which motivate alternative scenarios that can explain the data and have some theoretical appeal as well. In this regard, numerous cosmological models have been proposed in the literature, either by introducing a new dark fluid with negative pressure or by modifying general relativity, where additional gravitational degree(s) of freedom can generate the accelerated stage of the Universe at late times (see [9,10,11] for reviews). On the other hand, from an observational point of view, it is currently under discussion whether the \(\Lambda \)CDM model really is the best scenario to explain the observations, mainly in light of the current Hubble constant \(H_0\) tension. Assuming the \(\Lambda \)CDM scenario, the Planck-CMB data analysis provides \(H_0=67.4 \pm 0.5\ \hbox {km s}^{-1}\,\hbox {Mpc}^{-1}\) [12], which is in \(4.4\sigma \) tension with the cosmological model-independent local measurement \(H_0 = 74.03 \pm 1.42\ \hbox {km s}^{-1}\,\hbox {Mpc}^{-1}\) [13] from Hubble Space Telescope (HST) observations of 70 long-period Cepheids in the Large Magellanic Cloud. Additionally, a combination of time-delay cosmography from H0LiCOW lenses and the distance-ladder measurements is in \(5.2\sigma \) tension with the Planck-CMB constraints [14] (see also [15] for an update using a new hierarchical approach based on the H0LiCOW lenses, where the mass-sheet transform is constrained only by stellar kinematics). Another accurate independent measurement was carried out in [16] from the Tip of the Red Giant Branch, obtaining \(H_0 = 69.8 \pm 1.1\ \hbox {km s}^{-1}\,\hbox {Mpc}^{-1}\). Several other estimates of \(H_0\) have been obtained in the recent literature (see [17,18,19,20,21]). It has been widely discussed whether new physics beyond the standard cosmological model can solve the \(H_0\) tension [22,23,24,25,26,27,28,29,30,31,32,33,34,35,36]. The so-called \(S_8\) tension is no less important: it is a discrepancy between the Planck-CMB data on the one hand, and weak lensing measurements and redshift surveys on the other, about the value of the matter energy density \(\Omega _m\) and the amplitude or growth rate of structures (\(\sigma _8\), \(f\sigma _8\)). We refer the reader to [37, 38] and references therein for perspectives and discussions on the \(S_8\) tension. Some other recent studies/developments [39,40,41,42,43,44,45,46,47,48,49,50] also suggest that the minimal \(\Lambda \)CDM model is in crisis.

A promising approach for investigating cosmological parameters is to consider a model-independent analysis. In principle, this can be done via the cosmographic approach [51,52,53,54,55], which consists of performing a series expansion of a cosmological observable around \(z=0\) and then using the data to constrain the kinematic parameters. Such a procedure works well at lower values of z, but can be problematic at higher z. An interesting and robust alternative is to use a Gaussian process (GP) to reconstruct cosmological parameters in a model-independent way. The GP approach is a generic method of supervised learning (tasks to be learned and/or data training, in GP terminology), which is implemented in regression problems and probabilistic classification. A GP is essentially a generalisation of the simple Gaussian distribution to probability distributions over functions of the independent variables. In principle, this can be any stochastic process; however, it is much simpler in the Gaussian scenario, which is also more common, specifically for the regression processes that we use in this study. The GP also provides a model-independent smoothing method that can further reconstruct derivatives from data. In this sense, the GP is a non-parametric strategy, because it does not depend on a set of free parameters of a particular model to be constrained, although it does depend on the choice of the covariance function, which will be explained in more detail in the next section. The GP method has been used to reconstruct the dynamics of DE, modified gravity, the cosmic curvature, estimates of the Hubble constant, and other quantities in cosmology by several authors [56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77].

In this work, our main aim is to employ GP to perform a joint analysis using the geometrical cosmological probes Supernovae Type Ia (SN), Cosmic chronometers (CC), Baryon Acoustic Oscillations (BAO), and the H0LiCOW lenses sample, in order to constrain the Hubble constant \(H_0\) and to reconstruct some properties of DE, viz., the equation of state parameter w, the sound speed of DE perturbations \(c^2_s\), and the ratio of DE density evolution \(X=\rho _\mathrm{de}/\rho _\mathrm{de,0}\). These are the main quantities that characterize the physics of DE, and possible deviations from the standard values \(w=-1\), \(c^2_s = 1\) and \(X=1\) can indicate new physics beyond the \(\Lambda \)CDM model. To our knowledge, a model-independent joint analysis of the above-mentioned data sets, as presented here, is new and has not been previously investigated in the literature. Indeed, a joint analysis with several observational probes is helpful to obtain tight constraints on the cosmological parameters.

This paper is structured as follows. In Sect. 2, we present the GP methodology as well as the data sets used in this work. In Sect. 3, we describe the modelling framework providing the cosmological information, and discuss our main results in detail. In Sect. 4, we summarize the main findings of this study with some future perspectives.

2 Methodology and data analysis

In this section, we summarize our methodology as well as the data sets used for obtaining our results.

2.1 Gaussian processes

The main objective of a GP approximation is to reconstruct a function \(f(x_i)\) from a set of its measured values \(f(x_i) \pm \sigma _i\), where the \(x_i\) represent the training points, i.e., the positions of the observations. The GP assumes that the value of the function at any point \(x_i\) follows a Gaussian distribution, and that the value of the function at \(x_i\) is correlated with its value at another point \(x_i'\). Therefore, we may write the GP as

$$\begin{aligned} f(x_i)={\mathcal {G}}{\mathcal {P}}(\mu (x_i),\text {cov}[f(x_i),f(x_i)]), \end{aligned}$$
(1)

where \(\mu (x_i)\) and \(\text {cov}[f(x_i),f(x_i)]\) are the mean and the variance of the random variable at \(x_i\), respectively. This method has been used in many studies in the context of cosmology (e.g., see [56,57,58]). For the reconstruction of the function \(f(x_i)\), the covariance between the values of the function at different positions can be modeled as

$$\begin{aligned} \text {cov}[f(x),f(x')] = k(x,x'), \end{aligned}$$
(2)

where \(k(x,x')\) is an a priori assumed covariance model (or kernel, in GP terminology), whose choice is often crucial for obtaining a good reconstruction of the function \(f(x_i)\). The covariance model, in general, depends on the distance \(|x-x'|\) between the input points \((x, x')\), and the covariance function \(k(x,x')\) is expected to return large values when the input points are close to each other. The most popular and commonly used covariance functions in the literature are the standard Gaussian Squared-Exponential (SE) kernel and the Matérn class of kernels (\(M_{\nu }\)). The SE kernel is defined as

$$\begin{aligned} k_{SE}(x,x') = \sigma _f^2 \exp \left( -\frac{|x-x'|^2}{2 l^2}\right) , \end{aligned}$$
(3)

where \(\sigma _f\) is the signal variance, which controls the strength of the correlation of the function, and l is the length scale, which determines the ability to model the main characteristics (global and local) in the evaluation region to be predicted (i.e., the coherence length of the correlation in x). These two parameters are often called hyperparameters: they are not parameters of the function, but of the covariance function. For convenience, in what follows, we define \(\tau = |x-x'|\), which is consistent with all the kernels implemented here. The SE kernel, however, is a very smooth covariance function, which reproduces global characteristics very well but not local ones. To overcome this limitation, the Matérn class of kernels is helpful, whose general functional form can be written as

$$\begin{aligned} k_{M_{\nu }}(\tau ) = \sigma _f^2 \frac{2^{1-\nu }}{\Gamma (\nu )} \left( \frac{\sqrt{2 \nu }\tau }{l} \right) ^{\nu } K_{\nu }\left( \frac{\sqrt{2 \nu }\tau }{l} \right) , \end{aligned}$$
(4)

where \(K_{\nu }\) is the modified Bessel function of the second kind, \(\Gamma (\nu )\) is the standard Gamma function, and \(\nu \) is a strictly positive parameter. For half-integer values \(\nu = 1/2, 3/2, 5/2, 7/2, 9/2, \ldots \), the kernel admits an explicit analytic functional form, and when \(\nu \rightarrow \infty \), the \( \text {M}_{\nu } \) covariance function tends to the SE kernel. Among other possibilities, the values \(\nu = 7/2\) and \(\nu = 9/2 \) are of primary interest, since they correspond to smooth functions with high predictability of higher-order derivatives, although they are not very suitable for predicting rapid variations. These Matérn functions for GP in cosmology were first introduced in [58]. On the other hand, the hyperparameters \({\Theta } \equiv \{\sigma _f, l\}\) are learned by optimising the log marginal likelihood, which is defined as

$$\begin{aligned} {\mathcal {L}}({\Theta }) = -\frac{1}{2}\mathbf{y }^{\text {T}}K_{\text {y}}^{-1} \mathbf{y } -\frac{1}{2}\ln |K_\text {y}|-\frac{n}{2}\ln (2 \pi ), \end{aligned}$$
(5)

where \(K_{\text {y}} = K(\mathbf{x} ,\mathbf{x} ') + C\), \(K(\mathbf{x} ,\mathbf{x} ')\) is the covariance matrix with components \(k(x_i,x_j)\), \(\mathbf{y}\) is the data vector, and C is the covariance matrix of the data for a set of n observations, assuming a zero mean function, \(\mu = 0\). After optimizing for \(\sigma _f\) and l, one can predict the mean and variance of the function \(f(\mathbf{x} ^{*})\) at chosen points \(\mathbf{x} ^{*}\) through

$$\begin{aligned} \langle f(\mathbf{x} ^{*}) \rangle&= K(\mathbf{x} ^{*},\mathbf{x} ) K_{\text {y}}^{-1}\mathbf{y } \nonumber \\ \text {cov}[f(\mathbf{x} ^{*})]&= K(\mathbf{x} ^{*},\mathbf{x} ^{*}) - K(\mathbf{x} ^{*}, \mathbf{x} ) K_{\text {y}}^{-1}K(\mathbf{x} ,\mathbf{x} ^{*}). \end{aligned}$$
(6)
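To make Eqs. (3)–(6) concrete, the following is a minimal, self-contained sketch of GP regression with the \(M_{9/2}\) kernel (used later in our analysis), written in Python with NumPy/SciPy. It is not the GaPP implementation, and the H(z) points in the usage example are illustrative only:

```python
import numpy as np
from scipy.optimize import minimize

def matern92(tau, sigma_f, l):
    """Matern nu = 9/2 kernel: the closed form of Eq. (4) for nu = 9/2."""
    r = tau / l
    poly = 1.0 + 3.0*r + (27.0/7.0)*r**2 + (18.0/7.0)*r**3 + (27.0/35.0)*r**4
    return sigma_f**2 * np.exp(-3.0*r) * poly

def neg_log_ml(theta, z, y, C):
    """Negative of the log marginal likelihood, Eq. (5)."""
    sigma_f, l = np.exp(theta)                 # log-parametrization keeps both positive
    tau = np.abs(z[:, None] - z[None, :])
    Ky = matern92(tau, sigma_f, l) + C         # K(x, x') + C
    Lc = np.linalg.cholesky(Ky)                # stable inverse and log-determinant
    alpha = np.linalg.solve(Lc.T, np.linalg.solve(Lc, y))
    return 0.5*y @ alpha + np.sum(np.log(np.diag(Lc))) + 0.5*len(y)*np.log(2.0*np.pi)

def gp_predict(z, y, C, zstar, sigma_f, l):
    """Mean and 1-sigma error of f at the points zstar, Eq. (6)."""
    Ky  = matern92(np.abs(z[:, None] - z[None, :]), sigma_f, l) + C
    Ks  = matern92(np.abs(zstar[:, None] - z[None, :]), sigma_f, l)
    Kss = matern92(np.abs(zstar[:, None] - zstar[None, :]), sigma_f, l)
    mean = Ks @ np.linalg.solve(Ky, y)
    cov  = Kss - Ks @ np.linalg.solve(Ky, Ks.T)
    return mean, np.sqrt(np.diag(cov))

# Usage with a few illustrative H(z) points (NOT the full CC compilation):
z   = np.array([0.07, 0.40, 0.90, 1.30, 1.75])
H   = np.array([69.0, 82.0, 105.0, 160.0, 202.0])
sig = np.array([19.6, 8.8, 12.0, 17.0, 40.0])
C   = np.diag(sig**2)

res = minimize(neg_log_ml, x0=np.log([100.0, 2.0]), args=(z, H, C), method='Nelder-Mead')
sigma_f, l = np.exp(res.x)
zstar = np.linspace(0.0, 2.0, 50)
Hmean, Herr = gp_predict(z, H, C, zstar, sigma_f, l)
print(f"H0 = {Hmean[0]:.1f} +/- {Herr[0]:.1f} km/s/Mpc")   # extrapolation to z = 0
```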

The GP predictions can also be extended to the derivatives of the function \(f(x_i)\), although limited by the differentiability of the chosen kernel. The derivative of a GP is also a GP. Thus, one can obtain the covariance between the function and/or the derivatives involved by differentiating the covariance function as

$$\begin{aligned} \text {cov} \left[ f(x_{i}),\dfrac{\partial f(x_{j})}{\partial x_{j}} \right]&= \dfrac{\partial k(x_{i},x_{j})}{\partial x_{j}} \nonumber \\ \text {cov} \left[ \dfrac{\partial f(x_{i})}{\partial x_{i}}, \dfrac{\partial f(x_{j})}{\partial x_{j}} \right]&= \dfrac{\partial ^{2} k(x_{i},x_{j})}{\partial x_{i}\partial x_{j}}. \end{aligned}$$
(7)

Then, we can write

$$\begin{aligned} f'(x_i)={\mathcal {G}}{\mathcal {P}} \left( \mu '(x_i),\text {cov} \left[ \dfrac{\partial f(x_{i})}{\partial x_{i}}, \dfrac{\partial f(x_{j})}{\partial x_{j}} \right] \right) , \end{aligned}$$
(8)

where \(f'(x_i)\) represents the derivative with respect to the corresponding independent variable, which for our purpose is the redshift z. This procedure can be extended similarly to higher derivatives (\(f'(x), f''(x), \ldots \)) in combination with f(x). The mean of the \(i^{th}\) derivative and the covariance between the \(i^{th}\) and \(j^{th}\) derivatives are given by

$$\begin{aligned} \langle f^{(i)}(\mathbf{x} ^{*}) \rangle&= K^{(i)}(\mathbf{x} ^{*}, \mathbf{x} ) K_{\text {y}}^{-1}\mathbf{y } \end{aligned}$$
(9)
$$\begin{aligned} \text {cov}[f^{(i)}(\mathbf{x} ^{*}),f^{(j)}(\mathbf{x} ^{*})]&= K^{(i,j)}(\mathbf{x} ^{*},\mathbf{x} ^{*})\nonumber \\&\quad - K^{(i)}(\mathbf{x} ^{*},\mathbf{x} ) K_{\text {y}}^{-1}K^{(j)}(\mathbf{x} ,\mathbf{x} ^{*}). \end{aligned}$$
(10)

If \(i = j\), Eq. (10) gives the variance of the \(i^{th}\) derivative. If data for the derivative functions are available, we can perform a joint analysis, which is the case in our study: one data set can be expressed in terms of f(x) while another can be rewritten in terms of \(f'(x)\), and these different data sets can then be combined, as sketched below.
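To illustrate Eqs. (7)–(10), the sketch below (ours, not part of GaPP) predicts the mean and variance of \(f'\) using the SE kernel, whose derivatives are algebraically simplest; the same construction applies to the Matérn kernels used in our analysis:

```python
import numpy as np

def k_se(x1, x2, sf, l):
    """Squared-Exponential kernel, Eq. (3)."""
    return sf**2 * np.exp(-(x1 - x2)**2 / (2.0*l**2))

def dk_dx1(x1, x2, sf, l):
    """cov[f'(x1), f(x2)] = dk/dx1 for the SE kernel, cf. Eq. (7)."""
    return -k_se(x1, x2, sf, l) * (x1 - x2) / l**2

def d2k_dx1dx2(x1, x2, sf, l):
    """cov[f'(x1), f'(x2)] = d^2k/dx1 dx2 for the SE kernel, cf. Eq. (7)."""
    return k_se(x1, x2, sf, l) * (1.0/l**2 - (x1 - x2)**2/l**4)

def gp_derivative(x, y, C, xstar, sf, l):
    """Mean and variance of f'(x*), Eqs. (9)-(10) with i = j = 1."""
    Ky  = k_se(x[:, None], x[None, :], sf, l) + C
    K1  = dk_dx1(xstar[:, None], x[None, :], sf, l)           # K^{(1)}(x*, x)
    K11 = d2k_dx1dx2(xstar[:, None], xstar[None, :], sf, l)   # K^{(1,1)}(x*, x*)
    mean = K1 @ np.linalg.solve(Ky, y)
    cov  = K11 - K1 @ np.linalg.solve(Ky, K1.T)
    return mean, np.diag(cov)
```

In what follows, we describe the data sets that we use in this work.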

Fig. 1

H(z) (in units of km s\(^{-1}\,\hbox {Mpc}^{-1}\)) vs z (E(z) vs z in the case of SN data alone) with \(1\sigma \) and \(2\sigma \) CL regions, reconstructed from SN (top-left), BAO (top-right), CC (bottom-left) and SN+BAO+CC data (bottom-right). The points with error bars in each panel are the observational data indicated in the corresponding legend

2.2 Data sets

We summarize below the data sets used in our analysis.

Cosmic chronometers (CC): The CC approach is a powerful method to trace the history of the cosmic expansion through measurements of H(z). We consider the compilation of Hubble parameter measurements provided by [78], which consists of 30 measurements distributed over the redshift range \(0< z < 2\).

Baryon acoustic oscillations (BAO): The BAO is another important cosmological probe, which traces the expanding spherical waves of baryonic perturbations, sourced by acoustic oscillations at recombination, through the large-scale structure correlation function, which displays a peak around \(150\, {\text {h}}^{-1} \,\mathrm{Mpc}\). We use BAO measurements from the Sloan Digital Sky Survey (SDSS) III DR12 at the three effective binned redshifts \(z = 0.38\), 0.51 and 0.61, reported in [3]; from the clustering of the SDSS-IV extended Baryon Oscillation Spectroscopic Survey DR14 quasar sample at the four effective binned redshifts \(z = 0.98\), 1.23, 1.52 and 1.94, reported in [79]; and the high-redshift Lyman-\(\alpha \) measurements at \(z = 2.33\) and \(z = 2.4\), reported in [80] and [81], respectively. Note that the observations are presented in terms of \(H(z) \times (r_d/r_{d,fid})\ \hbox {km s}^{-1}\,\hbox {Mpc}^{-1}\), where \(r_d\) is the co-moving sound horizon and \(r_{d,fid}\) is the fiducial input value provided in the above references. In Appendix A, we show that different \(r_d\) input values obtained from different data sets do not affect the GP analysis.
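For concreteness, undoing the quoted rescaling is a one-line operation; a minimal sketch (the function name and the numbers are ours, illustrative only):

```python
def H_from_bao(H_scaled, r_d, r_d_fid):
    """Recover H(z) from a BAO point quoted as H(z) x (r_d / r_d,fid):
    multiply by r_d,fid / r_d."""
    return H_scaled * r_d_fid / r_d

# e.g., a point quoted as 90.0 km/s/Mpc with r_d = 147.5 Mpc, r_d,fid = 147.78 Mpc
print(H_from_bao(90.0, 147.5, 147.78))   # illustrative numbers only
```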

Supernovae Type Ia (SN): SN have traditionally been one of the most important astrophysical tools in establishing the so-called standard cosmological model. For the present analysis, we use the Pantheon compilation, which consists of 1048 SNIa distributed in the redshift range \(0.01< z < 2.3\) [82]. Under the assumption of a spatially flat Universe, the full Pantheon sample can be compressed into six model-independent \(E(z)^{-1}\) data points [83]. We consider the six data points reported by [65] in the form of E(z), including the theoretical and statistical considerations made by those authors for their implementation.

H0LiCOW sample: The \(H_0\) Lenses in COSMOGRAIL's Wellspring (H0LiCOW) program has measured six lens systems, making use of the time-delay distances between multiple images of strong gravitational lens systems produced by elliptical galaxies [14]. In the analyses of this work, we implement these six systems of strongly lensed quasars reported by the H0LiCOW Collaboration. The full information is contained in the so-called time-delay distance \(D_{\Delta t}\). However, additional information resides in the angular diameter distance to the lens, \(D_l\), which offers the possibility of using four additional data points in our analysis. Thus, our total H0LiCOW sample comprises 10 data points: 6 measurements of time-delay distances and 4 angular diameter distances to the lens, for the 4 objects in the sample for which this information is available (see [84, 85] for the description).

Fig. 2

Left panel: \(\tilde{D}(z_l)\) vs \(z_l\) with \(1\sigma \) and \(2\sigma \) CL regions, reconstructed from the H0LiCOW sample plus the other data (SN+CC+BAO). Right panel: H(z) vs z with \(1\sigma \) and \(2\sigma \) CL regions, zoomed into the redshift range \(z<0.3\), reconstructed from the combined data SN + BAO + CC + H0LiCOW

3 Results and discussions

First, we verify that analyses carried out with \(k_{M_{\nu }}(\tau )\) for \(\nu = 9/2\) and \(\nu = 7/2\), and with \(k_{SE}\), do not generate significantly different results, in the sense that all results are compatible with each other at \(1\sigma \) CL, so no disagreement/tension arises between these input kernels. Thus, in what follows, we use the GP formalism with the \(M_{9/2}\) kernel in the whole analysis. For this purpose, we have used numerical routines available in the public GaPP code [56].

Figure 1 shows the reconstructions from the SN, BAO and CC data sets, applying the GP formalism to each data set individually. In the bottom-right panel, we show the H(z) reconstruction from all these data together. First, from the CC reconstruction, we obtain \(H_0=68.54 \pm 5.06\ \hbox {km s}^{-1}\,\hbox {Mpc}^{-1}\), which has been used in the rescaling of the SN data to carry out the joint SN+BAO+CC analysis (bottom-right). From the SN+BAO+CC analysis, we find \(H_0=67.85 \pm 1.53\ \hbox {km s}^{-1}\,\hbox {Mpc}^{-1}\). Figure 2 (left panel) shows the GP reconstruction of \(\tilde{D}(z_l)\) from the H0LiCOW data, where \(z_l\) is the lens redshift. In the right panel, we show the reconstruction of the H(z) function from SN+BAO+CC+H0LiCOW. In this joint analysis, we obtain \(H_0= 73.78 \pm 0.84\ \hbox {km s}^{-1}\,\hbox {Mpc}^{-1}\), which represents a \(1.1\%\) precision measurement. The strategy that we followed to obtain these results is as follows:

  1. The SN+BAO+CC data set is used as in the previous joint analysis, i.e., in terms of the H(z) data reconstruction. Thus, we just need to re-scale the H0LiCOW data in a convenient way to combine all the data for a joint analysis.

  2. The time-delay distance in the H0LiCOW sample is quantified as

    $$\begin{aligned} D_{\Delta t} = (1 + z_l) \frac{D_l D_s}{D_{ls}}, \end{aligned}$$
    (11)

    which is a combination of three angular diameter distances, namely \(D_l\), \(D_s\) and \(D_{ls}\), where the subscripts stand for the angular diameter distances to the lens l, to the source s, and between the lens and the source ls.

  3. At this point, we can get the dimensionless co-moving distance through the relationship

    $$\begin{aligned} \tilde{D}(z) = \frac{H_0}{c}(1+z)D_A, \end{aligned}$$
    (12)

    where \(D_A\) is the angular diameter distance and \(\tilde{D}(z)\) is defined as \(\tilde{D}(z) = \int ^{z}_0 \frac{dz'}{E(z')}\). In this way, we have 6 data points from the time-delay distance \(D_{\Delta t}\), which we refer to as \(\tilde{D}_{\Delta t}\), and 4 data points obtained from the angular diameter distance \(D_l\), denoted \(\tilde{D}_l\). Thus, we can add these 10 data points for the joint analysis, and simply call them the H0LiCOW sample (see left panel of Fig. 2). Note that, to get \(\tilde{D}_l\), we directly use Eq. (12) with \(D_A = D_l\). On the other hand, to obtain \(\tilde{D}_{\Delta t}\), we have to take into account that Eq. (11) depends on the expansion rate of the Universe through \(D_s(z_s,H_0,\Omega _m)\) and \(D_{ls}(z_l, H_0,\Omega _m)\); in this case, we use the \(H_0\) and \(\Omega _m\) best fits from our SN+BAO+CC joint analysis (a sketch of this conversion is given after this list).

  4. For the joint analysis, the relation \(\tilde{D}(z) =\int ^{z}_0 \frac{dz'}{E(z')}\) can be inverted to obtain \(E(z) =\frac{1}{\tilde{D}'(z)}\). We exploit this by reconstructing the first derivative of the dimensionless co-moving distance, \(\tilde{D}'(z)\): we introduce the SN+BAO+CC data set in the form of 1/E(z) and the H0LiCOW data set in the form of \(\tilde{D}(z)\), and obtain the GP reconstruction of the dimensionless co-moving distance.
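A minimal sketch of the step-3 conversion, assuming a flat fiducial \(\Lambda \)CDM background for \(D_s\) and \(D_{ls}\) (function names and the numbers in the example call are ours and purely illustrative; only \(H_0\) and \(\Omega _m\) correspond to our SN+BAO+CC best fit):

```python
import numpy as np
from scipy.integrate import quad

C_KMS = 299792.458   # speed of light in km/s

def comoving_distance(z, H0, Om):
    """Co-moving distance in flat LCDM, in Mpc."""
    integrand = lambda zp: 1.0 / np.sqrt(Om*(1.0 + zp)**3 + 1.0 - Om)
    return (C_KMS / H0) * quad(integrand, 0.0, z)[0]

def Dtilde_from_Ddt(zl, zs, Ddt, H0, Om):
    """Convert a time-delay distance into the dimensionless co-moving
    distance to the lens, via Eqs. (11) and (12).
    In flat space, D_ls = (Dc_s - Dc_l) / (1 + z_s)."""
    Dc_l, Dc_s = comoving_distance(zl, H0, Om), comoving_distance(zs, H0, Om)
    Ds  = Dc_s / (1.0 + zs)                 # angular diameter distance to the source
    Dls = (Dc_s - Dc_l) / (1.0 + zs)        # lens-source angular diameter distance
    Dl  = Ddt * Dls / ((1.0 + zl) * Ds)     # invert Eq. (11)
    return (H0 / C_KMS) * (1.0 + zl) * Dl   # Eq. (12) with D_A = D_l

# Illustrative call (NOT a published H0LiCOW measurement):
print(Dtilde_from_Ddt(zl=0.6, zs=1.8, Ddt=5000.0, H0=67.85, Om=0.29))
```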

From the joint analysis SN+BAO+CC+H0LiCOW, we find \(H(z=0)= 73.78 \pm 0.84\ \hbox {km s}^{-1}\,\hbox {Mpc}^{-1}\). Figure 2 (right panel) shows the H(z) reconstruction from SN+BAO+CC+H0LiCOW. Figure 3 shows a comparison of our joint-analysis estimate of \(H_0\) with others recently obtained in the literature. We note that our constraint on \(H_0\) is in accordance with the SH0ES and H0LiCOW+STRIDES estimates. On the other hand, we find \(\sim 6\sigma \) tension with the current Planck-CMB measurement and \(\sim 2\sigma \) tension with the CCHP best fit. We re-analyze our estimates after removing the BAO data (see Appendix A). In this case, we find \(H_0 = 68.57 \pm 1.86\ \hbox {km s}^{-1}\,\hbox {Mpc}^{-1}\) and \(H_0 = 71.65 \pm 1.09 \ \hbox {km s}^{-1}\,\hbox {Mpc}^{-1}\) from SN+CC and SN+CC+H0LiCOW, respectively.

Fig. 3

Compilation of \(H_0\) measurements from the recent literature, namely, from the Planck collaboration (Planck) [12], the Dark Energy Survey Year 1 results (DES+BAO+BBN) [118], the final data release of the BOSS data (BOSS Full-Shape+BAO+BBN) [117], the Carnegie-Chicago Hubble Program (CCHP) [16], the H0LiCOW collaboration (H0LiCOW+STRIDES) [14], and SH0ES [13], in comparison with the \(H_0\) constraint obtained in this work from the GP analysis using SN+BAO+CC+H0LiCOW

Fig. 4

\(O_m(z)\) vs z with \(1\sigma \) and \(2\sigma \) CL regions, reconstructed from SN+BAO+CC data (left panel) and SN+BAO+CC+H0LiCOW data (right panel)

In the context of the standard framework, we can also check the \(O_m(z)\) diagnostic [86]

$$\begin{aligned} O_m(z) = \frac{E^2(z) - 1}{(1 + z)^3 - 1}. \end{aligned}$$
(13)

If the expansion history E(z) is driven by the standard \(\Lambda \)CDM model, then \(O_m(z)\) is practically constant and equal to the matter density parameter \(\Omega _{m}\); any deviation from this constant can thus be used to infer the dynamical nature of DE. Figure 4 shows the reconstruction of the \(O_m(z)\) diagnostic. We find \(\Omega _m = 0.292 \pm 0.046\) and \(\Omega _m =0.289 \pm 0.012\) at \(1\sigma \) from the SN+BAO+CC and SN+BAO+CC+H0LiCOW analyses, respectively. To obtain these results, we normalize H(z) with respect to \(H_0\) to obtain E(z) for the entire data set except SN, where \(H_0\) is taken from the SN+BAO+CC and SN+BAO+CC+H0LiCOW cases, respectively. The prediction from SN+BAO+CC is compatible with \(\Omega _m = 0.30\) across the analyzed range, but it is interesting to note that for \(z > 2\), we have \(O_m(z) <0.30\) at \(\sim 2\sigma \) from SN+BAO+CC+H0LiCOW. These model-independent \(\Omega _m\) estimates will be used as input values in the reconstruction of w.
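A quick numerical check of Eq. (13); a sketch, with a synthetic \(\Lambda \)CDM input:

```python
import numpy as np

def Om_diagnostic(z, E):
    """O_m(z) of Eq. (13); constant and equal to Omega_m for LCDM."""
    return (E**2 - 1.0) / ((1.0 + z)**3 - 1.0)    # undefined at z = 0

# For a flat LCDM E(z) with Omega_m = 0.3 the output is constant:
z = np.linspace(0.1, 2.5, 5)
E = np.sqrt(0.3*(1.0 + z)**3 + 0.7)
print(Om_diagnostic(z, E))    # -> [0.3 0.3 0.3 0.3 0.3]
```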

The EoS of DE can be written as [87,88,89]

$$\begin{aligned} w(z) = \frac{2(1 + z)E(z)E'(z) - 3E^2(z) + \Omega _{k} (1 + z)^2}{3\left( E^2(z) -\Omega _{m} (1 + z)^3 - \Omega _{k} (1 + z)^2 \right) }, \end{aligned}$$
(14)

where \(\Omega _{m}\) and \(\Omega _k\) are the density parameters of matter (baryonic matter + dark matter) and spatial curvature, respectively. In what follows, we assume \(\Omega _k = 0\), which is a strong, though widely adopted, assumption about the spatial geometry.
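Given the GP reconstructions of \(E(z)\) and \(E'(z)\), Eq. (14) is evaluated pointwise. A sketch with a flat \(\Lambda \)CDM sanity check, which must return \(w = -1\) identically (all names ours):

```python
import numpy as np

def w_de(z, E, dE, Om, Ok=0.0):
    """DE equation of state from Eq. (14); E and dE are the
    reconstructed E(z) and its first derivative E'(z)."""
    num = 2.0*(1.0 + z)*E*dE - 3.0*E**2 + Ok*(1.0 + z)**2
    den = 3.0*(E**2 - Om*(1.0 + z)**3 - Ok*(1.0 + z)**2)
    return num / den

Om = 0.3
z  = np.linspace(0.0, 2.0, 5)
E  = np.sqrt(Om*(1.0 + z)**3 + 1.0 - Om)
dE = 1.5*Om*(1.0 + z)**2 / E          # analytic E'(z) for flat LCDM
print(w_de(z, E, dE, Om))             # -> [-1. -1. -1. -1. -1.]
```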

Fig. 5

The EoS w(z) vs z with \(1\sigma \) and \(2\sigma \) CL regions, reconstructed from SN+BAO+CC data (left panel) and SN+BAO+CC+H0LiCOW data (right panel)

Fig. 6

\(c_s^2(z)\) vs z with \(1\sigma \) and \(2\sigma \) CL regions, reconstructed from SN+BAO+CC data (left panel) and SN+BAO+CC+H0LiCOW data (right panel). The dashed line marks the division for dark energy clustering

Figure 5 shows the w(z) reconstruction from the SN+BAO+CC and SN+BAO+CC+H0LiCOW data combinations in the left and right panels, respectively. From both analyses, we notice that w is well constrained for \(z \lesssim 0.5\), consistent with the prediction \(w=-1\); most of the data, in both number and precision, lie in this range. The reconstruction shows no statistically significant evidence for \(w \ne -1\) over the whole range of z under consideration, and the best fit prediction lies on \(w= -1\) up to \(z \sim 0.5\) in both cases. The addition of the H0LiCOW data considerably improves the reconstruction of w for \(z < 1\). Beyond this range, the best fit prediction can deviate from \(w = -1\), but remains statistically compatible with a cosmological constant. Evaluated at the present moment, we find \(w(z=0)= -0.999 \pm 0.093\) and \(w(z=0)= -0.998 \pm 0.064\) from SN+BAO+CC and SN+BAO+CC+H0LiCOW, respectively. Note that the H0LiCOW sample improves the precision on \(w(z=0)\) by \(\sim 2.9\) percentage points (from \(\sim 9.3\%\) to \(\sim 6.4\%\)).

From the statistical reconstruction of w(z) and its derivative \(w'(z)\), we can analyze the DE adiabatic sound speed \(c^2_s\). Given the relation \(p=w\rho \), we can find

$$\begin{aligned} c^2_s (z) = \frac{\delta p}{\delta \rho } = w(z) + \frac{1 + z}{3} \frac{w'(z)}{1 + w(z)}. \end{aligned}$$
(15)
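A one-function sketch of Eq. (15); note the \(1+w\) denominator, which makes the propagated uncertainties blow up as \(w \rightarrow -1\):

```python
def cs2(z, w, dw):
    """Adiabatic sound speed of DE, Eq. (15); w and dw are the
    reconstructed w(z) and w'(z). Singular as w -> -1."""
    return w + (1.0 + z)/3.0 * dw / (1.0 + w)

print(cs2(z=0.0, w=-0.9, dw=0.3))   # -> 0.1 for these illustrative values
```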

Figure 6 shows the \(c^2_s\) reconstruction from the SN+BAO+CC and SN+BAO+CC+H0LiCOW data combinations in the left and right panels, respectively. We note that the DE sound speed is negative at \(\sim 1\sigma \) from SN+BAO+CC when evaluated up to \(z \simeq 2.5\). It is interesting to note that the SN+BAO+CC+H0LiCOW analysis yields \(c^2_s < 0\) at \(2\sigma \) for \(z < 1\). At the present moment, we find \(c^2_s(z=0) = -0.218 \pm 0.137\) and \(c^2_s(z=0) = -0.273 \pm 0.068\) at 1\(\sigma \) CL from SN+BAO+CC and SN+BAO+CC+H0LiCOW, respectively. Therefore, this inference on \(c^2_s\) significantly disfavors clustering DE models, as well as models with \(c^2_s > 0\) up to high z, at least at 1\(\sigma \) CL. The condition \(c^2_s > 0\) is usually imposed to avoid gradient instability. However, the perturbations can still remain stable under the \(c^2_s < 0\) condition [92,93,94,95]. Thus, if the effective sound speed is negative, this would be a smoking-gun signature for the existence of an anisotropic stress and possible modifications of gravity. Recently, possible evidence for \(c^2_s < 0\) was found in [46], also in a model-independent way from the Hubble data. Now, we look at some models which can potentially explain this result.

Fig. 7

X(z) vs z with \(1\sigma \) and \(2\sigma \) CL regions, reconstructed from SN+BAO+CC data (left panel) and SN+BAO+CC+H0LiCOW data (right panel). The dashed black curve corresponds to \(\Lambda \)CDM model prediction \(X = 1\)

The Lagrangian \(L = G_2(\phi , X) + \frac{M^2_{pl}}{2}R\) describes general K-essence scenarios. Here the function \(G_2\) depends on the scalar field \(\phi \) and its kinetic term \(X = -\frac{1}{2} \nabla ^{\mu } \phi \nabla _{\mu } \phi \) (not to be confused with the DE density ratio X(z) defined above), and R is the Ricci scalar curvature. In this case, the sound speed is given by

$$\begin{aligned} c^2_s = \frac{G_{2,X}}{G_{2,X} + \dot{\phi }^2 G_{2,XX}}, \end{aligned}$$
(16)

where \(G_{2,X} \equiv \partial G_2 /\partial X\). Quintessence models correspond to the particular choice \(G_2 = X - V(\phi )\), giving \(c^2_s = 1\). Thus, the usual quintessence scenarios are disfavored by our results, which predict negative or low values of the sound speed.

Considering the so-called dilatonic ghost condensate [96], given by the Lagrangian,

$$\begin{aligned} G_2 = -X + e^{\lambda \phi /M_{pl}} \frac{X^2}{M^4}, \end{aligned}$$
(17)

where \(\lambda \) and M are free parameters of the model, we can write \(c^2_s\) as

$$\begin{aligned} c^2_s = \frac{2y -1}{6y-1}, \end{aligned}$$
(18)

with \(y = \frac{\dot{\phi }^2 e^{\lambda \phi /M_{pl}}}{2M^4}\). Since \(y \ge 0\) by construction, negative sound speed values arise for \(1/6< y < 1/2\).
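This range follows from a one-line sign analysis of Eq. (18), recalling that \(y \ge 0\) by construction:

$$\begin{aligned} c^2_s = \frac{2y-1}{6y-1}< 0 \quad \Longleftrightarrow \quad (2y-1)(6y-1)< 0 \quad \Longleftrightarrow \quad \frac{1}{6}< y < \frac{1}{2}. \end{aligned}$$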

Another interesting possibility pertains to a unified dark energy and dark matter scenario described by \(G_2 =-b_0 + b_2(X-X_0)^2\), where \(b_0\) and \(b_2\) are free parameters of the model [97]. In this case, the sound speed is

$$\begin{aligned} c^2_s = \frac{X-X_0}{3X - X_0}, \end{aligned}$$
(19)

where \(c^2_s < 0\) for \(X_0/3< X < X_0\) (for \(X < X_0/3\), the numerator and denominator in Eq. (19) are both negative, so \(c^2_s > 0\)).

The cases mentioned above are theoretical examples within minimally coupled gravity that can reproduce the possible \(c^2_s < 0\) behavior. More generally, in the Horndeski theories of gravity [98,99,100], the speed of sound can be written as

$$\begin{aligned} \alpha c^2_s = \Big [ \Big (1- \frac{\alpha _B}{2} \Big ) (2\alpha _M + \alpha _B) + \frac{\alpha _B}{2} (\ln H^2)' + \alpha '_B \Big ], \end{aligned}$$
(20)

where the prime denotes \(d/d\ln a\), and the \(\alpha _i\) are functions expressed in a way that highlights their effects on the theory space [101], namely, the kineticity (\(\alpha _K\)), the braiding (\(\alpha _B\)) and the Planck-mass run rate (\(\alpha _M\)); further, we define \(\alpha = \alpha _K + \frac{3}{2} \alpha ^2_B\). Motivated by the tight constraints on the difference between the speed of gravitational waves and the speed of light, \(\lesssim 10^{-15}\) from the GW170817 and GRB 170817A observations [102, 103], we assume a vanishing tensor speed excess, \(\alpha _T = 0\). Without loss of generality, we can consider \(\alpha > 0\) and the relation \(\alpha _B = R \times \alpha _M\), with R a constant. For instance, \(R=-1\) reproduces f(R) gravity theories, and different values of R correspond to the most diverse possible modifications of gravity. For a qualitative example, taking \(R=-1\), the running of the Planck mass must satisfy the relationship

$$\begin{aligned} \frac{3}{2}\alpha ^2_M - a \frac{dH}{da}\frac{\alpha _M}{H} - a \frac{d \alpha _M}{da} \le 0, \end{aligned}$$
(21)

in order to generate \(c^2_s < 0\). At late cosmic times, we have \(\frac{dH}{da}\frac{1}{H} < 0\), and we can consider, to a good approximation, theories with \(|\alpha _M| \ll 1\). We thus see that the condition \(\alpha _M < 0\) can generate negative \(c^2_s\) values in this case.

Finally, we analyze the function

$$\begin{aligned} X(z) = \frac{\rho _\mathrm{de}}{\rho _\mathrm{de,0}} = \exp \left( 3 \int ^{z}_0 \frac{1 + w(z')}{1 + z'} dz' \right) , \end{aligned}$$
(22)

quantifying the evolution of the DE energy density over cosmic time, relative to its present value.
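In a flat universe containing only matter and DE, X(z) can equivalently be obtained directly from the reconstructed E(z); unlike the exponential form of Eq. (22), which is positive by construction for finite w, the direct expression can cross zero, as in Fig. 7. A sketch (names ours):

```python
import numpy as np

def X_de(z, E, Om):
    """DE density ratio in flat space: X = (E^2 - Om(1+z)^3)/(1 - Om).
    Unlike the exponential form of Eq. (22), this expression can go
    negative when E^2(z) drops below Om(1+z)^3 at high z."""
    return (E**2 - Om*(1.0 + z)**3) / (1.0 - Om)

# Flat LCDM sanity check (must return X = 1 at all z):
z = np.linspace(0.0, 2.0, 5)
E = np.sqrt(0.3*(1.0 + z)**3 + 0.7)
print(X_de(z, E, 0.3))    # -> [1. 1. 1. 1. 1.]
```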

Figure 7 shows the X(z) reconstruction from the SN+BAO+CC and SN+BAO+CC+H0LiCOW data combinations in the left and right panels, respectively. We note that the evolution of X is fully compatible with the \(\Lambda \)CDM model, with the best fit model-independent prediction around \(X= 1\) up to \(z \sim 1\), in both analyses. It is interesting to note that X can cross to negative values for \(z > 1\) and \(z > 1.5\) at \(2\sigma \) CL from SN+BAO+CC and SN+BAO+CC+H0LiCOW, respectively. This can have some interesting theoretical consequences. DE with negative density values at large z first came to the agenda when it turned out that, within the standard \(\Lambda \)CDM model, the Ly-\(\alpha \) forest BAO measurement by the BOSS collaboration [104] prefers a smaller value of the matter density parameter than the value preferred by the CMB data. In light of a possible preference for negative energy density values at high z, it has been argued that the Ly-\(\alpha \) data at \(z \sim 2.34\) can be described by a non-monotonic evolution of the H(z) function, which is difficult to achieve in any model with non-negative DE density [105]. Note that our analysis takes into account the high-z Lyman-\(\alpha \) measurements reported in [80] and [81]. It is possible to achieve \(X < 0\) at high z when the cosmological gravitational coupling strength gets weaker with increasing z [106, 107]. A range of other examples of effective sources crossing below zero energy density also exists, including theories in which the cosmological constant relaxes from a large initial value via an adjustment mechanism [108], as well as modified gravity theories [109,110,111]. More recently, a graduated DE model, characterized by a minimal dynamical deviation from the null inertial mass density, was introduced in [112] to obtain negative energy density at high z. Also, seeking inspiration from string theory, the possibility of negative energy density was investigated in [113].

The reconstructions of w(z) and X(z) are robust at low z, where the DE effects begin to be considerable, and a slow evolution of the EoS is well captured at 68% CL. However, the error estimates are larger at high z, where the data density is significantly smaller and the dynamical effects of DE are weaker. The introduction of the H0LiCOW data slightly improves the estimated errors in this range, especially for \(1.0<z<1.5\). On the other hand, the uncertainties of smooth functions may have a larger amplitude than those of highly oscillating functions, so that the errors propagated to their derivatives can be overestimated [114]. In our case, the starting functions vary quite smoothly with respect to the data, and so do their derivatives, leading to error propagation with a larger amplitude, as can be seen in Figs. 5 and 7 at high z. Other aspects that may influence this behavior are the strong dependence on z, as in the case of w(z), and the integrability of the functions with respect to z, as in the case of X(z) (for a brief discussion in this regard, see for example [56]).

Recently, the authors in [115] obtained the measurement \(H_0= 69.5 \pm 1.7\ \hbox {km s}^{-1}\,\hbox {Mpc}^{-1}\), showing that it is possible to constrain \(H_0\) with an accuracy of 2% under minimal assumptions, from a combination of independent geometric data sets, namely SN, BAO and CC. Unlike the present work, they did not use the H0LiCOW data in their analyses. They also reconstructed the DE density parameter X(z), reaching conclusions similar to those obtained here.

4 Final remarks

We have applied GP to constrain \(H_0\) and to reconstruct, in a model-independent way, some functions that describe physical properties of DE, using cosmological information from the SN, CC, BAO and H0LiCOW lenses data. The main results from the joint analysis, i.e., SN+CC+BAO+H0LiCOW, are summarized as follows:

  (i) A 1.1% precision measurement of \(H_0\) is obtained, with the best fit value \(H_0= 73.78 \pm 0.84\ \hbox {km s}^{-1}\,\hbox {Mpc}^{-1}\) at 1\(\sigma \) CL.

  (ii) The EoS of DE is measured with \(\sim 6.4\%\) precision at the present moment, with \(w(z=0)=-0.998 \pm 0.064\) at \(1\sigma \) CL.

  (iii) We find possible evidence for \(c^2_s < 0\) at \(\sim 2\sigma \) CL from the behavior of the reconstructed function at high z. At the present moment, we find \(c^2_s(z=0) = -0.273 \pm 0.068\) at \(1\sigma \) CL.

  (iv) We find that the ratio of the DE density evolution, \(\rho _\mathrm{de}/\rho _\mathrm{de,0}\), can cross to negative values at high z. This behavior has already been observed by other authors. Here, we re-confirm this possibility for \(z > 1.5\) at \(\sim 2\sigma \).

Certainly, the GP method, with its ability to perform joint analyses, has great potential in the search for accurate measurements of cosmological parameters, and for analyzing the physical properties of the dark sector of the Universe in a minimally model-dependent way. It can shed light on the dynamics of the dark components or even rule out possible theoretical cosmological scenarios. Beyond the scope of the present work, it will be interesting to analyze/reconstruct, through a robust joint analysis, a possible interaction in the dark sector, where DE and dark matter interact non-gravitationally, in a model-independent way. Such scenarios have been intensively investigated in the recent literature. We hope to communicate results in that direction in the near future.

Fig. 8

\(O_m(z)\) vs z with \(1\sigma \) and \(2\sigma \) CL regions, reconstructed from SN+CC (green) and SN+CC+BAO (blue) in the left panel, and SN+CC+H0LiCOW (green) and SN+CC+H0LiCOW+BAO (blue) in the right panel

Fig. 9

\(O_m(z)\) vs z with \(1\sigma \) and \(2\sigma \) CL regions for different values of \(r_d\) (in units of Mpc), reconstructed from SN+BAO+CC data (left panel) and SN+BAO+CC+H0LiCOW data (right panel)