# Implementable tensor methods in unconstrained convex optimization


## Abstract

In this paper we develop new tensor methods for unconstrained convex optimization, which solve at each iteration an auxiliary problem of minimizing a convex multivariate polynomial. We analyze the simplest scheme, based on minimization of a regularized local model of the objective function, and its accelerated version obtained in the framework of estimating sequences. Their rates of convergence are compared with the worst-case lower complexity bounds for the corresponding problem classes. Finally, for the third-order methods, we suggest an efficient technique for solving the auxiliary problem, which is based on the recently developed relative smoothness condition (Bauschke et al. in Math Oper Res 42:330–348, 2017; Lu et al. in SIOPT 28(1):333–354, 2018). With this elaboration, the third-order methods become implementable and very fast. The rate of convergence in terms of the function value for the accelerated third-order scheme reaches the level \(O\left( {1 \over k^4}\right) \), where *k* is the number of iterations. This is very close to the lower bound of the order \(O\left( {1 \over k^5}\right) \), which is also justified in this paper. At the same time, in many important cases the computational cost of one iteration of this method remains on the level typical for the second-order methods.

## Keywords

High-order methods · Tensor methods · Convex optimization · Worst-case complexity bounds · Lower complexity bounds

## Mathematics Subject Classification

90C25 · 90C06 · 65K05

## 1 Introduction

*Motivation* In the last decade, we have observed an increasing interest in the complexity analysis of high-order methods. Starting from the paper [31], which contained the first global rate of convergence of the Cubic Regularization of Newton's Method, it has become more and more common to provide second-order methods with worst-case complexity bounds on different problem classes (see, for example, [5, 11, 12]). New efficiency measurements in this field naturally generated a new spectrum of questions, ranging from the possibility of accelerating the second-order methods (see [27]) to lower complexity bounds (see [1, 2, 13, 18]) and attempts at constructing optimal methods [24].

Another possibility for accelerating minimization processes consists in increasing the power of the oracle. The idea of using high-order approximations in Optimization is not new. Initially, such approximations were employed in optimality conditions (see, for example, [22]). However, it seems that the majority of attempts to use high-order tensors in optimization methods failed because of the standard obstacle related to the enormous complexity of minimizing nonconvex multivariate polynomials. To the best of our knowledge, the only theoretical analysis of such schemes for convex problems can be found in an unpublished preprint [3], which is concluded by a pessimistic comment on the practical applicability of these methods. For nonconvex problems, several recent papers [8, 9, 10, 14] contain the complexity analysis of high-order methods designed for generating points with a small norm of the gradient. For the auxiliary nonconvex optimization problem, these methods need to guarantee a sufficient level of the first-order optimality condition and a local decrease of the objective function. However, for nonconvex functions even this moderate goal is difficult to achieve.

The key observation, which underlies all results of this paper, is that an appropriately regularized Taylor approximation of a convex function is a convex multivariate polynomial. This is indeed a very natural property, since this regularized approximation usually belongs to the epigraph of the convex function. Thus, the auxiliary optimization problem in the high-order (or *tensor*) methods becomes generally solvable by many powerful methods of Convex Optimization. This fact explains our interest in the complexity analysis of the simplest tensor scheme (Sect. 2), based on the *convex* regularized Taylor approximation, and in its accelerated version (Sect. 3). The latter method is obtained by the technique of estimating functions (see [25, 26, 27]). Therefore it is similar to Algorithm 4.2 in [3]. The main difference consists in the correct choice of parameters ensuring convexity of the auxiliary problem. We show that this algorithm converges with the rate \(O(\left( {1 \over k}\right) ^{p+1})\), where *k* is the number of iterations and *p* is the degree of the tensor.

In the next Sect. 4, we derive lower complexity bounds for the tensor methods. We show that the lower bound for the rate of convergence is of the order \(O\left( \left( {1 \over k} \right) ^{3p+1 \over 2}\right) \). This result is better than the bound in [1] and coincides with the bound in [2]. However, it seems that our justification is simpler.

For practical implementations, the most important results are included in Sect. 5, where we discuss an efficient scheme for minimizing the regularized Taylor approximation of degree three. This auxiliary convex problem can be treated in the framework of the *relative smoothness condition*. The first element of this approach was introduced in [4] for generalizing the Lipschitz condition on the norm of the gradient. In [21] it was shown that the same extension can be applied to the condition of strong convexity. This second step is important since it leads to linearly convergent methods for functions with nonstandard growth properties. The auxiliary problem with the third-order tensor is a good application of this technique. We show that the corresponding method converges linearly, with the rate depending on an absolute constant. At the end of the section, we argue that the complexity of one iteration of the resulting third-order scheme is often of the same order as that of the second-order methods.
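To make this discussion self-contained, we recall the relative smoothness framework of [4, 21] in generic notation (the symbols \(F\), \(\rho \), \(\mu \), \(L\) below are generic and are not those used in Sect. 5). For twice differentiable convex functions, *F* is \(L\)-smooth and \(\mu \)-strongly convex *relative* to a convex reference function \(\rho \) whenever

$$\begin{aligned} \mu \, \nabla ^2 \rho (x) \; \preceq \; \nabla ^2 F(x) \; \preceq \; L \, \nabla ^2 \rho (x) \quad \text{for all } x. \end{aligned}$$

In this case, the Bregman gradient step

$$\begin{aligned} x_{k+1} = \arg \min \limits _y \left\{ \langle \nabla F(x_k), y - x_k \rangle + L \, \beta _{\rho }(x_k, y) \right\} , \qquad \beta _{\rho }(x, y) \; {\mathop {=}\limits ^{{\mathrm {def}}}}\; \rho (y) - \rho (x) - \langle \nabla \rho (x), y - x \rangle , \end{aligned}$$

converges linearly, at a rate governed only by the condition number \(L / \mu \).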

In the final Sect. 6, we discuss the presented results and mention some open problems.

*Notations and generalities* In what follows, we denote by \(\mathbb {E}\) a finite-dimensional real vector space, and by \(\mathbb {E}^*\) its dual space composed of linear functions on \(\mathbb {E}\). For such a function \(s \in \mathbb {E}^*\), we denote by \(\langle s, x \rangle \) its value at \(x \in \mathbb {E}\). Using a self-adjoint positive-definite operator \(B: \mathbb {E}\rightarrow \mathbb {E}^*\) (notation \(B = B^* \succ 0\)), we can endow these spaces with *conjugate Euclidean norms*:

$$\begin{aligned} \Vert x \Vert = \langle B x, x \rangle ^{1/2}, \quad x \in \mathbb {E}, \qquad \Vert g \Vert _* = \langle g, B^{-1} g \rangle ^{1/2}, \quad g \in \mathbb {E}^*. \end{aligned}$$

For a sufficiently smooth function *f*, we denote by \(D^p f(x)[h_1, \dots , h_p]\) the *p*th directional derivative of *f* at *x* along directions \(h_i \in \mathbb {E}\), \(i = 1, \dots , p\). Note that \(D^p f(x)[ \cdot ]\) is a *symmetric p-linear form*. Its *norm* is defined in the standard way:

$$\begin{aligned} \Vert D^p f(x) \Vert = \max \limits _{h_1, \dots , h_p} \left\{ D^p f(x)[h_1, \dots , h_p]: \; \Vert h_i \Vert \le 1, \; i = 1, \dots , p \right\} . \end{aligned}$$

For \(p = 2\), this value coincides with the *spectral norm* of a self-adjoint linear operator (the maximal module of all eigenvalues computed with respect to the operator *B*). If all directions coincide, \(h_1 = \dots = h_p = h\), we use the shorter notation \(D^p f(x)[h]^p\). Since the form \(D^p f(x)[\cdot ]\) is *p*-linear and symmetric, we also have

$$\begin{aligned} \Vert D^p f(x) \Vert = \max \limits _{h} \left\{ \left| D^p f(x)[h]^p \right| : \; \Vert h \Vert \le 1 \right\} . \end{aligned}$$

Finally, let *f* be *p* times differentiable on \(\mathbb {E}\). Denote by \(L_p\) the uniform bound for the Lipschitz constant of its *p*th derivative:

$$\begin{aligned} \Vert D^p f(x) - D^p f(y) \Vert \le L_p \Vert x - y \Vert , \quad x, y \in \mathbb {E}. \end{aligned}$$

## 2 Convex tensor approximations

An important role in our constructions is played by the *power prox function* \(d_{p+1}(x) = {1 \over p+1} \Vert x \Vert ^{p+1}\).

### Theorem 1


### Proof

*x* and *y* from \({\mathrm{dom }}\,f\). Then for any direction \(h \in \mathbb {E}\) we have

*f* are bounded:

### Lemma 1

The first two inequalities of this lemma are already known (see, for example, [8]). We provide them with a simple proof for the reader's convenience.

### Proof

### Corollary 1

### Proof

*a*, *b*, and \(\alpha \). \(\square \)

### Theorem 2

### Proof

## 3 Accelerated tensor methods

In order to accelerate method (2.16), we apply a variant of the *estimating sequences technique*, which has become a standard tool for accelerating the usual Gradient and Second-Order Methods (see, for example, [25, 26, 27]). In our situation, this idea can be applied to tensor methods in the following way.

- Sequence of *estimating functions*
  $$\begin{aligned} \psi _k(x) = \ell _k(x) + {C \over p!} d_{p+1}(x-x_0), \quad k = 1, 2, \dots , \end{aligned}$$
  (3.1)
  where \(\ell _k(x)\) are linear functions in \(x \in \mathbb {E}\), and *C* is a positive parameter.
- Minimizing sequence \(\{ x_k \}_{k=1}^{\infty }\).
- Sequence of scaling parameters \(\{A_k \}_{k=1}^{\infty }\):
  $$\begin{aligned} A_{k+1} \; {\mathop {=}\limits ^{{\mathrm {def}}}}\; A_k + a_k, \quad k = 1,2, \dots . \end{aligned}$$

*f*, for any \(x \in \mathbb {E}\) we have

*C* and *M*, relation \(\mathcal{R}_{k+1}^1\) is also valid.

### Theorem 3

### Proof

## 4 Lower complexity bounds for tensor methods

*x* and \(h \in {\mathbb {R}}^n\), we have

*x*, *y*, and *h* from \({\mathbb {R}}^n\), by the Cauchy–Schwarz inequality we have

*k*, \(2 \le k \le n\), let us define the following \(k \times k\) upper triangular matrix with two nonzero diagonals:

### Assumption 1

Note that this assumption is satisfied for the vast majority of first-order, second-order, and tensor methods.

### Lemma 2

### Proof

*h* are as follows

*k*, \(1 \le k \le p\). Hence, since the regularization term in definition (4.6) is formed by the standard Euclidean norm, all stationary points of this function belong to \({\mathbb {R}}^n_{k+1}\).

Assume now that all \(x_i \in {\mathbb {R}}^n_k\), \(i = 1, \dots , k\), for some \(k \ge 1\). Then, as we have already seen, \(\mathcal{S}_{f_t}(x_i) \subseteq {\mathbb {R}}^n_{k+1}\). Hence, the inclusion (4.8) follows from Assumption 1. \(\square \)

Now we can prove the main statement of this section.

### Theorem 4

*p* satisfies Assumption 1. Assume that this method ensures for any function \(f \in \mathcal{F}_p\) with \(L_p(f) < + \infty \) the following rate of convergence:

### Proof

*i*, \(0 \le i \le t\). However,

## 5 Third-order methods: implementation details

Tensor optimization methods, presented in Sects. 2 and 3, are based on the solution of the auxiliary optimization problem (2.6). In the existing literature on tensor methods [6, 7, 20, 32], it was solved by the standard local technique of Nonconvex Optimization. However, we now know by Theorem 1 that this problem is convex. Hence, it is solvable by the standard and very efficient methods of Convex Optimization.

Since we need to solve this problem at each iteration of the methods, its complexity significantly affects the total computational time. Since the objective function in the problem (2.6) is a *convex multivariate polynomial*, there could exist special efficient algorithms for finding its solution. Unfortunately, at this moment the author has failed to find such methods in the literature. Therefore, we present in this section a special approach for solving the problem (2.6) with the third-degree Taylor approximation, which is based on the recently developed optimization framework of *relatively smooth functions* (see [4, 21]).


Recall that \(D^3f(x)[h_1,h_2,h_3]\) is a symmetric trilinear form. Hence, \(D^3 f(x) [h_1,h_2] \in \mathbb {E}^*\) is a symmetric bilinear vector function, and \(D^3f(x)[h]\) is a linear function of \(h \in \mathbb {E}\), whose values are self-adjoint linear operators from \(\mathbb {E}\) to \(\mathbb {E}^*\) (as Hessians).

### Lemma 3

### Proof

*h* by \(\tau h\) with \(\tau > 0\) and dividing the resulting inequality by \(\tau \), we get

*h* by \(-h\), which gives

### Lemma 4

*Bregman distance* of the function \(\rho _x(\cdot )\):

- 1.
**Computation of the gradient**\(\nabla \Omega _{x,M}(h)\). Note thatIn this formula, only the computation of the third derivative may be dangerous. However, this difficulty can be resolved using the technique of automatic differentiation (see, for example, [19]). Indeed, assume we have a sequence of operations for computing the function value$$\begin{aligned} \begin{array}{rcl} \nabla \Omega _{x,M}(h)= & {} \nabla f(x) + \nabla ^2 f(x) h + {1 \over 2}D^3f(x) [h]^2. \end{array} \end{aligned}$$*f*(*x*) with computational complexity*T*. Let us fix a direction \(h \in \mathbb {E}\). Then by forward differentiation, we can generate automatically a sequence of operations for computing the valuewith computational complexity$$\begin{aligned} \begin{array}{rcl} g_h(x)= & {} \langle \nabla ^2 f(x) h, h \rangle \end{array} \end{aligned}$$*O*(*T*). Now, by backward differentiation in*x*, we can compute the gradient of this function:with computational complexity$$\begin{aligned} \begin{array}{rcl} \nabla g(x)= & {} D^3f(x)[h,h] \end{array} \end{aligned}$$*O*(*T*). Thus, the oracle complexity of method (5.8) is proportional to the complexity of computing the function value*f*(*x*). Another example of simple computation of the third derivative is provided by a separable objective function. Assume that \(\mathbb {E}= {\mathbb {R}}^n\) andwhere \(a_i \in {\mathbb {R}}^n\) and univariate functions \(f_i(\cdot )\) are three times continuously differentiable, \(i = 1, \dots , N\). Then vector \(D^3 f[h]^2\) has the following representation:$$\begin{aligned} \begin{array}{rcl} f(x)= & {} \sum \limits _{i=1}^N f_i(b_i - \langle a_i, x \rangle ), \end{array} \end{aligned}$$Thus, for solving the problem (5.4), we need to compute in advance all values$$\begin{aligned} \begin{array}{rcl} D^3 f(x)[h]^2= & {} - \sum \limits _{i=1}^N a_i \, f'''_i(b_i - \langle a_i, x \rangle ) \langle a_i, h \rangle ^2. 
\end{array} \end{aligned}$$(this needs$$\begin{aligned} \begin{array}{c} f'''_i(b_i - \langle a_i, x \rangle ), \quad i = 1, \dots , N \end{array} \end{aligned}$$*O*(*nN*) operations). After that, each computation of vector \(D^3 f(x)[h]^2 \in {\mathbb {R}}^n\) also needs*O*(*nN*) operations. This computation will be cheaper for the sparse data. - 2.
**Solution of the auxiliary problem**At all iterations of method (5.8), we need to solve an auxiliary problem in the following form:where \(A \succeq 0\) and \(\gamma > 0\). Note that at all these iterations only the vector$$\begin{aligned} \begin{array}{rcl} \min \limits _{h \in \mathbb {E}} \left\{ \langle c, h \rangle + {1 \over 2}\langle A h, h \rangle + {\gamma \over 4} \Vert h \Vert ^4 \right\} , \end{array} \end{aligned}$$(5.10)*c*and coefficients \(\gamma \) are changing, and matrix \(A = \nabla ^2 f(x)\) remains the same. Therefore, before the algorithm (5.8) starts working, it is reasonable to transform this matrix in a tri-diagonal form:where \(U \in {\mathbb {R}}^{n \times n}\) is an orthogonal matrix: \(UU^T = I\), and \(T \in {\mathbb {R}}^{n \times n}\) is tri-diagonal.$$\begin{aligned} \begin{array}{rcl} A= & {} U T U^T, \end{array} \end{aligned}$$
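The separable-objective formula for \(D^3 f(x)[h]^2\) can be sketched and checked numerically. In the following illustration, the data \(a_i\), \(b_i\) and the quartic test functions \(f_i(t) = t^4\) are our own assumptions, not taken from the text; the cross-check uses the fact that \(D^3 f(x)[h]^2\) is the second derivative of \(\nabla f(x + sh)\) in *s* at \(s = 0\).

```python
import numpy as np

# Separable objective f(x) = sum_i f_i(b_i - <a_i, x>) with the
# illustrative choice f_i(t) = t^4, so f_i'(t) = 4 t^3, f_i'''(t) = 24 t.
def grad_f(A, b, x):
    t = b - A @ x                          # residuals b_i - <a_i, x>
    return -A.T @ (4 * t**3)               # chain rule gives the minus sign

def d3f_h2(A, b, x, h):
    # D^3 f(x)[h]^2 = - sum_i a_i f_i'''(b_i - <a_i,x>) <a_i,h>^2
    t = b - A @ x
    return -A.T @ (24 * t * (A @ h)**2)    # O(nN) operations, as in the text

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))            # rows are the vectors a_i
b = rng.standard_normal(5)
x = rng.standard_normal(3)
h = rng.standard_normal(3)

# Cross-check: D^3 f(x)[h]^2 = d^2/ds^2 grad f(x + s h) at s = 0.
# For quartic f_i the gradient is cubic in s, so the second central
# difference is exact up to rounding errors.
eps = 1e-3
fd = (grad_f(A, b, x + eps * h) - 2 * grad_f(A, b, x)
      + grad_f(A, b, x - eps * h)) / eps**2
assert np.allclose(fd, d3f_h2(A, b, x, h), atol=1e-5)
```

Note that both routines cost one multiplication by \(A\) and one by \(A^T\), i.e. *O*(*nN*) operations, in agreement with the complexity estimate above.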

*n*. Moreover, since its objective function is strongly convex and infinitely differentiable, all reasonable one-dimensional methods have a global linear rate of convergence and quadratic convergence in the end.

- Computation of the Hessian \(\nabla ^2 f(x)\) and its tri-diagonal factorization: \(O(nT_f + n^3)\) operations.
- We need \(O(\ln {1 \over \epsilon })\) iterations of method (5.8) in order to get an \(\epsilon \)-solution of the auxiliary problem. At each iteration of this method we need to:
  - compute the gradient \(\nabla \Omega _{x,M}(h_k)\): \(O(T_f)\) operations;
  - compute the vector \({\tilde{c}}\) for the univariate problem in (5.11): \(O(n^2)\) operations;
  - solve the dual problem in (5.11) up to accuracy \(\delta \): \(O(n \ln {1 \over \delta })\) operations;
  - compute an approximate solution \(h = - U(\gamma \tau I +T)^{-1} {\tilde{c}}\) of the problem (5.10), using an approximate solution \(\tau \) of the dual problem: \(O(n^2)\) operations.
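The dual approach to problem (5.10) can be sketched as follows. The first-order condition \(c + Ah + \gamma \Vert h \Vert ^2 h = 0\) gives \(h(\tau ) = -(A + \gamma \tau I)^{-1} c\) with \(\tau = \Vert h(\tau ) \Vert ^2\), a univariate equation with a decreasing left-hand side. The sketch below is our own illustration: for brevity it uses a full eigendecomposition of \(A\) in place of the tri-diagonal factorization, and plain bisection for the univariate equation.

```python
import numpy as np

# Sketch: min_h <c,h> + 1/2 <Ah,h> + (gamma/4) ||h||^4  with A PSD.
# Stationarity: c + A h + gamma ||h||^2 h = 0, so
# h(tau) = -(A + gamma*tau*I)^{-1} c with tau solving tau = ||h(tau)||^2.
def solve_aux(A, c, gamma):
    lam, Q = np.linalg.eigh(A)             # A = Q diag(lam) Q^T
    ct = Q.T @ c                           # analogue of c-tilde in the text
    phi = lambda tau: np.sum(ct**2 / (lam + gamma * tau)**2)  # ||h(tau)||^2
    lo, hi = 0.0, 1.0
    while phi(hi) > hi:                    # bracket the root: phi decreases,
        hi *= 2                            # identity increases
    for _ in range(200):                   # bisection on phi(tau) = tau
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) > mid else (lo, mid)
    tau = 0.5 * (lo + hi)
    return -Q @ (ct / (lam + gamma * tau))

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = M @ M.T                                # random symmetric PSD matrix
c = rng.standard_normal(4)
gamma = 0.5
h = solve_aux(A, c, gamma)
# verify stationarity of the quartic model at the computed point
assert np.linalg.norm(c + A @ h + gamma * np.dot(h, h) * h) < 1e-8
```

With the tri-diagonal factorization of the text, each evaluation of \(\varphi (\tau )\) costs *O*(*n*) instead of the \(O(n^2)\) hidden in the eigendecomposition-based version above.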

For readers who are not interested in all these computational details, we just mention that the GALAHAD Optimization Library [16] has special subroutines for solving auxiliary problems of the form (5.10).

## 6 Discussion

In this paper, we have made an important step towards the practical implementation of tensor methods in unconstrained convex optimization. We have shown that the auxiliary optimization problems in these schemes can be reduced to the minimization of a convex multivariate polynomial. In the important case of the third-order tensor, we have proved that this problem can be efficiently solved by a special optimization scheme derived from the relative smoothness condition.

Our results highlight several interesting questions. One of the direct consequences of our approach is a systematic way of generating convex multivariate polynomials. Is it possible to minimize them by some tools of Algebraic Geometry (see [23] for related techniques like sums of squares, etc.), or do we need to treat them with an appropriate technique from Convex Optimization? The results of Sect. 5 demonstrate a probably unbeatable superiority of the optimization technique for third-order polynomials. But what happens with polynomials of higher degree?

One of the difficult unsolved problems in our approach is related to dynamic adjustment of the Lipschitz constant for the highest derivative. This dynamic estimate should not be much bigger than the actual Lipschitz constant. On the other hand, it must ensure convexity of the auxiliary problem solved at each iteration of the tensor methods. This question is clearly crucial for the practical efficiency of the high-order schemes.

Simple comparison of the complexity bounds in Sects. 3 and 4 shows that we failed to develop an optimal tensor scheme. The missing factor in the complexity estimates is of the order of \(O\left( \left( {1 \over \epsilon }\right) ^{{1 \over p+1} - {2 \over 3p + 1}}\right) = O\left( \left( {1 \over \epsilon }\right) ^{p-1 \over (p+1)(3p+1)}\right) \). For \(p = 3\), this factor is of the order of \(O\left( \left( {1 \over \epsilon }\right) ^{{1 \over 20}}\right) \). This means that from the viewpoint of practical efficiency, the cost of one iteration of the hypothetical optimal scheme must be of the same order as that of the accelerated tensor method (3.12). Any additional logarithmic factors in the complexity bound of this “optimal” method will definitely kill its tiny superiority in the convergence rate.
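The simplification of the exponent used above is elementary:

$$\begin{aligned} {1 \over p+1} - {2 \over 3p+1} \; = \; {(3p+1) - 2(p+1) \over (p+1)(3p+1)} \; = \; {p-1 \over (p+1)(3p+1)}, \end{aligned}$$

so for \(p = 3\) the exponent equals \({2 \over 4 \cdot 10} = {1 \over 20}\).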

In recent years, we have seen an increasing interest in universal methods, which can adjust to the best possible Hölder condition, instead of the Lipschitz one, during the running optimization process (see [14, 17, 29]). Of course, it is very interesting to extend this philosophy to the tensor minimization schemes. Another important extension could be the treatment of constraints, either in functional form or in the framework of composite minimization [28]. The main difficulty here is related to the complexity of the auxiliary optimization problems.

One of the main restrictions for the practical implementation of our results is the necessity to know the Lipschitz constant of the corresponding derivative. If our estimate is too small, then the auxiliary problem (2.6) may lose convexity. Consequently, we will lose the fast convergence in the auxiliary process (5.8). However, this observation gives us a clue for tuning this constant: if we see that this process is too slow, our estimate is too small. But of course it would be very interesting to find a recipe with a better theoretical justification.

## Acknowledgements

The author is very thankful to Geovani Grapiglia for interesting discussions of the results. The comments of two anonymous referees were extremely useful.

## References

- 1. Agarwal, N., Hazan, E.: Lower bounds for higher-order convex optimization (2017). arXiv:1710.10329
- 2. Arjevani, Y., Shamir, O., Shiff, R.: Oracle complexity of second-order methods for smooth convex optimization (2017). arXiv:1705.07260
- 3. Baes, M.: Estimate sequence methods: extensions and approximations. Optimization Online (2009)
- 4. Bauschke, H.H., Bolte, J., Teboulle, M.: A descent lemma beyond Lipschitz gradient continuity: first-order methods revisited and applications. Math. Oper. Res. **42**, 330–348 (2017)
- 5. Bian, W., Chen, X., Ye, Y.: Complexity analysis of interior-point algorithms for non-Lipschitz and non-convex minimization. Math. Program. **139**, 301–327 (2015)
- 6. Birgin, E.G., Gardenghi, J.L., Martínez, J.M., Santos, S.A.: Remark on Algorithm 566: modern Fortran routines for testing unconstrained optimization software with derivatives up to third-order. Technical report, Department of Computer Science, University of São Paulo, Brazil (2018)
- 7. Birgin, E.G., Gardenghi, J.L., Martínez, J.M., Santos, S.A.: On the use of third-order models with fourth-order regularization for unconstrained optimization. Technical report, Department of Computer Science, University of São Paulo, Brazil (2018)
- 8. Birgin, E.G., Gardenghi, J.L., Martínez, J.M., Santos, S.A., Toint, Ph.L.: Worst-case evaluation complexity for unconstrained nonlinear optimization using high-order regularization models. Math. Program. **163**, 359–368 (2017)
- 9. Carmon, Y., Duchi, J.C., Hinder, O., Sidford, A.: Lower bounds for finding stationary points I (2017). arXiv:1710.11606
- 10. Carmon, Y., Duchi, J.C., Hinder, O., Sidford, A.: Lower bounds for finding stationary points II (2017). arXiv:1711.00841
- 11. Cartis, C., Gould, N.I.M., Toint, Ph.L.: Adaptive cubic overestimation methods for unconstrained optimization. Part I: motivation, convergence and numerical results. Math. Program. **130**(2), 295–319 (2012)
- 12. Cartis, C., Gould, N.I.M., Toint, Ph.L.: Adaptive cubic overestimation methods for unconstrained optimization. Part II: worst-case function evaluation complexity. Math. Program. **127**(2), 245–295 (2011)
- 13. Cartis, C., Gould, N.I.M., Toint, Ph.L.: Evaluation complexity of adaptive cubic regularization methods for convex unconstrained optimization. Optim. Methods Softw. **27**(2), 197–219 (2012)
- 14. Cartis, C., Gould, N.I.M., Toint, Ph.L.: Universal regularization methods: varying the power, the smoothness and the accuracy. SIAM J. Optim. **29**(1), 595–615 (2019)
- 15. Conn, A.R., Gould, N.I.M., Toint, Ph.L.: Trust Region Methods. MOS-SIAM Series on Optimization. SIAM, Philadelphia (2000)
- 16. Gould, N.I.M., Orban, D., Toint, Ph.L.: GALAHAD, a library of thread-safe Fortran 90 packages for large-scale nonlinear optimization. ACM Trans. Math. Softw. **29**(4), 353–372 (2003)
- 17. Grapiglia, G.N., Nesterov, Yu.: Regularized Newton methods for minimizing functions with Hölder continuous Hessians. SIAM J. Optim. **27**(1), 478–506 (2017)
- 18. Grapiglia, G.N., Yuan, J., Yuan, Y.: On the convergence and worst-case complexity of trust-region and regularization methods for unconstrained optimization. Math. Program. **152**, 491–520 (2015)
- 19. Griewank, A., Walther, A.: Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation. Applied Mathematics, vol. 105, 2nd edn. SIAM, Philadelphia (2008)
- 20. Gundersen, G., Steihaug, T.: On large-scale unconstrained optimization problems and higher order methods. Optim. Methods Softw. **25**(3), 337–358 (2010)
- 21. Lu, H., Freund, R., Nesterov, Yu.: Relatively smooth convex optimization by first-order methods, and applications. SIAM J. Optim. **28**(1), 333–354 (2018)
- 22. Hoffmann, K.H., Kornstaedt, H.J.: Higher-order necessary conditions in abstract mathematical programming. JOTA **26**, 533–568 (1978)
- 23. Lasserre, J.B.: Moments, Positive Polynomials and Their Applications. Imperial College Press, London (2010)
- 24. Monteiro, R.D.C., Svaiter, B.F.: An accelerated hybrid proximal extragradient method for convex optimization and its implications to second-order methods. SIAM J. Optim. **23**(2), 1092–1125 (2013)
- 25. Nesterov, Yu.: Introductory Lectures on Convex Optimization. Kluwer, Boston (2004)
- 26. Nesterov, Yu.: Smooth minimization of non-smooth functions. Math. Program. **103**(1), 127–152 (2005)
- 27. Nesterov, Yu.: Accelerating the cubic regularization of Newton's method on convex problems. Math. Program. **112**(1), 159–181 (2008)
- 28. Nesterov, Yu.: Gradient methods for minimizing composite functions. Math. Program. **140**(1), 125–161 (2013)
- 29. Nesterov, Yu.: Universal gradient methods for convex optimization problems. Math. Program. **152**, 381–404 (2015)
- 30. Nesterov, Yu., Nemirovskii, A.: Interior Point Polynomial Methods in Convex Programming: Theory and Applications. SIAM, Philadelphia (1994)
- 31. Nesterov, Yu., Polyak, B.: Cubic regularization of Newton's method and its global performance. Math. Program. **108**(1), 177–205 (2006)
- 32. Schnabel, R.B., Chow, T.T.: Tensor methods for unconstrained optimization using second derivatives. SIAM J. Optim. **1**(3), 293–315 (1991)

## Copyright information

**Open Access**This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.