1 Introduction

Statistical computations for insurance policies, inventory management, Bayesian point estimation, and other areas often involve the computation of partial moments (Winkler et al. 1972). Bawa and Lindenberg (1977) derive a capital asset pricing model from utility functions based on lower partial moments. The i-th upper (lower) partial moment of a univariate real-valued random variable X with cumulative distribution function F is defined as the i-th moment in excess of (below) a certain threshold \(a\in {\mathbb {R}}\),

$$\begin{aligned} \mu _i^+(a)&:= \int _{a}^\infty \! (x-a)^i \, \mathrm{d} F(x)&\mu _i^-(a)&:= \int _{-\infty }^a \! (a-x)^i \, \mathrm{d} F(x). \end{aligned}$$
(1)

This paper presents an algorithm that bounds the partial moments of X using information contained in a sequence of (full) moments of X. This approach is strongly related to the moment problem.

The moment problem originally considered whether a certain sequence of moments corresponds to at least one univariate probability measure. This problem has been extensively discussed in the mathematical literature (Akhiezer 1965; Kreĭn and Nudelman 1977; Shohat and Tamarkin 1943; Stoyanov 2013). The admissible support of the probability measure splits the moment problem into three subproblems: the admissible support can be unrestricted (Hamburger moment problem), restricted to the positive half-line (Stieltjes), or restricted to a bounded interval (Hausdorff). The algorithm presented in this paper fits into the Hamburger moment problem as it allows for X with support at both tails.

In case a sequence of moments is known to correspond to at least one probability measure, an important follow-up question involves what information on the distribution is contained in this sequence. Research on this question has focused on several characteristics:

(i) class of probability measure (Berg 1995; Gut 2002; Lin 1997; Pakes et al. 2001; Stoyanov 2000);
(ii) tail of the distribution (Goria and Tagliani 2003; Lindsay and Basak 2000);
(iii) mode (Gavriliadis 2008);
(iv) cumulative distribution function (Mnatsakanov 2008b; Mnatsakanov and Hakobyan 2009);
(v) density function (Gavriliadis and Athanassoulis 2009; Mnatsakanov 2008a);
(vi) Shannon entropy (Milev et al. 2012).

The above-mentioned literature can differ in the admissible support of the probability measure.

By using a sequence of moments, this paper adds to the literature on the moment problem by presenting an iterative algorithm that bounds the partial moments (1) at \(a=0\). More specifically, the algorithm bounds the partial moments from the positive part, \(\max (X,0)\), and the negative part, \(\max (-X,0)\), of a univariate real-valued random variable X by using information contained in known (ratios of) subsequent finite moments of X. If one is interested in the partial moments at \(a\ne 0\), the moments of \(X_a:=X-a\) can be obtained from the sequence of moments of X, after which the moments of \(\max (X_a,0)\) and \(\max (-X_a,0)\) can be bounded.
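
For illustration, a minimal Python sketch (our own addition; the function name is hypothetical) of this shift: given the ordinary moments \(\mu _0=1,\mu _1,\ldots ,\mu _n\) of X, the moments of \(X_a=X-a\) follow from the binomial theorem.

```python
from math import comb

def shifted_moments(mu, a):
    """Moments of X_a = X - a from the moments mu = [mu_0, mu_1, ..., mu_n] of X,
    via E[(X - a)^i] = sum_k C(i, k) (-a)^(i - k) E[X^k]."""
    n = len(mu) - 1
    return [sum(comb(i, k) * (-a) ** (i - k) * mu[k] for k in range(i + 1))
            for i in range(n + 1)]

# Example: moments of a standard normal X shifted by a = 1.
print(shifted_moments([1, 0, 1, 0, 3, 0, 15], 1.0))
```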

The bounds on the partial moments imply bounds on unobserved higher order moments of X. Using a Taylor expansion, one can obtain bounds on the moments of the transformation f(X) for a certain function \(f:{\mathbb {R}}\rightarrow {\mathbb {R}}\).

Where applicable, procedures from related literature can be added to our algorithm. If the random variable X belongs to the class of Pearson distributions, a recursive relationship for the partial moments is available in Winkler et al. (1972). In case X has a known finite support, the information on the support implies additional bounds (Barnett et al. 2002) and suitable numerical optimization routines can be invoked (Dokov and Morton 2005; Frauendorfer 1988; Kall 1991). Because the sequence of known moments of X typically consists of more than six moments, an unrestricted support can hamper a numerical optimization routine that imposes the moment sequence as constraints.

The paper proceeds as follows. Preliminaries are in Sect. 2. Section 3 presents the iterative algorithm that bounds partial moments, and how these bounds imply bounds on other expressions. In Sect. 4, the algorithm is demonstrated for two example distributions. Section 5 is reserved for concluding remarks and points of discussion.

2 Preliminaries

Let \(\mathbb {R}^+_0 := [0,\infty )\) denote the set of nonnegative real numbers in the set \(\mathbb {R}\) of real numbers. The sets \(\mathbb {N}^+_0 = \{0,1,\ldots \}\) and \(\mathbb {N}^+=\{1,2,\ldots \}\) denote sets of natural numbers. Let \(X:\varOmega \rightarrow \mathbb {R}\) be a univariate real-valued random variable defined on the probability space \((\varOmega ,\mathcal {F},\mathbb {P})\) with unknown cumulative distribution function (cdf) \(F:\mathbb {R}\rightarrow [0,1]\).

The known moments \(\mu _1,\ldots ,\mu _n\) of X are finite. Assume \(\mu _2>0\) to avoid trivial outcomes where all moments are zero. Denote the positive and negative part of X by \(X^{+}:=\max (X,0)\) and \(X^{-}:=\max (-X, 0)\), respectively. Let \(\mu _i^+\) and \(\mu _i^-\) represent the i-th moment of \(X^+\) and \(X^-\), respectively (\(i\in \mathbb {N}^+\)). Thus, \(\mu _i^+:=\mu _i^+(0)\) and \(\mu _i^-:=\mu _i^-(0)\) denote the partial moments of X at \(a=0\) in (1). Set \(\mu _0^+ := {{\mathbb P}\!\left( X > 0 \right) }\), \(\mu _0^- := {{\mathbb P}\!\left( X < 0 \right) }\), and \(\mu _0 := {{\mathbb P}\!\left( X\ne 0 \right) }\). The latter definition is a convention that enables tighter bounds.

A random variable X is quasi-degenerate if there exists \(x\in {\mathbb {R}}\) with \({{\mathbb P}\!\left( X\in \{0,x\} \right) }=1\). A nonquasi-degenerate (nqd) random variable X is a random variable that is not quasi-degenerate, thus \({{\mathbb P}\!\left( X\in \{0,x\} \right) }<1\) for all \(x\in {\mathbb {R}}\). A degenerate X is necessarily quasi-degenerate, while a nqd X is necessarily non-degenerate.

Define the i-th moment ratio of X as \(m_i := \mu _i/\mu _{i-1}\) with \(m_i=\infty \) if \(\mu _{i-1}=0\) (\(i\in \mathbb {N}^+\)). Let \(m_i^+\) and \(m_i^-\) refer to the i-th moment ratio of the positive part \(X^+\) and the negative part \(X^-\), respectively. Both ratios can be referred to as a partial moment ratio of X. Lower bounds and upper bounds are denoted by underlines and bars, respectively. For instance, \(\underline{\mu }_i^+\) is a lower bound on \(\mu _i^+\) and \(\bar{m}_i^-\) is an upper bound on \(m_i^- := \mu _i^- / \mu _{i-1}^-\). The negation of \(s\in \{+,-\}\) is denoted by \(-s\). The operation \(s\, x\) denotes \(x\) if \(s=+\) and \(-x\) if \(s=-\). The operation \(s/x\) means \(1/x\) if \(s=+\) and \(-1/x\) if \(s=-\). The equivalence \(X \equiv Y\) means that \(X=Y\) holds almost surely: \({{\mathbb P}\!\left( X=Y \right) } = 1\).

3 The algorithm

This section outlines the algorithm in a number of lemmas and a theorem. By using some simple constraints, the first lemma provides bounds on partial moments and partial moment ratios. This enables an initialization of the bounds.

Lemma 1

(Initialization I) For \(s\in \{+,-\}\),

$$\begin{aligned} 0 \le \mu _{2n}^s&\le \mu _{2n}&&&n&\in \mathbb {N}^+_0 \end{aligned}$$
(2)
$$\begin{aligned} \mu _{2n-1}^s&\ge s \, \mu _{2n-1}&m_{2n-1}^s&\ge s \, m_{2n-1}&\frac{1}{m_{2n}^s}&\ge \frac{s}{m_{2n}}&n&\in \mathbb {N}^+. \end{aligned}$$
(3)
  1. (i)

    The nonnegativity constraint in (2) holds with equality if and only if \(X^s\equiv 0\).

    The constraint \(\mu _{2n}^s\le \mu _{2n}\) in (2) holds with equality if and only if \(X^{-s}\equiv 0\).

  2. (ii)

    Each of the constraints in (3) holds with equality if and only if \(X^{-s}\equiv 0\).
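
A minimal sketch (ours, not part of the original paper) of the moment part of this initialization; the ratio bounds in (3) are handled analogously, and mu[0] follows the convention \(\mu _0:={\mathbb P}(X\ne 0)\). The resulting arrays can later be tightened by the iterative algorithm of this section.

```python
import numpy as np

def lemma1_init(mu):
    """Initial bounds on mu_i^s from Lemma 1 (moment part only; the ratio bounds
    in (3) are handled analogously).  mu = [mu_0, mu_1, ..., mu_n], mu_0 = P(X != 0)."""
    n = len(mu) - 1
    mu_lo, mu_hi = {}, {}
    for s, sign in (('+', 1.0), ('-', -1.0)):
        lo, hi = np.zeros(n + 1), np.full(n + 1, np.inf)
        for i in range(n + 1):
            if i % 2 == 0:
                hi[i] = mu[i]                   # (2): 0 <= mu_{2k}^s <= mu_{2k}
            else:
                lo[i] = max(sign * mu[i], 0.0)  # (3): mu_{2k-1}^s >= s mu_{2k-1} (and >= 0)
        mu_lo[s], mu_hi[s] = lo, hi
    return mu_lo, mu_hi
```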

Consider the Christoffel function associated with cdf F of a non-degenerate X:

$$\begin{aligned} \lambda _k(x)&:=\frac{1}{\sum _{j=0}^k\left[ P_j(x)\right] ^2}&k&\in \mathbb {N}^+, \end{aligned}$$

where \(P_0(x)\equiv 1\) and

$$\begin{aligned}&P_j(x) := \frac{D_j(x)}{\sqrt{H_{2j}H_{2j-2}}}&D_j(x)&:= \begin{vmatrix} 1&\mu _1&\cdots&\mu _j\\ \mu _1&\mu _2&\cdots&\mu _{j+1}\\ \vdots&\vdots&\ddots&\vdots \\ \mu _{j-1}&\mu _j&\cdots&\mu _{2j-1}\\ 1&x&\cdots&x^j \end{vmatrix}&H_0&:= 1 \\&\quad \quad H_{2j} := \begin{vmatrix} 1&\mu _1&\cdots&\mu _j\\ \mu _1&\mu _2&\cdots&\mu _{j+1}\\ \vdots&\vdots&\ddots&\vdots \\ \mu _j&\mu _{j+1}&\cdots&\mu _{2j} \end{vmatrix}&j&\in \mathbb {N}^+. \end{aligned}$$

Let

$$\begin{aligned} L_j^{(i)}&:= 1 - \sum _{l=i}^j \lambda _j\left( x_j^{(l)}\right)&U_j^{(i)}&:= \sum _{l=1}^i \lambda _j\left( x_j^{(l)}\right)&i&=1,2,\ldots ,j&j&\in \mathbb {N}^+, \end{aligned}$$

where \(x^{(1)}_j \le \cdots \le x^{(j)}_j\) are the j zeros of the j-th degree polynomial \(P_j\).

Using the definitions above, the following lemma from Gavriliadis and Athanassoulis (2009) bounds the distribution function at certain points.

Lemma 2

Suppose the moments \(\mu _1,\ldots ,\mu _{2n}\) of a non-degenerate X are known (\(n\in \mathbb {N}^+\)). The cumulative distribution function F satisfies at the j zeros \(x^{(1)}_j, \ldots , x^{(j)}_j\) of polynomial \(P_j\),

$$\begin{aligned} L_j^{(i)}&\le F\left( x_j^{(i)}\right) \le U_j^{(i)}&j&= 1,\ldots ,n. \end{aligned}$$

Let \(\tilde{x}^{(1)}< \cdots < \tilde{x}^{(N)}\) represent the strictly increasing sequence of the N distinct zeros of the set of polynomials \(\{P_j\}_{j=1}^n\):

$$\begin{aligned} \bigcup _{j=1}^N \left\{ {\tilde{x}}^{(j)}\right\} = \bigcup _{j=1}^n \bigcup _{i=1}^j \left\{ x_j^{(i)}\right\} . \end{aligned}$$

Define the increasing step functions

$$\begin{aligned} L(x)&= {\left\{ \begin{array}{ll} 0 &{} x < \tilde{x}^{(1)}\\ \max _{i,j} \{ L_j^{(i)} : x_j^{(i)} \le x \} &{} x \ge \tilde{x}^{(1)} \end{array}\right. }\\ U(x)&= {\left\{ \begin{array}{ll} \min _{i,j} \{ U_j^{(i)} : x_j^{(i)} \ge x\} &{} x \le \tilde{x}^{(N)}\\ 1 &{} x > \tilde{x}^{(N)}. \end{array}\right. } \end{aligned}$$

Notice that for L(x) (respectively U(x)), one can simply take for each j the largest (smallest) i that satisfies \(x^{(i)}_j\le x\) (\(x^{(i)}_j \ge x\)) since \(L^{(i)}_j\) (\(U^{(i)}_j\)) is nondecreasing with i. It follows from Lemma 2 that \(L(x)\le F(x) \le U(x)\) holds for all \(x\in {\mathbb {R}}\). Using these bounds on the cdf F(x), Lemma 3 produces lower bounds on the partial moments:
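
The following sketch (Python with NumPy, our own illustration; the helper names are hypothetical) constructs L and U directly from the definitions above. The determinant-based construction follows the formulas literally and is numerically fragile for long moment sequences; it is meant only to illustrate Lemma 2.

```python
import numpy as np

def hankel(mu, j):
    """(j+1) x (j+1) moment matrix with entries mu_{r+c}; mu[0] = 1."""
    return np.array([[mu[r + c] for c in range(j + 1)] for r in range(j + 1)])

def orthonormal_poly(mu, j):
    """Coefficients (ascending powers of x) of P_j = D_j / sqrt(H_{2j} H_{2j-2})."""
    if j == 0:
        return np.array([1.0])
    top = np.array([[mu[r + c] for c in range(j + 1)] for r in range(j)])
    coeffs = np.array([(-1.0) ** (j + k) * np.linalg.det(np.delete(top, k, axis=1))
                       for k in range(j + 1)])               # expand D_j along its last row
    h = np.linalg.det(hankel(mu, j)) * np.linalg.det(hankel(mu, j - 1))
    return coeffs / np.sqrt(h)

def cdf_bounds(moments, n):
    """Step functions L, U with L(x) <= F(x) <= U(x) (Lemma 2), from mu_1, ..., mu_{2n}.
    Returns the distinct zeros of P_1, ..., P_n and the two step functions."""
    mu = np.concatenate(([1.0], np.asarray(moments, float)))
    polys = [orthonormal_poly(mu, j) for j in range(n + 1)]
    pts = []                                                  # (zero, L_j^{(i)}, U_j^{(i)})
    for j in range(1, n + 1):
        zeros = np.sort(np.roots(polys[j][::-1]).real)
        lam = 1.0 / sum(np.polyval(p[::-1], zeros) ** 2 for p in polys[: j + 1])
        pts += list(zip(zeros, 1.0 - np.cumsum(lam[::-1])[::-1], np.cumsum(lam)))
    xs = np.unique(np.round([p[0] for p in pts], 12))
    L = lambda x: max([lv for z, lv, _ in pts if z <= x], default=0.0)
    U = lambda x: min([uv for z, _, uv in pts if z >= x], default=1.0)
    return xs, L, U
```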

Lemma 3

(Initialization II) Given the notation above,

$$\begin{aligned} \mu ^+_i&\ge \sum _{j=j_0^+}^{N} \left( \tilde{x}^{(j)} \right) ^i dU_j&\mu ^-_i&\ge \sum _{j=1}^{j_0^+-1} \left( -\tilde{x}^{(j)} \right) ^i dL_j&i&\in \mathbb {N}^+_0, \end{aligned}$$
(4)

where \(j_0^+ = \mathop {{{\mathrm{argmin}}}}\nolimits _{j\in \{1,\ldots ,N\}} \{ \tilde{x}^{(j)} > 0 \}\), \(dL_1 = L(\tilde{x}^{(1)})\), \(dU_{N}=1 - U(\tilde{x}^{(N)})\), and

$$\begin{aligned} dL_j&= L(\tilde{x}^{(j)}) - L(\tilde{x}^{(j-1)})&dU_{j-1}&= U(\tilde{x}^{(j)}) - U(\tilde{x}^{(j-1)})&j&= 2,\ldots ,N. \end{aligned}$$
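
Assuming the cdf_bounds sketch above is in scope, the lower bounds (4) can be assembled as follows (again a hedged illustration, not the authors' implementation).

```python
import numpy as np

def init_partial_moment_lower_bounds(moments, n, i_max):
    """Lower bounds (4) on mu_i^+ and mu_i^-, i = 0, ..., i_max, reusing cdf_bounds above."""
    xs, L, U = cdf_bounds(moments, n)
    dL = np.diff(np.concatenate(([0.0], [L(x) for x in xs])))   # dL_1, ..., dL_N
    dU = np.diff(np.concatenate(([U(x) for x in xs], [1.0])))   # dU_1, ..., dU_N
    j0 = int(np.argmax(xs > 0))                                 # index of the smallest positive zero
    lo_plus = [float(np.sum(xs[j0:] ** i * dU[j0:])) for i in range(i_max + 1)]
    lo_minus = [float(np.sum((-xs[:j0]) ** i * dL[:j0])) for i in range(i_max + 1)]
    return lo_plus, lo_minus
```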

Upper bounds on the partial moments \(\mu _i^+\) and \(\mu _i^-\) are not available along the lines of Lemma 3. For instance, the summation \(\sum _{j=j_0^+}^{N} \left( \tilde{x}^{(j)} \right) ^i dL_j\) is not an upper bound on \(\mu _i^+\) since \(dL_N\) is not a bound for \(\mathrm{d}F\) on \((\tilde{x}^{(N)}, \infty )\). The iterative algorithm we develop derives upper bounds on \(\mu _i^+\) and \(\mu _i^-\) under certain assumptions. In addition, the algorithm can sharpen the lower bounds obtained in Lemma 3.

The next lemma is useful to apply in the iterative algorithm as well as for proving some inequalities.

Lemma 4

(Moment ordering)

  1. (i)

    For nonnegative X and \(n\ge 2\), or for arbitrary X and even \(n \ge 2\),

    $$\begin{aligned} \mu _{n-1}^{2} \le \mu _{n}\mu _{n-2}. \end{aligned}$$
  2. (ii)

    For nonnegative X with \(\mu _{1}>0\) the moment ratio \(m_i=\mu _i/\mu _{i-1}\) increases with \(i\in \mathbb {N}^+\):

    $$\begin{aligned} 0<m_{1}\le m_{2}\le m_{3}\le \cdots . \end{aligned}$$

The inequalities are all strict if and only if X is nqd.

Lemma 5 produces bounds on the moment ratio of certain summations of nonnegative random variables.

Lemma 5

Consider \(Y = \sum _{j\in J} Y_j\) where each \(Y_j:\varOmega \rightarrow \mathbb {R}_0^+\) is a nonnegative random variable with strictly positive first moment. Let \(i \in \mathbb {N}^+\).

  1. (i)

    Suppose the following two conditions both hold

    1. (a)

      Y is a mixture

    2. (b)

      for each \(j\in J\), \(m_i(Y_j) = g_i h_j\) and \(m_{i+1}(Y_j) = g_{i+1} h_j\) where \(g_i,g_{i+1},h_j\in {\mathbb {R}}\), then

      $$\begin{aligned} \max _{j\in J} \frac{m_{i+1}(Y_j)}{m_{i}(Y_j)}&\le \frac{m_{i+1}(Y)}{m_{i}(Y)}. \end{aligned}$$
      (5)
  2. (ii)

    If Y is a mixture, i.e., for almost all outcomes \(\omega \in \varOmega \) at most one \(Y_j(\omega )\) is nonzero, then

    $$\begin{aligned} m_i(Y)&\le \max _{j\in J} m_i(Y_j). \end{aligned}$$
    (6)
  3. (iii)

    If each \(Y_j\) is independent,

    $$\begin{aligned} m_i(Y)&\le \sum _{j\in J} m_i(Y_j). \end{aligned}$$
    (7)
  4. (iv)

    Suppose the following two conditions both hold

    1. (a)

      each \(Y_j\) is independent

    2. (b)

      each \(Y_j\) is logconcave,

    then for each nonempty \(I\subseteq J\), \(\sum _{j\in I} Y_j\) is logconcave and

    $$\begin{aligned} m_i\left( \sum _{j\in I} Y_j \right)&\le m_i(Y). \end{aligned}$$
    (8)

Sufficient condition (b) in (i) reflects that the moment ratio \(m_i(Y_j)\) must be separable into two parts with a common functional form across i and j. Nonnegative distributions with moment ratios that satisfy this condition include the Gamma distribution and distributions with a single, scalar parameter such as the half-normal distribution and uniform distributions of the type U[0, b] with \(b>0\). The moment ratio of the log-normal distribution does not have a separable form.
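
As an illustration (ours), for Gamma distributions \(Y_j\sim \mathrm{Gamma}(k,\theta _j)\) with a common shape parameter \(k>0\) and scale parameters \(\theta _j>0\), the moment ratio separates as required by condition (b):

$$\begin{aligned} m_i(Y_j)&= \frac{\theta _j^{i}\,\Gamma (k+i)/\Gamma (k)}{\theta _j^{i-1}\,\Gamma (k+i-1)/\Gamma (k)} = (k+i-1)\,\theta _j = g_i h_j&g_i&= k+i-1,&h_j&= \theta _j. \end{aligned}$$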

It is natural to inquire about the necessity of the sufficient conditions in Lemma 5. We provide for each condition a counterexample where the sufficient condition is not satisfied and the corresponding inequality fails to hold.

(i) (a) Distributions that violate sufficient condition (a) may not satisfy (5) as one can verify using \(Y=Y_1+Y_2\) with independent \(Y_1,Y_2 \sim \mathrm{Exp}(1)\). (b) For instance, the mixture Y with \({{\mathbb P}\!\left( Y=Y_1 \right) }={{\mathbb P}\!\left( Y=Y_2 \right) }=\frac{1}{2}\) where \(Y_1 \equiv 1\) and \(Y_2 \sim \mathrm{Exp}(1)\) satisfies (a), but violates (b). Inequality (5) is invalid here since \(m_2(Y_2)/m_1(Y_2) = 2 > \frac{3}{2} = m_2(Y)/m_1(Y)\).

(ii) Equation (6) can be invalid if the mixture condition on Y is dropped. This follows for the case where \(Y_1 \equiv Y_2 \equiv 1\).

(iii) Dependence between different \(Y_j\) can make inequality (7) invalid. For instance, the case where \({{\mathbb P}\!\left( (Y_1, Y_2)=\left( 0,\frac{1}{2}\right) \right) } = {{\mathbb P}\!\left( (Y_1,Y_2)=\left( 1,1\right) \right) }=\frac{1}{2}\) leads to \(m_3(Y_1)+m_3(Y_2)=1.9<1.91\cdots =m_3(Y)\).

(iv) (a) Consider \(Y=Y_1+Y_2\) with the log-concave distributions \(Y_1\sim \mathrm{Exp}(1)\) and \(Y_2\sim \mathrm{Exp}\left( \frac{1}{2}\right) \).

To violate (a), impose a perfect negative correlation on the quantiles \(F_{Y_1}(Y_1)\) and \(F_{Y_2}(Y_2)\) by letting \(F_{Y_1}(Y_1) = 1-F_{Y_2}(Y_2)\). In other words, \((Y_1,Y_2)\sim \left\{ \left( -\ln (1-U), -2\ln (U) \right) : U\sim U[0,1] \right\} \) such that \({{\mathbb E}\!\left[ Y^2\right] } = \int _0^1 \ln ^2\left( u^2(1-u)\right) \mathrm{d}u=18-\frac{2}{3}\pi ^2\). It follows that \(m_2(Y)=6-\frac{2}{9}\pi ^2 < 4 = m_2(Y_2)\). (b) For the independent, non-logconcave case where \({{\mathbb P}\!\left( Y_1=0 \right) }=\frac{9}{10}\), \({{\mathbb P}\!\left( Y_1=10 \right) }=\frac{1}{10}\), and \(Y_2\equiv 1\): \(m_2(Y_1+Y_2) = \frac{13}{2} < 10 = m_2(Y_1)\).

The following lemma can be supplemented with bounds from Lemma 5. This is helpful in the examples in Sect. 4.

Lemma 6

(Initialization III) Let \(i\in \mathbb {N}^+\).

  1. (i)

    Suppose \(X=\sum _{j\in I} X_j\) is a mixture of random variables \(X_j:\varOmega \rightarrow \mathbb {R}\), i.e., for almost all outcomes \(\omega \in \varOmega \) at most one \(X_j(\omega )\) is nonzero, then

    $$\begin{aligned} m_i^+ := m_i(X^+)&\le \max _{j\in I} m_i(X_j^+). \end{aligned}$$
    (9)
  2. (ii)

    If \(X = Y - Z\) with \(Y = \sum _j Y_j\) and the following conditions hold

    (a) Y and Z are independent and nonnegative random variables

    (b) each \(Y_j\) is independent

    (c) each \(Y_j\) is log-concave,

    then

    $$\begin{aligned} m_i(X^+)&\le m_i(Y). \end{aligned}$$
    (10)
  3. (iii)

    If \(X=Y-Z\) is a mixture of nonnegative random variables Y and Z, i.e., \({{\mathbb P}\!\left( \{Y=0\} \cup \{Z=0\} \right) }=1\), then \(X^+\equiv Y\) and

    $$\begin{aligned} m_i(X^+)&= m_i(Y). \end{aligned}$$
    (11)

Similar relations apply to \(m_i^-:=m_i(X^-)\) and \(m_i(Z)\).

Condition (c) in Lemma 6(ii) on logconcave distributions is satisfied by many univariate distributions. Examples include the exponential distribution, the normal distribution, the uniform distribution, and the Gamma distribution with shape parameter at least one. The Gamma distribution with shape parameter less than one, the Student’s t-distribution, and the log-normal distribution are not log-concave.

We provide for each sufficient condition in Lemma 6 an example where the condition is not satisfied and the corresponding inequality is violated.

(i) Inequality (9) fails to hold for the nonmixture case where \(X\equiv 2\) with \(X_1 \equiv X_2 \equiv 1\). Here, X is not a mixture and \(m_i(X^+) = 2 > 1 = \max (m_i(X_1),m_i(X_2))\) where \(i\in \mathbb {N}^+\).

(ii) (a) Suppose \(Y\sim U[0,1]\) and \(Z=Y 1_{Y<\frac{1}{2}}\). The independence assumption (a) fails to hold, while assumptions (b) and (c) are satisfied. Inequality (10) is not satisfied: \(m_2(X^+)=\frac{7}{9} > \frac{2}{3} = m_2(Y)\).

(b) Consider again the counterexample of Lemma 5(iv) where \(Y=Y_1+Y_2\) with the log-concave distributions \(Y_1\sim \mathrm{Exp}(1)\) and \(Y_2\sim \mathrm{Exp}\left( \frac{1}{2}\right) \), and let \(Z\equiv 5\). This satisfies conditions (a) and (c). To violate (b), impose a perfect negative correlation on the quantiles \(F_{Y_1}(Y_1)\) and \(F_{Y_2}(Y_2)\) by letting \(F_{Y_1}(Y_1) = 1 - F_{Y_2}(Y_2)\). It can be verified that \(m_2(X^+) = 3.88\cdots > 3.81\cdots = 6-\frac{2}{9}\pi ^2 = m_2(Y)\).

(c) Suppose \({{\mathbb P}\!\left( Y=1 \right) } = \frac{9}{10}\), \({{\mathbb P}\!\left( Y=9 \right) } = \frac{1}{10}\), and \(Z\equiv 1\). While conditions (a) and (b) are satisfied, condition (c) on logconcavity of each \(Y_j\) fails to hold. Inequality (10) also fails to hold: \(m_2(X^+)= 8 > 5 = m_2(Y)\).

(iii) The nonmixture case with \(Y\equiv 2\) and \(Z\equiv 1\) corresponds for each \(i\in \mathbb {N}^+\) to \(X\equiv X^+ \equiv 1\) and \(m_i^+ = 1 \ne 2 = m_i(Y)\).

The next theorem presents the theoretical novelty of the paper. For a nqd X, the moment \({\mu }_i^s\) is bounded in terms of the bounds on \(m_i^s\), which are in turn derived, in the proof, from the bounds on the moments of \(X^{-s}\). The requirement that X be nqd is easily verified, since the moments of a quasi-degenerate distribution are uniquely characterized by finite \(m_1=m_2=\ldots \) and \(\mu _i = \left( \mu _n\right) ^{i/n}\) (\(i\in \mathbb {N}^+_0\)). To identify a quasi-degenerate X, it therefore suffices to check whether \(\mu _i = \left( \mu _n\right) ^{i/n}\) holds for some odd i and even n. The requirements on i and n exclude random variables X that are restricted to a set \(\{-x, 0, x\}\) for certain \(x\in {\mathbb {R}}\).

Theorem 1

(\({\mu }_i^s\) in terms of the bounds on \(m_i^s\)) Consider a nqd X with known moments \(\mu _{n-2}\), \(\mu _{n-1}\), \(\mu _{n}\), where \(n\ge 2\) is even. Define

$$\begin{aligned} a&:= \mu _{n}{\mu _{n-2}}-{\mu _{n-1}^2}&b&:=\sqrt{\frac{\mu _{n}}{\mu _{n-2}}} \end{aligned}$$
(12)

and the functions (\(s\in \{+,-\}\))

$$\begin{aligned} u^s(x)&:= \frac{a}{\mu _{n}/x - 2s\,\mu _{n-1} + x\,\mu _{n-2}}&v^s\left( c,d\right)&:= {\left\{ \begin{array}{ll} u^s(b) &{} d< b< c\\ \min \{u^s(c),u^s(d)\} &{} c \le b\le d\\ u^s(d) &{} b < \min ( c, d)\\ u^s(c) &{} b > \max ( c, d). \end{array}\right. } \end{aligned}$$
(13)

Assume for the bounds on the partial moment ratios

$$\begin{aligned}&({\text {Lemma}}~1)&\underline{m}_{n-1}^s&\ge s\,m_{n-1}&1/\bar{m}_{n}^s&\ge s/m_n \end{aligned}$$
(14)
$$\begin{aligned}&({\text {Lemma}}~4({ ii}))&\underline{m}_{n-1}^s&\le \underline{m}_n^s&\bar{m}_{n-1}^s&\le \bar{m}_n^s. \end{aligned}$$
(15)

The moments of \(X^s\) are bounded by

$$\begin{aligned} \mu _{n-2}^s&\le \frac{1}{\underline{m}_{n-1}^s}v^s(\underline{m}_{n-1}^s,\underline{m}_n^s)&\mu _{n-1}^s&\le v^s(\bar{m}_{n-1}^s,\underline{m}_n^s)&\mu _n^s&\le \bar{m}_n^s \, v^s(\bar{m}_{n-1}^s,\bar{m}_{n}^s). \end{aligned}$$
(16)
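
A direct transcription of (12)-(13) and (16) into code (a sketch, not the authors' implementation; s is passed as +1 or -1, and the ratio-bound arguments, names ours, are assumed strictly positive and finite):

```python
import math

def theorem1_bounds(mu_nm2, mu_nm1, mu_n, s, m_nm1_lo, m_nm1_hi, m_n_lo, m_n_hi):
    """Upper bounds (16) on mu_{n-2}^s, mu_{n-1}^s and mu_n^s from Theorem 1."""
    a = mu_n * mu_nm2 - mu_nm1 ** 2                                  # (12)
    b = math.sqrt(mu_n / mu_nm2)
    u = lambda x: a / (mu_n / x - 2.0 * s * mu_nm1 + x * mu_nm2)     # (13)
    def v(c, d):
        if d < b < c:
            return u(b)
        if c <= b <= d:
            return min(u(c), u(d))
        return u(d) if b < min(c, d) else u(c)
    return (v(m_nm1_lo, m_n_lo) / m_nm1_lo,      # upper bound on mu_{n-2}^s
            v(m_nm1_hi, m_n_lo),                 # upper bound on mu_{n-1}^s
            m_n_hi * v(m_nm1_hi, m_n_hi))        # upper bound on mu_n^s
```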

The next lemma bounds \(\mu _i^s\) using previously obtained bounds.

Lemma 7

(\({\mu }_i^s\) in terms of the bounds on \(m_j^s\), \({\mu }_j^s\), and \(\mu _j^{-s}\)) For \(i\in \mathbb {N}^+\) and \(s \in \{+,-\}\),

$$\begin{aligned} \mu _{i+1}^s&\ge \frac{\left( \underline{\mu }_{i}^s\right) ^2}{\bar{\mu }_{i-1}^s} \end{aligned}$$
(17)
$$\begin{aligned} \mu _i^s&\ge \max _{j<i} \left( \underline{\mu }_j^s\right) ^{i/j}&\mu _i^s&\ge \max \left( \frac{\left( \underline{\mu }_{i+1}^s\right) ^2}{\bar{\mu }_{i+2}^s}, \underline{m}_{i}^s\underline{\mu }_{i-1}^s , \frac{\underline{\mu }_{i+1}^s}{\bar{m}_{i+1}^s}\right) \end{aligned}$$
(18)
$$\begin{aligned} \mu _i^s&\le \min _{j>i} \left( \bar{\mu }_j^s\right) ^{i/j}&\mu _i^s&\le \min \left( \sqrt{\bar{\mu }_{i-1}^s\bar{\mu }_{i+1}^s},\bar{m}_{i}^s\bar{\mu }_{i-1}^s , \frac{\bar{\mu }_{i+1}^s}{\underline{m}_{i+1}^s} \right) \end{aligned}$$
(19)
$$\begin{aligned} \mu _{2i-1}^s&\ge \underline{\mu }_{2i-1}^{-s} +s\mu _{2i-1}&\mu _{2i-1}^s&\le \bar{\mu }_{2i-1}^{-s} + s \mu _{2i-1}\end{aligned}$$
(20)
$$\begin{aligned} \mu _{2i}^s&\ge \mu _{2i}-\bar{\mu }_{2i}^{-s}&\mu _{2i}^s&\le \mu _{2i}-\underline{\mu }_{2i}^{-s}. \end{aligned}$$
(21)

Lemma 8 bounds the moment ratios in terms of the bounds on the moments.

Lemma 8

(\(m_i^s\) in terms of the bounds on \({\mu }_i^s\))

$$\begin{aligned} \frac{\underline{\mu }^s_i}{\bar{\mu }^s_{i-1}}&\le m_i^s\le \frac{\bar{\mu }^s_{i}}{\underline{\mu }^s_{i-1}}&i&\in \mathbb {N}^+ \end{aligned}$$
(22)
$$\begin{aligned} \sqrt{\frac{\underline{\mu }^s_{i}}{\bar{\mu }^s_{i-2}}}&\le m_i^s\le \sqrt{\frac{\bar{\mu }^s_{i+1}}{\underline{\mu }^s_{i-1}}}&i&=2,3,\ldots . \end{aligned}$$
(23)

The bounds in (22) hold with equality if and only if the bounds on \(\mu _{i-1}^s\) and \(\mu _{i}^s\) hold with equality. The bounds in (23) hold with equality if and only if the bounds on \(\mu _j^s\) (\(j\in \{i-2,\ldots ,i+1\}\)) hold with equality and \(X^s\) is quasi-degenerate.

The algorithm below is the iterative algorithm that bounds the partial moments \(\mu _i^+\) and \(\mu _i^-\) and the corresponding moment ratios \(m_i^+=\mu _i^+/\mu _{i-1}^+\) and \(m_i^-=\mu _i^-/\mu _{i-1}^-\).

Algorithm 1

(Partial moments) The moments \(\mu _1,\ldots ,\mu _n\) of the random variable X are known. Before proceeding to the next step, each step is executed for \(i=1,\ldots ,n\) [and also for \(i=0\) in (iii)–(iv)]. The index j may depend on i.

  1. (i)

    Initialize bounds on \(m_i^+\) and \(m_i^-\) (Lemma 1, 3, and 6).

  2. (ii)

    Bound \(m_i^+\) and \(m_i^-\) in terms of the bounds on \(m_j^+\) and \(m_j^-\) (Lemma 4(ii)).

  3. (iii)

    Bound \(\mu _i^+\) and \(\mu _i^-\) in terms of the bounds on \(m_j^+\) and \(m_j^-\) (Theorem 1).

  4. (iv)

    Bound \(\mu _i^+\) and \(\mu _i^-\) in terms of the bounds on \(\mu _j^+\), \(\mu _j^-\), \(m_j^+\), and \(m_j^-\) (Lemma 7).

  5. (v)

    Bound \(m_i^+\) and \(m_i^-\) in terms of the bounds on \(\mu _j^+\) and \(\mu _j^-\) (Lemma 8).

  6. (vi)

    Stop if the change in the bounds since step (ii) is smaller than some predetermined tolerance, otherwise go to step (ii).

Within each step, apply all applicable inequalities. The different assumptions, particularly those in Lemmas 5 and 6, must be checked in advance. Some may be valid for a given random variable X, while others may not.
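
The following sketch (ours, not the authors' implementation) illustrates the structure of the iteration with a subset of the update rules, namely Lemma 4(ii), Lemma 7 (18)-(21) and Lemma 8 (22); Theorem 1 and the remaining inequalities would be applied inside the same loop. Initial bounds can come, for instance, from the Lemma 1 and Lemma 3 sketches above.

```python
import numpy as np

def algorithm1_sketch(mu, mu_lo, mu_hi, m_lo, m_hi, tol=1e-10, max_iter=500):
    """Simplified sketch of Algorithm 1.  mu[i] is the known i-th moment of X
    (i = 0, ..., n, with mu[0] = P(X != 0)).  mu_lo/mu_hi and m_lo/m_hi are dicts
    keyed by '+'/'-' holding finite numpy arrays of length n + 1 with the initial
    bounds on the partial moments and moment ratios."""
    n = len(mu) - 1
    for _ in range(max_iter):
        old_width = {s: mu_hi[s] - mu_lo[s] for s in '+-'}
        for s, t in (('+', '-'), ('-', '+')):
            sign = 1.0 if s == '+' else -1.0
            # Step (ii), Lemma 4(ii): partial moment ratios are nondecreasing in i.
            m_lo[s][1:] = np.maximum.accumulate(m_lo[s][1:])
            m_hi[s][1:] = np.minimum.accumulate(m_hi[s][:0:-1])[::-1]
            for i in range(1, n + 1):
                # Step (iv), Lemma 7 (18)-(19): ratio and Cauchy-Schwarz bounds.
                mu_lo[s][i] = max(mu_lo[s][i], m_lo[s][i] * mu_lo[s][i - 1])
                mu_hi[s][i] = min(mu_hi[s][i], m_hi[s][i] * mu_hi[s][i - 1])
                if i < n:
                    mu_hi[s][i] = min(mu_hi[s][i], np.sqrt(mu_hi[s][i - 1] * mu_hi[s][i + 1]))
                # Step (iv), Lemma 7 (20)-(21): combine with the known full moment mu_i.
                if i % 2:
                    mu_lo[s][i] = max(mu_lo[s][i], mu_lo[t][i] + sign * mu[i])
                    mu_hi[s][i] = min(mu_hi[s][i], mu_hi[t][i] + sign * mu[i])
                else:
                    mu_lo[s][i] = max(mu_lo[s][i], mu[i] - mu_hi[t][i])
                    mu_hi[s][i] = min(mu_hi[s][i], mu[i] - mu_lo[t][i])
                # Step (v), Lemma 8 (22): feed the moment bounds back into the ratios.
                if mu_hi[s][i - 1] > 0:
                    m_lo[s][i] = max(m_lo[s][i], mu_lo[s][i] / mu_hi[s][i - 1])
                if mu_lo[s][i - 1] > 0:
                    m_hi[s][i] = min(m_hi[s][i], mu_hi[s][i] / mu_lo[s][i - 1])
        # Step (vi): stop once the widths no longer shrink appreciably.
        change = max(np.max(old_width[s] - (mu_hi[s] - mu_lo[s])) for s in '+-')
        if change < tol:
            break
    return mu_lo, mu_hi, m_lo, m_hi
```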

The following algorithm applies Lemma 6(iii) and is helpful for bounding unobserved higher order moments.

Algorithm 2

(Extrapolation) The moments \(\mu _1,\ldots ,\mu _n\) of \(X=Y-Z\) are known with \({{\mathbb P}\!\left( \{Y=0\} \cup \{Z=0\} \right) }=1\). Execute Algorithm 1 to obtain bounds on \(\mu _0^s, \ldots , \mu _n^s\) (\(s\in \{+,-\}\)). For a certain \(\nu \in \mathbb {N}^+\), construct bounds on \(\mu _i^s\) using bounds on \(m_i^s\) where \(i= n+1,\ldots ,n+\nu \):

  1. (i)

    Lower bounds. If (a) \(X^s\) is a mixture, and (b) \(m_i(X_j^s) = g_i h_j\) where \(i > n\) and \(g_i, h_j\in {\mathbb {R}}\): apply Lemma 5(i)

    else apply Lemma 4(ii).

  2. (ii)

    Upper bounds. If \(X^s\) is a mixture: apply Lemma 5(ii)

    else if \( X^s = \sum _j X_j^s\) with each \(X_j^s\) independent: apply Lemma 5(iii).

Conditions on \(X^s\) are needed for an upper bound on the partial moments \(\mu ^s_i\) (\(i>n\)) in Algorithm 2(ii). To see this, note that the tail of an otherwise unrestricted \(X^s\) can admit \({{\mathbb P}\!\left( X^s = x \right) }=1/x^{i-\varepsilon }\) for arbitrarily large x and \(\varepsilon \in (0,i)\). By Markov’s inequality, the probability mass at x of such an \(X^s\) has unbounded impact on the moments of order i and higher as \(x\rightarrow \infty \):

$$\begin{aligned} {{\mathbb E}\!\left[ \left( X^s\right) ^{i}\right] } \ge x^{i} \, {{\mathbb P}\!\left( \left( X^s\right) ^{i} \ge x^{i} \right) } \ge \frac{x^{i}}{x^{i-\varepsilon }} = x^{\varepsilon } \rightarrow \infty \end{aligned}$$

The bounds on the moments of \(X^+\) and \(X^-\) from Algorithm 2 provide bounds on the unobserved higher order moments of X through \(\mu _i=\mu ^+_i + (-1)^i \mu ^-_i\) with \(i=n+1,n+2,\ldots \). This can be used in a Taylor expansion of the transformation f(X). For an expansion of f(X) around a nonzero a, one can consider using \(\max (X_a,0)\) and \(\max (-X_a,0)\) with \(X_a:=X-a\).

4 Examples

The example in Sect. 4.1 reports bounds on partial moments for a case using Algorithm 1. Section 4.2 contains an example that uses Algorithm 1 and bounds a Taylor series expansion by extrapolating the results to higher order moments.

4.1 Example 1: a sum of random variables

Consider \(X=\frac{1}{2}T-U+Z\) with T an exponential distribution with intensity one, U a uniform distribution on [0, 1], and Z a standard normal distribution. The variables T, U, and Z are independent. Since the odd moments of Z are zero, the moments \(\mu _i\) of X can be obtained from the expansion

$$\begin{aligned} {{\mathbb E}\!\left[ X^i\right] } = \sum _{j=0}^{{\lfloor i/2 \rfloor }} \sum _{k=0}^{i-2j} \left( {\begin{array}{c}i\\ i-2j-k,\,k,\,2j\end{array}}\right) {{\mathbb E}\!\left[ \left( \frac{1}{2}T\right) ^{i-2j-k}\right] } {{\mathbb E}\!\left[ (-U)^k\right] } {{\mathbb E}\!\left[ Z^{2j}\right] } \end{aligned}$$
(24)

where \(i \in \mathbb {N}^+_0\), \({\lfloor i/2 \rfloor }\) denotes the largest natural number not greater than \(i/2\) \(({\lfloor i/2 \rfloor } = \max _{j\in {\mathbb {N}}_0^+} \{ j : j\le {i}/{2} \})\), and

$$\begin{aligned} {{\mathbb E}\!\left[ T^i\right] }&= i!&{{\mathbb E}\!\left[ U^k\right] }&= \frac{1}{1+k}&{{\mathbb E}\!\left[ Z^{2j}\right] }&= \frac{(2j)!}{2^j(j!)}. \end{aligned}$$
(25)

We obtain the i-th moment of X from (24) for \(i=0, 1, \ldots , 16\).
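
A short script reproducing these moments from (24)-(25) (our illustration; here mu[0] is the ordinary zeroth moment, which equals \({\mathbb P}(X\ne 0)=1\) for this continuous X):

```python
import math

def moment_X(i):
    """i-th moment of X = T/2 - U + Z via the multinomial expansion (24)-(25)."""
    total = 0.0
    for j in range(i // 2 + 1):
        for k in range(i - 2 * j + 1):
            m = i - 2 * j - k                           # exponent of T/2
            coef = math.factorial(i) / (math.factorial(m) * math.factorial(k) * math.factorial(2 * j))
            e_t = 0.5 ** m * math.factorial(m)          # E[(T/2)^m],  T ~ Exp(1)
            e_u = (-1.0) ** k / (k + 1)                 # E[(-U)^k],   U ~ U[0,1]
            e_z = math.factorial(2 * j) / (2 ** j * math.factorial(j))   # E[Z^{2j}]
            total += coef * e_t * e_u * e_z
    return total

mu = [moment_X(i) for i in range(17)]   # mu_0, ..., mu_16
```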

We are interested in bounding the moments of \(X^+:=\max (X,0)\) and \(X^-:=\max (-X,0)\). Write \(X=\left( \frac{1}{2}T+Z^+\right) -\left( U+Z^-\right) \), where the components T, U, \(Z^+\), and \(Z^-\) are logconcave distributions. Notice that (i) \(X^+\not \equiv \frac{1}{2}T + Z^+\) and \(X^-\not \equiv U + Z^-\) and (ii) the convolutions \(\frac{1}{2}T + Z^+\) and \(U + Z^-\) are logconcave (Lemma 5(iv)).

Consider the independent distributions (i) \(T_0, T_1 \,\buildrel d \over =T\), (ii) \(U_0, U_1 \,\buildrel d \over =U\), and (iii) \(Z_0, Z_1 \,\buildrel d \over =Z\). Since \({{\mathbb P}\!\left( Z\ge 0 \right) }=\frac{1}{2}\), X is equal in distribution to a mixture of two components:

$$\begin{aligned} X \,\buildrel d \over ={\left\{ \begin{array}{ll} \left( \frac{1}{2}T_0 + |Z_0 |\right) - U_0 &{} \text {with probability }\frac{1}{2}\\ \frac{1}{2}T_1 - \left( U_1+|Z_1 |\right) &{} \text {with probability }\frac{1}{2}\\ \end{array}\right. } \end{aligned}$$
(26)

Both components in (26) consist of three independent log-concave distributions. Upper bounds on \(m_i^+:=m_i(X^+)\) and \(m_i^-:=m_i(X^-)\) are initialized by applying Lemma 6(i)–(ii) and then Lemma 5(iv),

$$\begin{aligned} m_i^+&\le \max \left( m_i\left( \frac{1}{2}T+|Z |\right) ,m_i\left( |Z |\right) \right) = m_i\left( \frac{1}{2}T + |Z |\right) \end{aligned}$$
(27)
$$\begin{aligned} m_i^-&\le \max \left( m_i\left( U\right) , m_i\left( U+|Z |\right) \right) = m_i\left( U + |Z |\right) . \end{aligned}$$
(28)

The initial moment ratios in (27)–(28) follow from

$$\begin{aligned} {{\mathbb E}\!\left[ \left( \frac{1}{2}T + |Z | \right) ^i\right] }&= \sum _{j=0}^i \left( {\begin{array}{c}i\\ j\end{array}}\right) {{\mathbb E}\!\left[ \left( \frac{1}{2}T\right) ^{i-j}\right] } {{\mathbb E}\!\left[ |Z |^{j}\right] } \nonumber \\ {{\mathbb E}\!\left[ \left( U + |Z | \right) ^i\right] }&= \sum _{j=0}^i \left( {\begin{array}{c}i\\ j\end{array}}\right) {{\mathbb E}\!\left[ U^{i-j}\right] } {{\mathbb E}\!\left[ |Z |^j\right] }, \end{aligned}$$
(29)

where the moments of T and U are in (25), and

$$\begin{aligned} {{{\mathbb E}\!\left[ |Z|^j\right] }} = \sqrt{\frac{2^j}{\pi }} \Gamma \left( \frac{j+1}{2} \right) . \end{aligned}$$
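
A small sketch (ours) evaluating the initial upper bounds (27)-(28) from (29) and the moments of \(|Z |\):

```python
import math

def abs_z_moment(j):
    """E[|Z|^j] for a standard normal Z."""
    return math.sqrt(2 ** j / math.pi) * math.gamma((j + 1) / 2)

def moment_halfT_plus_absZ(i):
    return sum(math.comb(i, j) * 0.5 ** (i - j) * math.factorial(i - j) * abs_z_moment(j)
               for j in range(i + 1))

def moment_U_plus_absZ(i):
    return sum(math.comb(i, j) * abs_z_moment(j) / (i - j + 1) for j in range(i + 1))

# Initial upper bounds on the partial moment ratios, (27)-(28), for i = 1, ..., 16:
m_plus_hi = [moment_halfT_plus_absZ(i) / moment_halfT_plus_absZ(i - 1) for i in range(1, 17)]
m_minus_hi = [moment_U_plus_absZ(i) / moment_U_plus_absZ(i - 1) for i in range(1, 17)]
```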

As a measure of convergence after iteration k, consider the change in the width of the error bounds:

$$\begin{aligned} \varepsilon _k&:= \max _{\begin{array}{c} i=0,\ldots ,16\\ s\in \{+,-\} \end{array}} \left\{ 1-\frac{\bar{\mu }^s_{i,k}-\underline{\mu }^s_{i,k}}{\bar{\mu }^s_{i,k-1}-\underline{\mu }^s_{i,k-1}} \right\} \quad k\in \mathbb {N}^+, \end{aligned}$$
(30)

where the first subscript i of each moment bound refers to the considered order of the partial moments, and the second subscript k refers to the iteration number with iteration 0 immediately after the initialization. A small \(\varepsilon _k\) indicates a small improvement in the error bounds.

Table 1 Moments of X from (24) and final bounds on the moments \(\mu _i^s\) of \(X^s\) from Algorithm 1 (\(s\in \{+,-\}\))
Table 2 Bounds on the 10-th moment of \(X^s\) for different iterations of Algorithm 1 (\(s\in \{+,-\}\))
Table 3 Bounds on the 16-th moment of \(X^s\) for different iterations of Algorithm 1 (\(s\in \{+,-\}\))

Using \(\varepsilon _k < 10^{-10}\) as a stopping condition, Algorithm 1 stops after 83 iterations and 151 ms of CPU time. Table 1 reports the bounds on the moments \(\mu _i^+\) of \(X^+\) and \(\mu _i^-\) of \(X^-\). By (20)–(21), the difference \(\bar{\mu }_i^s-\underline{\mu }_i^s\) is independent of the sign \(s\in \{+,-\}\) (3rd column). This difference relative to the moment \(\mu _i\) tends to decrease with the order i of the moment (4th column). This can be helpful for bounding Taylor expansions because the moments of the highest observed order are most important for bounding unobserved higher order moments of X (here, the moments greater than order sixteen).

The difference between the bounds relative to \(\underline{\mu }_i^+\) is also decreasing with the order (7th column). In contrast, the difference between the bounds relative to \(\underline{\mu }_i^-\) increases for the higher order moments to 43.7% (rightmost column). This higher percentage may reflect that the sequence of 16 moments cannot uniquely pin down the characteristics of \(X^-\) in particular. The reason is that \(X^+\) tends to be larger than \(X^-\), as indicated by the moment bounds. As such, the higher order moments of X are mainly determined by the higher order moments of \(X^+\).

Table 2 shows the convergence across iterations for the partial moments of order 10. Here, 25 iterations suffice for each bound on the 10-th moment to differ by less than \(0.03\%\) from the bound after the final iteration. The final bounds on \(\mu _{10}^+\) and \(\mu _{10}^-\) differ from each other by 2.2 and \(22.0\%\), respectively. Table 3 indicates that for this example, executing 25 iterations gives bounds that differ by at most \(0.14\%\) from the final bounds on the 16-th partial moments of X. The relative difference between the final bounds on \(\mu _{16}^+\) is, by coincidence, also \(0.14\%\).

Figure 1 depicts the series \(\varepsilon _k\) as a function of the iteration k. The linear trend in \(\varepsilon _k\) suggests that for some \(\tilde{a}, \tilde{b} \in {\mathbb {R}}\),

$$\begin{aligned} \log (\varepsilon _k) \approx \tilde{a}+\tilde{b} k. \end{aligned}$$
(31)

A small number of numerical experiments suggest that the exponential convergence of \(\varepsilon _k\) in (31) might hold in general. A proof of this conjecture is beyond the scope of this paper.

Fig. 1 Relative decrease \(\varepsilon _k\) in the width of the error bound \(\bar{\mu }_i-\underline{\mu }_i\) between subsequent iterations (see (30)). The maximum is taken over the moments \(i=0, 1, \ldots , 16\)

The product \(\prod _{k=1}^\infty \left( 1-\varepsilon _k \right) \) is a measure of the cumulative decrease in the width of the error bound over all iterations. Provided (31) is the correct model and \(\tilde{b} < 0\), the lower bound and upper bound do not converge to each other:

$$\begin{aligned} \log \prod _{k=1}^\infty \left( 1-\varepsilon _k \right)&= \sum _{k=1}^\infty \log \left( 1-\varepsilon _k \right) \approx - \sum _{k=1}^\infty \varepsilon _k = - \sum _{k=1}^\infty e^{\tilde{a}+\tilde{b}k}\nonumber \\&= - a \sum _{k=1}^\infty b^{k} =-\frac{ab}{1-b} \ne 0, \end{aligned}$$
(32)

where \(a=e^{\tilde{a}}\) and \(b=e^{\tilde{b}}\). An estimation of the linear regression (31) using \(\varepsilon _{20},\ldots ,\varepsilon _{83}\) of this example gives \(\tilde{a}=1.45\) and \(\tilde{b}=-0.299\). This predicts a small cumulative relative decrease in the width of the error bounds after iteration 83:

$$\begin{aligned} \log \prod _{k=84}^\infty (1-\varepsilon _k) = -\frac{ab^{84}}{1-b}\approx -2.05\times 10^{-10} \end{aligned}$$
(33)

Because \(\varepsilon _k\) is a maximum over different orders of moments, the value in (33) can be interpreted as a bound on the cumulative relative decrease in the width of the bounds on the partial moments of orders \(0,1,\ldots ,16\).

In some cases, initial bounds on the moment ratios can be difficult to obtain. Instead of the initial bounds (27)–(28), suppose we were to initialize the upper bounds on the moment ratios by

$$\begin{aligned} m_i^+&\le m_i\left( \frac{1}{2}T\right) + m_i(|Z |)&m_i^-&\le m_i(U) + m_i(|Z |)&i&= 1,\ldots ,16. \end{aligned}$$
(34)

By Lemma 5(iii), this gives less strict initial bounds on the moment ratios than (27)–(28) (Table 4).

The effect of the initial bounds on the final bounds is substantial, as can be seen by comparing Tables 1 and 5. Each difference \(\bar{\mu }_i^s-\underline{\mu }_i^s\) is larger in Table 5 than in Table 1. This underlines the importance of providing initial bounds in Algorithm 1 that are as strict as possible. In particular, the bounds on the moments of \(X^-\) are sensitive to the initial bounds. More specifically, the accuracy of the bounds on the highest order moment of \(X^-\) decreases by three orders of magnitude when the less strict initial bounds in (34) are imposed (rightmost column in Tables 1 and 5).

Table 4 Initialization of the upper bounds \(\bar{m}_i^+\) and \(\bar{m}_i^-\) on the moment ratios using the bounds (27)–(28) and the weaker bounds in (34)
Table 5 Moments of X from (24) and final bounds on the moments \(\mu _i^s\) of \(X^s\) from Algorithm 1 with the weaker initial bounds (34) (\(s\in \{+,-\}\))

4.2 Example 2: the exponential function on a quadratic form

Suppose one is interested in \({{\mathbb E}\!\left[ \exp (X)\right] }\) with the quadratic form \(X=Z' \tilde{A} Z = \frac{1}{2} Z' \! \left[ \tilde{A} + \tilde{A}' \right] \! Z\) where \(Z\sim N(\mathbf{0},I)\) has dimension d and \(\tilde{A}\) is a \(d\times d\) matrix with eigenvalues less than \(\frac{1}{2}\). Diagonalize the symmetric matrix \(A:=\frac{1}{2}\left[ \tilde{A} + \tilde{A}'\right] \) as \(V\varLambda V'\) with \(\varLambda \) a diagonal matrix with diagonal entries the eigenvalues \(\lambda _1 \le \cdots \le \lambda _d < \frac{1}{2}\). Since \(V'Z \sim Z\), the random variable X is a weighted summation of d independent Chi-squared distributions with one degree of freedom,

$$\begin{aligned} X=Z' A Z \sim Z' \varLambda Z = \sum _{i=1}^d \lambda _i Z_i^2. \end{aligned}$$

We can infer the exact outcome of \({{\mathbb E}\!\left[ \exp (X)\right] }\) from the moment generating function of a Chi-squared distribution:

$$\begin{aligned} {{\mathbb E}\!\left[ \exp (X)\right] }&= \prod _{i=1}^d{{\mathbb E}\!\left[ \exp (\lambda _i Z_i^2) \right] } = \frac{1}{\prod _{i=1}^d \sqrt{1 - 2\lambda _i} }&\mathrm{if }\,\max _i(\lambda _i) < \frac{1}{2} \end{aligned}$$
(35)

The expectation of \(\exp (X)\) is infinite if \(\max _i\lambda _i \ge \frac{1}{2}\). The outcome in (35) enables a direct comparison with the bounds that we obtain. It should be stressed that an exact representation of \({{\mathbb E}\!\left[ f(X)\right] }\) is in general unavailable.

Consider the Taylor series expansion

$$\begin{aligned} {{\mathbb E}\!\left[ \exp (X)\right] } = \sum _{i=0}^\infty \frac{1}{i!}{{\mathbb E}\!\left[ X^i\right] }. \end{aligned}$$
(36)

The moments of X are (Magnus 1986, Lemma 3)

$$\begin{aligned} {{\mathbb E}\!\left[ X^i\right] }&= \sum _\nu \gamma _i(\nu )\prod _{j=1}^i \left( {\sum _k}\lambda _k^j \right) ^{n_j}&\gamma _i(\nu )&= \frac{i! 2^i}{\prod _{j=1}^i[n_j!(2j)^{n_j}]}. \end{aligned}$$
(37)

where the summation is over all \(\nu = (n_1, \ldots , n_i)\) with each \(n_j\in \mathbb {N}_0\) and \(\sum _{j=1}^i n_j j = i\). This procedure is computationally expensive for moments of a high order i. A procedure based on Algorithm 2 enables us to bound high moments of X, and thus \({{\mathbb E}\!\left[ \exp (X)\right] }\) in (36). The bounds follow from a few additional steps we outline below.
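
As a hedged alternative to enumerating the partitions in (37), the moments can also be generated from the cumulants \(\kappa _j = 2^{j-1}(j-1)!\sum _k \lambda _k^j\) of X via the standard moment–cumulant recursion; the sketch below (ours, not the paper's procedure) also evaluates the exact benchmark (35).

```python
import math
import numpy as np

def quad_form_moments(lams, n):
    """mu_1, ..., mu_n of X = sum_k lam_k Z_k^2 via the cumulants
    kappa_j = 2^{j-1} (j-1)! sum_k lam_k^j and the moment-cumulant recursion."""
    lams = np.asarray(lams, float)
    kappa = [2.0 ** (j - 1) * math.factorial(j - 1) * np.sum(lams ** j) for j in range(1, n + 1)]
    mu = [1.0]                                   # mu_0 = 1
    for i in range(1, n + 1):
        mu.append(sum(math.comb(i - 1, j - 1) * kappa[j - 1] * mu[i - j] for j in range(1, i + 1)))
    return mu[1:]

def exact_E_expX(lams):
    """Benchmark (35); finite only if every eigenvalue is below 1/2."""
    return 1.0 / np.prod(np.sqrt(1.0 - 2.0 * np.asarray(lams, float)))
```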

Using a polar coordinate system, it can be verified that \(X \sim Z'\varLambda Z \sim (\sqrt{R} U_d)' \varLambda (\sqrt{R} U_d)\) where the direction vector \(U_d\) follows a uniform distribution on the unit d-sphere \(S^d\) and R, which is the squared distance to the origin, follows a \(\chi ^2(d)\)-distribution. The latter distribution is a Gamma distribution with shape parameter \(d/2\) and scale parameter 2. This means that X is an infinite mixture of Gamma distributions \(X_{{{\mathbf {u}}}}\) with each component characterized by some \({{\mathbf {u}}}\in S^d\). More specifically, each component \(X_{{\mathbf {u}}}\) in the mixture is a Gamma distribution scaled by \(\lambda ({{\mathbf {u}}}) := {{\mathbf {u}}}'\varLambda {{\mathbf {u}}}\). The scaling parameter \(\lambda (\mathbf{u})\) varies between \(\lambda _{\min }= \min _i \lambda _i=\lambda _1\) and \(\lambda _{\max } = \max _i \lambda _i=\lambda _d\). For each component \(X_{{{\mathbf {u}}}}\), a positive (negative) \(\lambda ({{\mathbf {u}}})\) indicates that the corresponding Gamma distribution is added (subtracted). We loosely write \(X_{{\mathbf {u}}}\sim \mathrm{Gamma}(d/2,2\lambda ({{\mathbf {u}}}))\) for all \(\lambda ({{\mathbf {u}}})\), including negative \(\lambda ({{\mathbf {u}}})\).

Define the scaled remainder \({\xi }_n(Y)\) of a random variable Y by the functional

$$\begin{aligned} {\xi }_n(Y)&:= \frac{1}{{{\mathbb E}\!\left[ Y^n\right] }}\sum _{i=1}^\infty \frac{{{\mathbb E}\!\left[ Y^{n+i}\right] }}{(n+i)!}&n&\in \mathbb {N}^+_0. \end{aligned}$$
(38)

For a random variable Y that degenerates at x, i.e., \(Y\equiv x\), the functional in (38) can be written as a function of x,

$$\begin{aligned} {\xi }_n(x)&= \sum _{i=1}^\infty \frac{x^i}{(n+i)!} =\frac{_1F_1(1, n+1; x) - 1}{n!}&n&\in \mathbb {N}^+_0, \end{aligned}$$

where \(_1F_1(a, b; x)\) is the confluent hypergeometric function of the first kind with parameters a and b. By (36) and (38),

$$\begin{aligned} {{\mathbb E}\!\left[ \exp (X)\right] }&= 1 + \sum _{i=1}^n \frac{1}{i!}\mu _i + \mu _n^+\xi _n(X^+) + \mu _n^-\xi _n(-X^-)&n&\in \mathbb {N}^+. \end{aligned}$$
(39)

The moments \(\mu _1,\ldots ,\mu _n\) are obtained from (37), while Algorithm 1 produces bounds on \(\mu _n^+\) and \(\mu _n^-\). The following lemma derives bounds on \(\xi _n(X^+)\) and \(\xi _n(-X^-)\) in (39).

Lemma 9

Let \(n\in \mathbb {N}^+\).

  1. (i)

    The functionals \(\xi _n(X^+)\) and \(\xi _n(-X^-)\) are bounded by

    $$\begin{aligned} 0 \le \xi _n(\underline{m}^+_{n+1})&\le \xi _n(X^+)\\ \xi _n(-\bar{m}^-_{n+1})&\le \xi _n(-X^-) \le \xi _n(-\underline{m}_n^-) \le 0. \end{aligned}$$
  2. (ii)

    If the random variable X is a mixture of Gamma distributions \(T_\theta \sim \mathrm{Gamma}(k, \theta )\) with fixed shape parameter \(k>0\), scale parameter \(\theta \in \left[ \theta _{\min }, \theta _{\max } \right] \subseteq \left( -1,1\right) \), and \(T_{\theta } = -T_{-\theta }\) for \(\theta <0\), then

    $$\begin{aligned}&\xi _n \left( \frac{ (k+n)\underline{m}^+_{n} }{ k+n-1 } \right) \le \xi _n(X^+) \le \frac{1}{n!} \left[ {}_{2} F_{1}\left( 1, k+n ; \, n+1; \, [\theta _{\max }]^+ \right) -1 \right] \\&\frac{1}{n!} \left[ {}_{2} F_{1}\left( 1, k+n ; \, n+1; \, -[\theta _{\min }]^- \right) -1 \right] \le \xi _n(-X^-), \end{aligned}$$

    where \(_2F_1\) is the Gaussian hypergeometric function.
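
A sketch (ours; argument names hypothetical, using SciPy's hypergeometric functions) of how (39) is bounded once Algorithm 1 and, for the extrapolated ratios, Algorithm 2 have produced bounds on \(\mu _n^{\pm }\) and the relevant moment ratios:

```python
from math import factorial
from scipy.special import hyp1f1, hyp2f1

def xi_at(n, x):
    """xi_n evaluated at a degenerate point x, i.e. sum_{i>=1} x^i / (n+i)!."""
    return (hyp1f1(1, n + 1, x) - 1.0) / factorial(n)

def exp_moment_bounds(mu, mun_p_lo, mun_p_hi, mun_m_lo, mun_m_hi,
                      m_p_np1_lo, m_m_np1_hi, m_m_n_lo, k, theta_max):
    """Bounds on E[exp(X)] via (39).  mu = [mu_1, ..., mu_n] are the known moments of X;
    the mun_*/m_* arguments are bounds on mu_n^{+/-} and on the moment ratios.
    The upper piece of xi_n(X^+) uses Lemma 9(ii), so X is assumed to be a Gamma
    mixture with shape k and scale at most theta_max < 1."""
    n = len(mu)
    head = 1.0 + sum(mu[i - 1] / factorial(i) for i in range(1, n + 1))
    xi_p_lo = xi_at(n, m_p_np1_lo)                                                 # Lemma 9(i)
    xi_p_hi = (hyp2f1(1, k + n, n + 1, max(theta_max, 0.0)) - 1.0) / factorial(n)  # Lemma 9(ii)
    xi_m_lo = xi_at(n, -m_m_np1_hi)                                                # Lemma 9(i)
    xi_m_hi = xi_at(n, -m_m_n_lo)                                                  # Lemma 9(i)
    lower = head + mun_p_lo * xi_p_lo + mun_m_hi * xi_m_lo
    upper = head + mun_p_hi * xi_p_hi + mun_m_lo * xi_m_hi
    return lower, upper
```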

Table 6 \(d = 2\), \(n = 14\), and stopping criterion \(\varepsilon _k < 10^{-6}\)
Table 7 \(d = 20\), \(n = 14\), stopping criterion \(\varepsilon _k < 10^{-6}\), and \(\lambda \) equally spaced on \(\left[ \lambda _{\min },\lambda _{\max }\right] \)
Table 8 \(d = 20\), \(n = 14\), stopping criterion \(\varepsilon _k < 10^{-6}\), and multiplicity = 10 for both \(\lambda _{\min }\) and \(\lambda _{\max }\)

The bounds in Lemma 9(i) can deal with any distribution X and, in particular, with any eigenvalues of A. In contrast, the two bounds in Lemma 9(ii) that are based on the Gaussian hypergeometric function \(_2F_1\) require that each absolute eigenvalue of A is less than \(\frac{1}{2}\), because \(X_\mathbf{u}\) has scale parameter \(\theta (\mathbf{u}) = 2\mathbf{u}^T A \mathbf{u} = 2\lambda (\mathbf{u})\) with \(\mathbf{u} \in S^d\).

We present example cases where the minimal eigenvalue \(\lambda _{\min }\) is \(-\,0.1\), \(-\,0.2\), \(-\,0.3\), or \(-\,0.4\), while the maximal eigenvalue \(\lambda _{\max }\) is 0.1, 0.2, 0.3, or 0.4. The other eigenvalues are either equally spaced between \(\lambda _{\min }\) and \(\lambda _{\max }\), or equally split at the two extremes.

Define

$$\begin{aligned} Y :=\exp (X)\quad r :=\frac{ \bar{\mu }_Y - \underline{\mu }_Y}{\mu _Y}\quad w := \frac{{\mu _Y} - { \underline{\mu }_Y}}{{ \bar{\mu }_Y} - { \underline{\mu }_Y}}, \end{aligned}$$
(40)

where \(\underline{\mu }_Y\) and \(\bar{\mu }_Y\) are a lower and an upper bound on the mean \(\mu _Y\) of Y, respectively. The ratio r represents the size of the maximal error relative to the mean \(\mu _Y\). A small r indicates more accurate bounds. The weight \(w\in [0,1]\) is the normalized location of \(\mu _Y\) on the interval \(\left[ {\underline{\mu }}_Y, \bar{\mu }_Y \right] \). A value of w close to zero (one) reflects that the lower (upper) bound is the most accurate bound on \(\mu _Y\). The bounds are equally accurate if \(w=\frac{1}{2}\).

Algorithm 1 stops after iteration k if \(\varepsilon _k< 10^{-6}\) with \(\varepsilon _k\) as in (30), or if \(k = 100\). Subsequently, the expression in (39) can be bounded. Table 6 reports several statistics for the case with two eigenvalues (\(d=2\)). The mean \(\mu _Y\) is in all cases of the same order of magnitude. The maximal relative error r is minimal when both extreme eigenvalues \(\lambda _{\min }\) and \(\lambda _{\max }\) are close to zero; a similar observation holds with \(d=20\) in Table 7 (eigenvalues equally spaced on the interval \(\left[ \lambda _{\min }, \lambda _{\max } \right] \)) and in Table 8 (10 eigenvalues at both \(\lambda _{\min }\) and \(\lambda _{\max }\)). Indeed, we can perfectly estimate \(\mu _Y = 1\) for the case where each eigenvalue equals zero.

It follows from w in Table 6 that with \(d=2\), the upper bound tends to be more accurate than the lower bound if \(\lambda _{\max }\) is high, thus when \(X^+\) tends to be large. This observation is reversed for \(d=20\) (Tables 7 and 8). Compared to \(\lambda _{\max }\), the value of \(\lambda _{\min }\) has a smaller impact on w. The accuracy decreases with the dimension d and the magnitude of the eigenvalues of A. More specifically, we observe a lower accuracy r if \(d=20\) and \(\max ( \left| \lambda _{\min } \right| , \left| \lambda _{\max } \right| ) \ge 0.3\).

The computation time in Tables 6, 7, 8 is the mean CPU time of 5,000 computations of each model. The standard error is at most 0.12 ms. The computation time and the number of iterations are lower in cases where \(\left| \lambda _{\min } \right| = \left| \lambda _{\max } \right| \).

5 Discussion and conclusions

This paper has presented an iterative algorithm that bounds the lower and upper partial moments at \(a\in {\mathbb {R}}\) of a random variable X with a known finite sequence of moments. In a numerical example, the higher order partial moments in particular can have narrow bounds. The obtained bounds imply bounds on unobserved higher order moments of X, which is useful for bounding moments of the transformation f(X). In another application, the transformation \(f(x)=e^x\) is considered for the quadratic form \(X=Z'AZ\) where Z is a multivariate normal distribution. Numerical experiments suggest that the obtained bounds on \({{\mathbb E}\!\left[ \exp (X)\right] }\) are most accurate if \(X^+\) is not too large. The accuracy depends on the dimension as well as the eigenvalues of A.