Multiscale stochastic optimization: modeling aspects and scenario generation
Abstract
Real-world multistage stochastic optimization problems are often characterized by the fact that the decision maker may take actions only at specific points in time, even if relevant data can be observed much more frequently. In such a case there are not only multiple decision stages present but also several observation periods between consecutive decisions, where profits/costs occur contingent on the stochastic evolution of some uncertainty factors. We refer to such multistage decision problems with encapsulated multiperiod random costs as multiscale stochastic optimization problems. In this article, we present a tailor-made modeling framework for such problems, which allows for a computational solution. We first establish new results related to the generation of scenario lattices and then incorporate the multiscale feature by leveraging the theory of stochastic bridge processes. All necessary ingredients of our proposed modeling framework are elaborated explicitly for various popular examples, including both diffusion and jump models.
Keywords
Stochastic programming · Scenario generation · Bridge process · Stochastic bridge · Diffusion bridge · Lévy bridge · Compound Poisson bridge · Simulation of stochastic bridges · Multiple time scales · Multihorizon · Multistage stochastic optimization

1 Introduction

Multiperiod models Decisions are made at the very beginning, whereas the consequences of these decisions depend on the development of a process over time. A typical example is a buy-and-hold portfolio strategy.

Multistage models Decisions can be made at regular moments in time. Typical examples are active portfolio strategies.

Supply network extension problems, where major decisions (such as whether to defer, to stage, to mothball, or to abandon a certain infrastructure investment opportunity; cf. [28]) can only be made at strategic time points (say, once every few years), but resulting profits/costs are subject to daily fluctuations of market prices.

Inventory control problems with limited storage capacity and backlogged/lost demand due to out-of-stock events, where procurement of goods is restricted by logistical constraints/time delays.

Structured portfolio investment problems, where rebalancing is possible only at given time points (say, once every few weeks due to product terms and conditions), but embedded barrier features make profits/losses depend on the full trajectory of asset prices.

Power plant management problems, where operating plans need to be fixed for a certain period ahead (say, once every few days due to physical constraints that prevent instant reaction to market conditions), but actual profits/losses depend on each tick of the energy market.
Looking only at the coarser decision scale, the requirements on the discrete structure are the same as for any standard multistage stochastic optimization problem. In general, there are three different strategies for the generation of discrete scenarios out of a sample of observed data, as illustrated in Fig. 2. Fans are not an appropriate structure for multistage decision problems, as they cannot reflect the evolution of information. Scenario trees are a popular tool in the literature. However, scenario trees are practically intractable for problems involving a large number of decision stages, due to their exponential growth over time. Therefore, one often reverts to scenario lattices in such cases. While the literature on the construction of scenario trees is relatively rich (see, e.g., [16, 18, 23, 32, 33]), the lattice construction literature is rather sparse. The state-of-the-art approach is based on the minimization of a distance measure between the targeted distribution and its discretization (“optimal quantization”); see [3, 25, 32]. In this article, we study a lattice generation method along the very upper path of Fig. 2. More precisely, it is a “direct” method for the case when a time-homogeneous Markovian diffusion model is selected in the first step. The approach is based purely on the infinitesimal drift and diffusion coefficient functions of a diffusion model and directly provides a scenario lattice, without requiring a simulation/quantization procedure. While the idea of such a discretization technique appeared already in an early paper by Pflug and Swietanowski [34], it has not yet been analyzed (or used) in the stochastic optimization literature (cf. the review article of Löhndorf [24]). We make the approach complete in this paper by proving a stability result and error estimate for the optimal value of a generic multistage stochastic optimization problem.
In particular, we show that the approximation error regarding the optimal value in the continuous (state space) diffusion model can be controlled when the suggested lattice generation method is applied.
Once the decision time scale has been discretized with a scenario tree/lattice model, a coherent approach for the finer observation time scale requires an interpolation that respects the laws of the underlying stochastic process. This brings us to the theory of stochastic bridges, i.e., processes pinned to a given value at the beginning and the end of a certain time period. We suggest using a simulation engine to generate a set of paths of the bridge process, and then computing expected profits/costs between decisions by Monte Carlo simulation. This requires a simulatable form of the bridge process. The stochastic processes literature seems to offer mainly abstract theory in this respect. There have been some articles on simulation methods (mainly acceptance-rejection methods) for diffusion bridges and jump-diffusion bridges in the statistical analysis literature since the early 2000s; see [8, 9, 14, 30, 36]. However, these methods are inefficient due to a possibly large rejection rate. To make our suggested modeling approach directly applicable, we work out explicitly the bridge process dynamics for some popular diffusion models, including geometric Brownian motion, the Vašíček model, and the Cox–Ingersoll–Ross model. Based on these dynamics, efficient simulation is possible by means of standard discretization schemes for stochastic differential equations. Moreover, we present a simulation scheme for the example of geometric Brownian motion, which operates directly on generated paths of the unconditioned process and thus enables an even more efficient generation of bridge process trajectories. If the cost function is particularly amenable (e.g., linear), a simulation might not even be required, as expected costs can be computed analytically in some models. We also include jump processes in our analysis, as we propose a simulation algorithm for compound Poisson bridges in the case of Normally, Exponentially, or Gamma distributed jump sizes.
In particular, we discuss the simulation of the number of jumps of the bridge process and derive the conditional distribution of each jump size given both the final value of the bridge process and the number of jumps in the interval.
The general contribution of this article is to propose a modeling framework and a corresponding scenario generation method, such that an efficient computational solution of multiscale stochastic optimization problems is possible. The details of this contribution are threefold. First, it consists of the general modeling idea, which is based on a consistent but separate scenario generation approach for the two involved time scales. Second, we analyze theoretically a widely unknown direct method for the construction of scenario lattices when the underlying stochastic model is of the diffusion type; this is purely related to the coarser decision time scale. Third, as regards the finer observation time scale, we elaborate the details of a consistent interpolation procedure for a number of popular modeling choices. This includes the presentation of a novel simulation algorithm for compound Poisson bridges.
The outline of the paper is as follows. Section 2 deals with the generation of discrete scenarios as a model for the information flow over the decision stages. In Sect. 3, we present the details related to the suggested interpolation approach for the information flow through the intermediate observation periods. Section 4 illustrates our modeling framework with a simple multiscale inventory control problem. Moreover, we discuss the applicability and the benefits of the proposed approach. We conclude in Sect. 5.
2 Scenario lattice generation for decision stages
Computational methods for stochastic optimization problems require discrete structures. For multistage problems, scenario trees are the standard models for the evolution of uncertainty over time. Scenario trees allow for general path-dependent solutions, as for each node there exists a unique path from the root of the tree. However, scenario trees grow exponentially in the number of stages, a fact that easily overwhelms any computer’s memory when it comes to practically sized problems.^{1} Therefore, if the underlying stochastic model is a Markov process, one typically discretizes it in the form of a scenario lattice. Lattice models are special cases of graph-based models, where a node does not necessarily have a unique predecessor. Different paths may then connect in a certain node at some later stage. In this way, one can obtain a rich set of paths with relatively few nodes.
The construction of scenario lattices typically works as a two-step procedure. First, one discretizes the marginal distributions for all stages. In a second step, one decides on the allowed state transitions and determines conditional transition probabilities between consecutive stages. The state-of-the-art method for such a lattice generation procedure is based on the stage-wise minimization of the (Wasserstein) distance between the modeled distribution (which is typically continuous) and its discretization on the lattice. A detailed description of this approach can be found in Löhndorf and Wozabal [25, Sect. 3.2].
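As a rough illustration of this two-step idea, the following sketch quantizes each stage's marginal by quantile midpoints of simulated paths and estimates transition probabilities by empirical counts. This is a simplified heuristic, not the Wasserstein-optimal procedure of [25]; the function name and the quantile rule are our own choices:

```python
def lattice_from_samples(paths, n_nodes):
    """Heuristic lattice construction from simulated paths: quantile-based
    quantization of each stage's marginal, plus empirical transition counts
    between consecutive stages (a simplification of [25, Sect. 3.2])."""
    T = len(paths[0])
    # Step 1: stage-wise node values = midpoints of equal-probability bins
    nodes = []
    for t in range(T):
        xs = sorted(p[t] for p in paths)
        nodes.append([xs[(2 * j + 1) * len(xs) // (2 * n_nodes)]
                      for j in range(n_nodes)])
    nearest = lambda v, cs: min(range(len(cs)), key=lambda j: abs(cs[j] - v))
    # Step 2: empirical conditional transition probabilities
    probs = []
    for t in range(T - 1):
        counts = [[0] * n_nodes for _ in range(n_nodes)]
        for p in paths:
            counts[nearest(p[t], nodes[t])][nearest(p[t + 1], nodes[t + 1])] += 1
        probs.append([[cnt / max(sum(row), 1) for cnt in row] for row in counts])
    return nodes, probs
```

Each row of a transition matrix sums to one (or to zero for nodes that no sample path visits).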
We will now study an alternative lattice generation method, which is not based on optimal quantization theory but rather relies on Markov chain approximation results. In particular, this approach allows one to construct a scenario lattice directly from the dynamics of a Markovian diffusion process.
2.1 Markov chain approximation for diffusion processes
Birth-and-death Markov chains are discrete stochastic processes defined on the integer grid, where each transition depends only on the current state and allows for three possibilities: to remain in the current state, to move one unit up, or to move one unit down. Many Markov chains can be approximated by a diffusion process; the approximation works by a transformation of the time scale and a renormalization of the state variable. The idea is explained, e.g., in the book of Karlin and Taylor [20, Ch. 15]. Pflug and Swietanowski [34] have looked at the problem from the converse perspective. They elaborate, without providing error estimates, that any diffusion process possessing a stationary distribution can be approximated by a birth-and-death Markov chain in the following way.

\(\vert \mu (x) - \mu (y) \vert \le L\cdot \vert x-y \vert \), \(\vert \sigma (x) - \sigma (y) \vert \le L\cdot \vert x-y \vert ,\)

\(\mu ^2(x) \le L^2\cdot (1+x^2)\), \(\sigma ^2(x) \le L^2\cdot (1+x^2),\)
Algorithm 2.1
 1. Choose a strictly monotonic, three times differentiable function H(x) with \(H^{\prime \prime }(0)\le M<\infty ,\) for some constant M, as well as functions g(x) and \(\tau (x)\) with \(\vert \tau (x)\vert \le 1\) for all x, in such a way that the drift and diffusion coefficient functions in (1) are matched:$$\begin{aligned} \mu (H(x))&= H^\prime (x)\,g(H(x)) + \frac{1}{2}H^{\prime \prime }(x)\,\tau ^2(H(x)), \\ \sigma (H(x))&= H^\prime (x)\,\tau (H(x)). \end{aligned}$$
 2. Determine the initial state \(i_0\) such that \(H(\frac{i_0}{2^N}) = x_0\).
 3. Define the transition probabilities for jumping up, jumping down, and remaining in the current state, respectively, by$$\begin{aligned}&p_{i,N}^{u} := \left[ \frac{1}{2} \left( \tau ^2\left( H\left( \frac{i}{2^N}\right) \right) + \frac{1}{2^N}\,g\left( H\left( \frac{i}{2^N}\right) \right) \right) \right] _0^{1} , \\&p_{i,N}^{d} := \left[ \frac{1}{2} \left( \tau ^2\left( H\left( \frac{i}{2^N}\right) \right) - \frac{1}{2^N}\,g\left( H\left( \frac{i}{2^N}\right) \right) \right) \right] _0^{1} , \\&p_{i,N}^{r} := 1 - p_{i,N}^{u} - p_{i,N}^{d}, \end{aligned}$$where \([x]_0^{1} := \min \{\max \{x,0\},1\}\).
 4. Define the piecewise constant (continuous-time) process \({\tilde{X}}^N\), where \({\tilde{X}}_t^N := {\tilde{X}}^{N}_{\lfloor 2^{2N}t\rfloor }\) lives in the states \(H\left( \frac{i}{2^N}\right) \), with \(\lfloor \cdot \rfloor \) denoting the floor function.
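The transition rule of step 3 can be sketched directly; the helper below (our own naming) evaluates the clamped probabilities at lattice index i. As a sanity check, the choice \(H(x)=x\), \(g\equiv 0\), \(\tau \equiv 1\) matches standard Brownian motion (\(\mu = 0\), \(\sigma = 1\)) and yields the symmetric probabilities 1/2, 1/2, 0:

```python
def lattice_transitions(H, g, tau, N, i):
    """Transition probabilities (up, down, remain) of the birth-and-death
    chain in step 3 of Algorithm 2.1; [x]_0^1 clamps to the unit interval."""
    clamp = lambda x: min(max(x, 0.0), 1.0)
    h = H(i / 2 ** N)
    p_up = clamp(0.5 * (tau(h) ** 2 + g(h) / 2 ** N))
    p_down = clamp(0.5 * (tau(h) ** 2 - g(h) / 2 ** N))
    return p_up, p_down, 1.0 - p_up - p_down
```

Note that for small N the clamping may be active when g is large relative to \(\tau^2\); as N grows, the drift contribution \(g/2^N\) vanishes relative to the diffusion term.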
While the idea of Algorithm 2.1 was originally presented in the early paper [34], it has not been analyzed yet in the context of stochastic optimization. We now make the approach complete by deriving an error estimate for the optimal value of a generic multistage stochastic optimization problem, when the underlying diffusion model is approximated by the method of Algorithm 2.1. We start with some preliminary results required for the proof.
Lemma 2.1
Proof
Bounds for diffusion processes can be found in the literature. We will use the following result.
Proposition 2.1
Proof
See the book of Platen and Heath [35, Lemma 7.8.1]. \(\square \)
We now establish that weak convergence implies convergence in Wasserstein distance, if the second moments of all involved probability measures are bounded.
Lemma 2.2
Proof
Theorem 2.1
Proof
The subsequent result bounds the difference in the value of a diffusion process at a certain future time, if it starts from different values at time zero.
Proposition 2.2
Proof
See Cox et al. [12, Cor. 2.19]. \(\square \)
With the above auxiliary results in hand, we now define a generic multistage stochastic optimization problem. The approximation quality of its optimal value, when the uncertainty process is modeled by a diffusion but approximated on the basis of Algorithm 2.1, is the object that we eventually want to analyze.
Definition 2.1
To interpret problem (10), the objective is to select a nonanticipative (constraint (10b)) decision policy x, which fulfills certain additional constraints (10a), in such a way that cumulative expected costs are minimized. One may think, for instance, in terms of portfolio losses \(C_t\) resulting from the stochastic evolution of the financial market \(\xi _t\) as well as the selected portfolio composition \(x_t \). Short-selling restrictions would then be an example of “additional constraints” on the decision process.
The concept of the Wasserstein distance^{2} between probability measures will be a key ingredient for our analysis of Algorithm 2.1 in terms of its approximation quality with respect to the optimal solution of GenMSP. In particular, we will rely on the following general stability result for the optimal solution of GenMSP, when the underlying probability model varies.
Proposition 2.3
We are now ready to formulate the main result of this section.
Theorem 2.2
Consider a GenMSP according to Definition 2.1. Let the uncertainty process \(\xi \) be modeled by a diffusion according to (1). Assume that the coefficient functions satisfy the regularity condition (8). Observe \(\xi \) in all decision stages \(t=0,\dots ,T\) of GenMSP and denote the resulting discrete-time, continuous state-space model by \({\mathbb {P}}\). Let \(\xi \) be discretized according to the Markov chain approximation method given in Algorithm 2.1 and denote the discrete model resulting from the Nth approximation by \(\tilde{{\mathbb {P}}}^{N}\). Then, the optimal value \(v^*(\tilde{{\mathbb {P}}}^{N})\) of the approximate problem tends to the optimal value \(v^*({\mathbb {P}})\) of the original problem, as \(N\rightarrow \infty \). For fixed N, an error estimate of the form (13) holds.
Proof
We want to show that \({\mathbb {P}}\) and \(\tilde{{\mathbb {P}}}^{N}\) satisfy the conditions (11) and (12), with \(\varepsilon _t\downarrow 0\) as N increases. Then, the statement follows readily from Proposition 2.3.
Thus, condition (12) is shown to be satisfied. \(\square \)
Remark
The rescaling of time was necessary in the construction of Algorithm 2.1 in order for Theorem 2.2 to hold. However, notice that the method in essence specifies a ternary transition rule. While blindly using the directly resulting ternary lattice would not rely on any supporting theory, it might still be interesting to test its performance, especially for problems with multiple observation periods but relatively few decisions.
3 Interpolating bridge processes
In Sect. 2, we discussed the generation of discrete scenario trees/lattices out of continuous parametric models, as it is typically required for the computational solution of any multistage stochastic optimization problem. For multiscale problems, a discretization of the information flow through all decision stages is not enough, as the stochasticity of the costs between the decision stages is an important factor. In such cases, we suggest drawing on the theory of stochastic bridge processes in order to simulate the behavior of the uncertainty process (with arbitrary granularity of the time increment) between consecutive decisions. In particular, this approach ensures the consistency of the finer multiperiod observation scale and the coarser decision scale by simulating trajectories for the multiperiod costs that connect two decision nodes with each other in a tree/lattice model.
In this section, we make our proposed modeling approach directly applicable by working out the details for several popular examples of stochastic models. In particular, we present a new simulation algorithm for compound Poisson bridges and derive the dynamics for a few diffusion bridge examples in explicit form. From the latter dynamics, a simulation engine can easily be implemented on the basis of any discretization scheme for stochastic differential equations.^{3}
3.1 Diffusion processes
We start with a generic multidimensional model with drift and multiple factors. Afterwards, we derive the bridge process dynamics explicitly for several special cases that are frequently used in the literature. The general theory for diffusion bridges is well-established (see [4, 13, 37]), but the literature is quite abstract. In particular, we are not aware of any standard textbook that offers explicit examples apart from the basic Brownian bridge. Our relatively simple proof of the subsequent theorem is a generalization and elaboration of the derivations contained in an unpublished manuscript by Lyons [26], which we found online.
Theorem 3.1
Proof
We subsequently focus on the onedimensional case. Let \(X_{t_2} = x_2\) be fixed for all examples below.
3.1.1 Pathwise construction of the bridge process for GBM
The subsequent result shows how to translate a set of Brownian motion trajectories into a set of geometric Brownian bridge trajectories. Thus, simulation of the GBM bridge is straightforward and requires only the generation of Gaussian random variables.
Proposition 3.1
Proof
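Proposition 3.1's exact construction is not reproduced in this excerpt. The sketch below relies instead on the standard fact that the logarithm of a GBM pinned at both endpoints is a Brownian bridge (the drift is absorbed by the conditioning), so a GBM bridge path can be obtained from a single simulated Brownian path. The function name and discretization are our own:

```python
import math
import random

def gbm_bridge_path(x1, x2, t1, t2, sigma, n_steps):
    """Pathwise GBM bridge sketch: exponentiate a Brownian bridge between
    log(x1) and log(x2); the GBM drift drops out under the pinning."""
    dt = (t2 - t1) / n_steps
    # simulate a standard Brownian path W on [0, t2 - t1]
    w, W = 0.0, [0.0]
    for _ in range(n_steps):
        w += random.gauss(0.0, math.sqrt(dt))
        W.append(w)
    path = []
    for j, Wt in enumerate(W):
        frac = j / n_steps                  # (t - t1) / (t2 - t1)
        bridge = Wt - frac * W[-1]          # Brownian bridge, zero at both ends
        logx = math.log(x1) + frac * (math.log(x2) - math.log(x1)) + sigma * bridge
        path.append(math.exp(logx))
    return path
```

By construction, the first and last path values coincide with the prescribed endpoints up to floating-point rounding.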
3.2 Jump processes
Stochastic processes that do not fluctuate in a continuous manner but rather by sudden jumps are popular models for a variety of applications. The majority of typical jump models belong to the class of Lévy processes. Lévy processes are stochastic processes characterized by independent and stationary increments as well as stochastic continuity.^{4} In addition to their prominence in the physical sciences,^{5} there is a particularly vast literature on Lévy processes as a model for the random evolution of variables present in the financial markets.^{6} As we are dealing with bridge processes here, let us mention the fact that the Markov property of Lévy bridges is inherited from the Markov property of Lévy processes [19, Proposition 2.3.1].
3.2.1 Compound Poisson bridges
The most fundamental and prominent jump process is the Poisson process, counting the number of occurrences of some random event. For the modeling of a situation where not only the number of those (quantifiable) events but also their size matters, the compound Poisson process is a natural extension. It is extensively used, e.g., for actuarial applications as insurance companies are naturally not only interested in the number of claims happening to their customers but even more importantly in the claim sizes.^{7}
We present a method to simulate sample paths from a compound Poisson bridge process, i.e., a compound Poisson process with given initial and final value (and time). For jump-size distribution families that are closed under convolution, or where convolution results in another tractable parametric family, some ingredients of our simulation scheme can be derived analytically and thus efficient simulation is possible. We carry out this exercise for the most popular representatives of jump-size distributions, i.e., the Normal distribution, the Exponential distribution, and the Gamma distribution. For distributions that do not allow for a tractable representation of the required convolution objects, one has to revert to statistical procedures such as acceptance-rejection methods.
Consider a compound Poisson process X with intensity \(\gamma \) and jump-size distribution given by the density f. To avoid notational conflicts, we reserve the lower index in \(X_t\) to describe the process X at time t. In contrast, we use an upper index to enumerate individual jumps (as random variables). The realization of the ith jump \(X^i\) is denoted by \(x_i\). Consider now the process \(X_t\) on the interval \([t_1,t_2]\), where we are given the values \(X_{t_1}\) and \(X_{t_2}\). Define \(c := X_{t_2} - X_{t_1}\). We suggest performing the simulation of the bridge process in the following three steps.
Having determined the value \({\bar{N}}\), one can then easily compute an approximation of \({\mathbb {P}} [N=n \vert \sum _{i=1}^N X^{i} = c]\), for all \(n=1,\ldots ,{\bar{N}}\), by cutting the sum in the denominator of (22) after the index \({\bar{N}}\). The cumulative distribution function is then obtained by summation of the individual probabilities, and inverse transform sampling yields a straightforward simulation scheme: apply the inverse to random draws from the uniform distribution on [0, 1].
II: Simulation of the jumping times Suppose that some value n for the number of jumps in the interval \([t_1,t_2]\) has been simulated by the method outlined above. Then, the precise jumping times are uniformly distributed over this interval. More precisely, the joint distribution of the jumping times \((\tau _1,\ldots ,\tau _n)\) equals the law of the order statistics of n independent uniform random variables on \([t_1,t_2]\) (cf., e.g., [10, Prop. 2.9]). Thus, the jumping times can easily be generated by another n calls of a standard (pseudo-)random number generator.
Remark
Notice that for the last jump \(X^n\) the conditional jump-size distribution is a Dirac distribution with all the mass centered in the remaining gap between the target value and the value of the process after the penultimate jump \(x_{n-1}\). Hence, the proposed simulation procedure hits the targeted value with probability one.
For Propositions 3.2–3.4 below, denote the jump-size distribution by F. We study the conditional distribution \({\hat{F}}_k\) of the kth jump \(X^k\) given the value of the sum of the remaining \((n-k+1)\) jumps, i.e., \(\sum _{i=k}^{n}X^i = c - \sum _{j=1}^{k-1}x_j =: C_k\).
Proposition 3.2
(Gaussian jumps) Let \(F\sim {\mathcal {N}}(\mu , \sigma ^2)\). Then, \({\hat{F}}_k^{\mathcal {N}}\) is a Normal distribution with mean \(\frac{C_k}{n-k+1}\) and variance \(\left( \frac{n-k}{n-k+1}\right) \sigma ^2\), for any \(1\le k\le n\).
Proof
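The three simulation steps can be assembled into a short sketch for the Gaussian case. Equation (22) is not reproduced in this excerpt; below we use the proportionality \({\mathbb {P}}[N=n \mid \sum X^i = c] \propto e^{-\gamma \Delta }(\gamma \Delta )^n/n! \cdot f^{*n}(c)\), with \(f^{*n}\) the \({\mathcal {N}}(n\mu , n\sigma ^2)\) density, for step I, and the conditional normals of Proposition 3.2 for step III. The function names and the cutoff handling are our own:

```python
import math
import random

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def simulate_cp_bridge_gaussian(t1, t2, c, gamma, mu, sigma, n_max=50):
    """One trajectory of a compound Poisson bridge with N(mu, sigma^2) jumps,
    pinned to the increment c over [t1, t2]; n_max plays the role of N-bar."""
    dt = t2 - t1
    # Step I: number of jumps via the truncated conditional pmf
    weights = [math.exp(-gamma * dt) * (gamma * dt) ** n / math.factorial(n)
               * normal_pdf(c, n * mu, n * sigma ** 2)
               for n in range(1, n_max + 1)]
    u = random.random() * sum(weights)
    n, acc = n_max, 0.0
    for idx, w in enumerate(weights, start=1):
        acc += w
        if u <= acc:
            n = idx
            break
    # Step II: jump times = order statistics of n uniforms on [t1, t2]
    times = sorted(random.uniform(t1, t2) for _ in range(n))
    # Step III: sequential jump sizes via the conditional normals of Prop. 3.2
    jumps, C = [], c
    for k in range(1, n + 1):
        m = n - k + 1                        # remaining jumps, incl. the kth
        var = sigma ** 2 * (m - 1) / m
        x = random.gauss(C / m, math.sqrt(var)) if var > 0 else C / m
        jumps.append(x)
        C -= x
    return times, jumps
```

As noted in the remark above, the last jump is deterministic (zero conditional variance), so the simulated jumps sum exactly to c.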
Proposition 3.3
Proof
Remark
The Lomax distribution (sometimes also called the Pareto type II distribution) is a special case of the generalized Pareto distribution (GP). In particular, it holds that \({\text {Lomax}}(\alpha , \beta ) \sim {\text {GP}}(0, 1/\alpha , \beta /\alpha )\). The GP distribution is typically contained in commercial software packages. Built-in functions can then be used for straightforward simulation of the Lomax distribution.
Remark
Observe in passing, as a quick crosscheck of the above results, that in both the Normal and the Exponential distribution case, the derived conditional distributions \({\hat{F}}_n\) of the last jump \(X^n\) have expectation \(C_n\) and zero variance.
Proposition 3.4
Proof
Remark
As the Gamma function \(\Gamma (\cdot )\) is only defined for strictly positive arguments, the case \(k=n\) is not covered in Proposition 3.4 above. However, this case has already been addressed in general above.
Remark
Remark
(Further Lévy processes) For most Lévy processes, the density function at a given future time is not available in (semi-)closed form. However, in some special cases, bridge processes turn out to be of a surprisingly tractable nature. In the dissertation of Hoyle [19], one can find results for 1/2-stable processes, Inverse Gaussian processes, and Cauchy processes, which imply that a simulation of the associated bridges can be performed in a straightforward way: in the first two cases, by applying a deterministic function to a random draw from the standard Normal distribution; in the third case, the cumulative distribution function is given in terms of standard functions.
4 Illustration by example
In this section, we discuss the proposed modeling approach by means of a prototypical example. Moreover, we report on an implementation in the context of a real-world industrial application. In order to focus on the essential characteristics of the class of multiscale stochastic optimization problems, we keep the purely illustrative example as simple as possible.
4.1 A simple inventory control problem
Consider a business where some (perishable) goods can be sold for a unit price a. The stock can be replenished each Monday morning at a price b per unit. During the week, the products are sold but the stock cannot be replenished. The demand varies. If the business runs out of stock, costs c occur depending on the remaining time until the next opportunity to refill the stock. Of the products left in stock at the end of the week, we assume that only 30% can still be used in the next week, while 70% must be thrown away.
As a model for the demand, we use the Vašíček model [see (18) in Sect. 3.1]. In particular, for the sake of simplicity we do not consider any seasonal patterns. Let the parameters of the Vašíček model be given by \(\theta = 105\), \(\kappa = 0.5\), \(\sigma = 10\), and the starting value \(x_0 = 100\). Three-stage problems are the smallest instances involving all issues that are typically connected to multistage decision making under uncertainty. Hence, the objective in the subsequent illustrative example is to maximize expected profits over the two upcoming weeks.
4.1.1 Modeling the problem
where \(\sigma (X)\) denotes the filtration generated by the demand process X. We set the problem parameters to \(a=10\), \(b=7\), and \(c=1000\). We observe the demand on an hourly basis, 24/7.
The key observation here is that profits depend (in a highly nonlinear fashion) on the whole demand trajectory, while a replenishment decision for the stock can only be made once a week. The path-dependency is due to the presence of the stopping times \(\tau _t\) in the objective. We apply our suggested methodology and generate a collection of paths between each pair of consecutive decision nodes. In this way, expected profits during the week can be computed by a simple Monte Carlo simulation. The SDE describing the Vašíček bridge process is given in (19).
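The bridge SDE (19) is not reproduced in this excerpt. As one possible alternative to discretizing it, the Gaussian transition density of the Vašíček (Ornstein–Uhlenbeck) model permits sampling the bridge by sequential conditioning: at each step, the one-step transition is combined with the Gaussian likelihood of reaching the pinned terminal value. The following sketch (function name and step logic are ours, based on the standard conjugate-Gaussian update) uses the parameters of the example:

```python
import math
import random

def vasicek_bridge_path(x1, x2, t1, t2, kappa, theta, sigma, n_steps):
    """Sequential Gaussian-conditioning sketch of a Vasicek (OU) bridge,
    pinned to x1 at t1 and x2 at t2."""
    dt = (t2 - t1) / n_steps
    path, x = [x1], x1
    for j in range(1, n_steps):
        # one-step OU transition: X_{s+dt} | X_s = x ~ N(m1, v1)
        a1 = math.exp(-kappa * dt)
        v1 = sigma ** 2 * (1 - a1 ** 2) / (2 * kappa)
        m1 = theta + (x - theta) * a1
        # Gaussian likelihood of hitting x2 over the remaining time tau2
        tau2 = t2 - (t1 + j * dt)
        a2 = math.exp(-kappa * tau2)
        v2 = sigma ** 2 * (1 - a2 ** 2) / (2 * kappa)
        # conjugate-Gaussian posterior for the next state
        prec = 1 / v1 + a2 ** 2 / v2
        mean = (m1 / v1 + a2 * (x2 - theta * (1 - a2)) / v2) / prec
        x = random.gauss(mean, math.sqrt(1 / prec))
        path.append(x)
    path.append(x2)                         # pinned endpoint
    return path
```

For the example one would call, e.g., `vasicek_bridge_path(n1, n2, 0.0, 1.0, 0.5, 105.0, 10.0, 168)` to obtain an hourly demand trajectory between two weekly decision nodes with values n1 and n2.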
4.1.2 Discretization of decision stages
Lattice via Algorithm 2.1—number of nodes at stage t, Nth iteration
4.1.3 Comparison with other modeling approaches

Large trees/lattices In principle, one can understand any multiscale problem as a standard multistage problem, where the constraints rule out that decisions are made on the finer observation scale. After all, both scales are associated with the same underlying process. Then, one might simply use a very large tree/lattice model for the uncertainty process, which branches at each observation time. However, this will typically result in computational intractability. Even for the very small illustrative example discussed in this section, where there are only two decision points, hourly observation would already require a structure with 336 branching times. Any tree model would clearly explode even for much smaller instances. A ternary lattice model would involve more than 100,000 nodes. Compared to our approach, it would require massive resources to construct such a lattice, store it, and compute a solution on it.

Reduced trees If multistage problems grow too large to be modeled on regular scenario trees, it seems popular in the applied stochastic programming literature to use trees that only branch irregularly, i.e., certain branches remain constant up to/after a certain time (cf. [15, 16, 17, 29]). However, this amounts to using clairvoyant branches, along which a computed policy does not reflect the uncertainty faced by the decision maker. In fact, using such degenerate trees violates the fundamentals of multistage stochastic optimization, which is based exactly on the idea of (direct) stochastic lookahead policies. Our approach, on the other hand, does not turn the decision maker clairvoyant up to/after a certain time and is hence perfectly aligned with the fundamental paradigm that a decision policy must reflect the uncertainty faced by the decision maker at any point in time. For our example, the reduction of a tree with 336 branching times to a computational instance would need to be so massive that basically a fan with very few branchings would remain.

Deterministic interpolation function Given a tree/lattice model for the information flow over the decision stages, one might simply choose a rudimentary interpolation approach, such as a constant or linear interpolation function, to compute the multiperiod costs between decisions. However, this is inconsistent, as both the decision and the observation scale are actually associated with one and the same uncertainty process. It would amount to completely removing the stochasticity between decisions, whereas our approach takes into account the random fluctuations along the way from one decision node to the other. In the context of our example, a constant interpolation would be completely meaningless, as it would correspond to assuming that all the selling activity occurs in a single instant of time each week. A linear interpolation would simply not be in line with the essence of the problem: one does not know in advance if/when one will run out of stock during the week.

Multihorizon stochastic programming A solution approach for a class of problems which are of a similar flavour, yet crucially different in nature, is called multihorizon stochastic programming (see [21, 27, 41, 42, 46, 48]). Infrastructure planning problems, the original motivation of Kaut et al. [21], typically involve (rarely occurring) strategic decisions as well as operational tasks (daily business). To overcome the above-mentioned memory issue resulting from frequently branching scenario trees, the authors of [21] suggest starting with a tree for the strategic scale only. In a second step, they attach another tree to each node of the strategic-scale tree. The key assumption for the multihorizon stochastic programming approach to be appropriate is that the strategic scale and all operational scales are independent of each other. In contrast, the approach suggested in the present paper is designed for problems where the two scales are clearly related to the same uncertainty process. Therefore, our approach ensures that different scenarios in between consecutive decisions are eventually bundled in one node (by leveraging the theory of stochastic bridges). Moreover, for our illustrative example each of the “operational” trees would still require 168 branching times, such that again serious simplification would be required to make the approach computationally tractable.

In summary, our proposed modeling approach

- respects the stochasticity in between decisions,

- ensures the consistency of all involved scales with respect to a single uncertainty process, and

- keeps the problem computationally tractable.
4.1.4 Numerical illustration
We have discussed above the qualitative strengths of our modeling approach. The simple example that we used to illustrate our explanations shall now serve to demonstrate two important aspects numerically. First, an appropriate modeling approach becomes more important the stronger the path-dependency of the multiperiod costs is. In our example, this path-dependency grows with the value of the parameter c, which represents the costs incurred during the time span when the agent is out of stock. The second aspect is that the way the multiperiod costs are modeled has a considerable impact on the resulting optimal value, even if the cost structure does not depend heavily on a particular path. Even if we set \(c=0\), we observe an overestimation of the optimal value by almost 3% if we use an ad-hoc linear interpolation instead of our consistent modeling approach. Notice that this impact already arises in a problem with only two decision stages.
Numerical illustration of the impact of using our modeling approach versus an inconsistent linear interpolation heuristic

c      Opt. value (bridge process)   Opt. value (linear interp.)   Impact (%)
0      562                           577                           +2.7
10     561                           576                           +2.7
100    556                           571                           +2.7
500    534                           552                           +3.4
1000   507                           529                           +4.3
5000   419                           464                           +10.7
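The impact column can be reproduced directly from the two optimal-value columns. The following snippet (the values are taken from the table above; the recomputation is ours) computes the relative overestimation of the linear-interpolation heuristic, rounded to one decimal.

```python
# Table rows: c, optimal value with bridge process, optimal value with linear interpolation.
rows = [
    (0, 562, 577),
    (10, 561, 576),
    (100, 556, 571),
    (500, 534, 552),
    (1000, 507, 529),
    (5000, 419, 464),
]

for c, bridge, linear in rows:
    impact = 100.0 * (linear - bridge) / bridge  # relative overestimation in %
    print(f"c={c:5d}: +{impact:.1f}%")
```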
4.2 A realworld application
We have implemented the modeling approach suggested in this paper in the context of an industrial project dealing with the valuation of a thermal power plant. While the focus of that project lay on the incorporation of model ambiguity into a value-function approximation policy, the valuation problem itself was presented to us in the form of a classical multiscale stochastic optimization problem: operating plans for the power plant must be fixed on a weekly basis (for management purposes of all involved resources), but the intra-week profits resulting from the most recent decision depend on (uncertain) market prices that are observed in 4-hour blocks. It is thus required to model a weekly decision scale with 42 observation periods (7 days of 6 blocks each) within each week.
A classical tree model over all observation periods would be intractable even for a single week. If we use a ternary lattice model instead, the first week already involves 1764 nodes, the second week requires 5292 additional nodes, and modeling a quarter of a year with a ternary lattice involves about 300 thousand nodes in total. With our approach, on the other hand, the lattice model is built on a much coarser time granularity, discretizing only the information flow on the weekly decision scale. A time horizon of a quarter of a year then involves only 196 nodes on a (ternary) lattice. Considering the finer observation scale, such a lattice involves 507 different arcs, along which an interpolation is required. An inconsistent interpolation approach would distort the expected costs in each such intra-week segment.
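The node counts above can be reproduced under one plausible counting convention for a recombining ternary lattice (our assumption, chosen so that the figures match): level i carries 2i - 1 nodes, hence L levels carry L^2 nodes in total, and every node below the last level emits 3 arcs.

```python
def lattice_nodes(levels):
    """Total nodes of a recombining ternary lattice with 2*i - 1 nodes at level i."""
    return sum(2 * i - 1 for i in range(1, levels + 1))  # equals levels**2

def lattice_arcs(levels):
    """Each node below the terminal level has 3 outgoing arcs."""
    return 3 * lattice_nodes(levels - 1)

# Fine (4-hour) observation scale: 42 periods per week.
print(lattice_nodes(42))                      # first week: 1764 nodes
print(lattice_nodes(84) - lattice_nodes(42))  # second week: 5292 additional nodes
print(lattice_nodes(13 * 42))                 # a quarter (13 weeks): 298116 nodes
# Coarse weekly decision scale over a quarter: 14 levels.
print(lattice_nodes(14))                      # 196 nodes
print(lattice_arcs(14))                       # 507 arcs
```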
The multiscale modeling approach of the present paper proved very useful for this practically-sized problem. In fact, the power plant model, as presented to us by our industry partner, turned out to be of such a tractable form that the expected intra-week costs could even be calculated by an analytical formula, based on the derived bridge-process dynamics for the underlying uncertainty process. If a simulation were required instead, the computation would obviously be slower. Still, the approach allows for a scenario-wise decomposition with respect to the decision time scale, i.e., of the tree/lattice model. Thus, computational tractability is typically not limited by the multiscale feature of a problem when our modeling approach is applied.
The studied valuation problem involves an extensive model of the power plant and is based on real data provided to us by the operating energy company. Thus, we refer the reader to our separate paper [44] for all the details.
5 Conclusion
In this article, we have proposed a computational modeling framework for multistage stochastic optimization problems with embedded multiperiod problems, a problem class that we call multiscale stochastic optimization. The suggested approach is based on a separation between the (standard) multistage decision problem and the problem of determining path-dependent costs between two consecutive decisions. The paper contributes to both parts. One section was dedicated to the construction of scenario lattices as a discrete structure representing a time-homogeneous Markovian diffusion model. In particular, we examined a Markov-chain approximation approach and showed that the approximation error with respect to the optimal value of a generic multistage stochastic optimization problem can be controlled with the suggested methodology. In a second part, we suggested leveraging the theory of stochastic bridges in order to tackle the embedded multiperiod problem, which takes place on a much finer time scale than the decision scale. We elaborated explicitly several examples of popular diffusion models and proposed a new simulation algorithm for compound Poisson bridges. A simple multiscale inventory control problem finally served to illustrate the proposed methodology and to discuss it in the context of a concrete example. Moreover, we reported on an implementation as part of a real-world industrial project, where our approach turned out to be very convenient. The latter may be seen as a proof of concept.
Footnotes
 1.
For the simplest form of a binary tree (which typically will be a rather poor uncertainty model), hourly decisions over a time horizon of one day correspond to about 17 million nodes, daily decisions over one month give about 1 billion nodes, and weekly decisions over one year result in on the order of \(10^{15}\) nodes.
 2.
The definition of the Wasserstein distance can be found in the “Appendix”.
 3.
See, e.g., the book of Kloeden and Platen [22] for a detailed treatment.
 4.
 5.
 6.
 7.
The book of Asmussen and Albrecher [2] includes a comprehensive treatment of the compound Poisson model in risk theory, including not only an exhaustive list of its properties but also a discussion of its wide range of applications. In particular, the problem studied in [2, Chapter V, p. 146] is related in flavor to the problem of this section: they characterize a sample path in the compound Poisson risk model given that it leads to ruin.
 8.
The density function of the \({\text {Lomax}}(\alpha ,\beta )\) distribution is given by
$$\begin{aligned} f_{\text {Lom}}(y;\alpha ,\beta ) = \frac{\alpha }{\beta } \left( 1+\frac{y}{\beta }\right) ^{-(\alpha +1)}, \end{aligned}$$
where \(\alpha \) is a shape parameter and \(1/\beta \) is a scale parameter.
 9.
The density function of the generalized Beta distribution of the first kind is given by
$$\begin{aligned} f_{\text {GB1}}(y; a,b,p,q) = \frac{\vert a \vert \, y^{ap-1} \left( 1-\left( \frac{y}{b}\right) ^{a}\right) ^{q-1}}{b^{ap}\,B(p,q)}, \end{aligned}$$
where \(B(\cdot ,\cdot )\) denotes Euler’s Beta function.
Acknowledgements
Open access funding provided by University of Vienna.
References
 1. Applebaum, D.: Lévy Processes and Stochastic Calculus. Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge (2004)
 2. Asmussen, S., Albrecher, H.: Ruin Probabilities. Advanced Series on Statistical Science & Applied Probability. World Scientific, Singapore (2010)
 3. Bally, V., Pagès, G.: A quantization algorithm for solving multidimensional discrete-time optimal stopping problems. Bernoulli 9(6), 1003–1049 (2003)
 4. Barczy, M.: Diffusion bridges and affine processes. Habilitation thesis, University of Debrecen, Hungary (2015)
 5. Barndorff-Nielsen, O.E., Mikosch, T., Resnick, S.I.: Lévy Processes: Theory and Applications. Birkhäuser, Boston (2001)
 6. Bertoin, J.: Lévy Processes. Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge (1998)
 7. Billingsley, P., Topsøe, F.: Uniformity in weak convergence. Z. Wahrscheinlichkeitstheorie Verwandte Geb. 7(1), 1–16 (1967)
 8. Bladt, M., Finch, S., Sørensen, M.: Simulation of multivariate diffusion bridges. J. R. Stat. Soc. B 78(2), 343–369 (2016)
 9. Bladt, M., Sørensen, M.: Simple simulation of diffusion bridges with application to likelihood inference for diffusions. Bernoulli 20(2), 645–675 (2014)
 10. Cont, R., Tankov, P.: Financial Modelling with Jump Processes. Chapman & Hall, Boca Raton (2004)
 11. Cox, J.C., Ingersoll, J.E., Ross, S.: A theory of the term structure of interest rates. Econometrica 53(2), 385–407 (1985)
 12. Cox, S., Hutzenthaler, M., Jentzen, A.: Local Lipschitz continuity in the initial value and strong completeness for nonlinear stochastic differential equations. Technical report 2013-35, ETH Zurich (2013). arXiv:1309.5595v2
 13. Fitzsimmons, P., Pitman, J., Yor, M.: Markovian bridges: construction, Palm interpretation, and splicing. In: Çinlar, E., Chung, K.L., Sharpe, M.J. (eds.) Seminar on Stochastic Processes, 1992, pp. 101–134. Birkhäuser, Boston (1993)
 14. Gonçalves, F.B., Roberts, G.O.: Exact simulation problems for jump-diffusions. Methodol. Comput. Appl. Probab. 16(4), 907–930 (2014)
 15. Gröwe-Kuska, N., Heitsch, H., Römisch, W.: Scenario reduction and scenario tree construction for power management problems. In: 2003 IEEE Bologna Power Tech Conference Proceedings, vol. 3 (2003)
 16. Heitsch, H., Römisch, W.: Scenario tree modeling for multistage stochastic programs. Math. Program. 118(2), 371–406 (2009)
 17. Heitsch, H., Römisch, W.: Scenario tree reduction for multistage stochastic programs. Comput. Manag. Sci. 6(2), 117–133 (2009)
 18. Høyland, K., Wallace, S.W.: Generating scenario trees for multistage decision problems. Manag. Sci. 47(2), 295–307 (2001)
 19. Hoyle, A.E.V.: Information-based models for finance and insurance. Ph.D. thesis, Imperial College London (2010)
 20. Karlin, S., Taylor, H.M.: A Second Course in Stochastic Processes, vol. 2. Elsevier Science, Amsterdam (1981)
 21. Kaut, M., Midthun, K., Werner, A., Tomasgard, A., Hellemo, L., Fodstad, M.: Multi-horizon stochastic programming. Comput. Manag. Sci. 11(1), 179–193 (2014)
 22. Kloeden, P.E., Platen, E.: Numerical Solution of Stochastic Differential Equations. Stochastic Modelling and Applied Probability. Springer, Berlin (2011)
 23. Kovacevic, R., Pichler, A.: Tree approximation for discrete time stochastic processes: a process distance approach. Ann. Oper. Res. 235(1), 395–421 (2015)
 24. Löhndorf, N.: An empirical analysis of scenario generation methods for stochastic optimization. Eur. J. Oper. Res. 255(1), 121–132 (2016)
 25. Löhndorf, N., Wozabal, D.: Gas storage valuation in incomplete markets (2019). http://www.optimizationonline.org/DB_FILE/2017/02/5863.pdf. Accessed 18 July 2019
 26. Lyons, S.M.J.: Introduction to stochastic differential equations. Technical report, School of Informatics, University of Edinburgh (2013)
 27. Maggioni, F., Allevi, E., Tomasgard, A.: Bounds in multi-horizon stochastic programs. Ann. Oper. Res. 12, 1–21 (2018)
 28. Maier, S., Pflug, G.Ch., Polak, J.W.: Valuing portfolios of interdependent real options under exogenous and endogenous uncertainties. Eur. J. Oper. Res. (2019). https://doi.org/10.1016/j.ejor.2019.01.055
 29. Moriggia, V., Kopa, M., Vitali, S.: Pension fund management with hedging derivatives, stochastic dominance and nodal contamination. Omega 87, 127–141 (2019)
 30. Papaspiliopoulos, O., Roberts, G.: Importance sampling techniques for estimation of diffusion models. In: Kessler, M., Lindner, A., Sørensen, M. (eds.) Statistical Methods for Stochastic Differential Equations. Chapman & Hall/CRC Monographs on Statistics & Applied Probability, Chapter 4, pp. 311–335. Taylor & Francis, Boca Raton (2012)
 31. Pflug, G.Ch., Pichler, A.: A distance for multistage stochastic optimization models. SIAM J. Optim. 22(1), 1–23 (2012)
 32. Pflug, G.Ch., Pichler, A.: Multistage Stochastic Optimization. Springer Series in Operations Research and Financial Engineering. Springer, Berlin (2014)
 33. Pflug, G.Ch.: Scenario tree generation for multiperiod financial optimization by optimal discretization. Math. Program. 89(2), 251–271 (2001)
 34. Pflug, G.Ch., Swietanowski, A., Dockner, E.J., Moritsch, H.: The AURORA financial management system: model and parallel implementation design. Ann. Oper. Res. 99(1–4), 189–206 (2000)
 35. Platen, E., Heath, D.: A Benchmark Approach to Quantitative Finance. Springer Finance. Springer, Berlin (2006)
 36. Pollock, M.: On the exact simulation of (jump) diffusion bridges. In: Proceedings of the 2015 Winter Simulation Conference, WSC ’15, pp. 348–359. IEEE Press, Piscataway, NJ (2015)
 37. Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion. Springer, Berlin (2004)
 38. Sato, K.I.: Lévy Processes and Infinitely Divisible Distributions. Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge (1999)
 39. Schoutens, W.: Lévy Processes in Finance: Pricing Financial Derivatives. Wiley Series in Probability and Statistics. Wiley, New York (2003)
 40. Schoutens, W., Cariboni, J.: Lévy Processes in Credit Risk. The Wiley Finance Series. Wiley, New York (2010)
 41. Seljom, P., Tomasgard, A.: The impact of policy actions and future energy prices on the cost-optimal development of the energy system in Norway and Sweden. Energy Policy 106(C), 85–102 (2017)
 42. Skar, C., Doorman, G., Pérez-Valdés, G.A., Tomasgard, A.: A multi-horizon stochastic programming model for the European power system. CenSES working paper 2/2016, NTNU Trondheim (2016). ISBN 9788293198130
 43. Vallender, S.S.: Calculation of the Wasserstein distance between probability distributions on the line. Theory Probab. Appl. 18(4), 784–786 (1974)
 44. van Ackooij, W., Escobar, D., Glanzer, M., Pflug, G.Ch.: Distributionally robust optimization with multiple time scales: valuation of a thermal power plant (2019). http://www.optimizationonline.org/DB_HTML/2019/04/7157.html. Accessed 18 July 2019
 45. Vašíček, O.: An equilibrium characterization of the term structure. J. Finan. Econ. 5(2), 177–188 (1977)
 46. Werner, A.S., Pichler, A., Midthun, K.T., Hellemo, L., Tomasgard, A.: Risk measures in multi-horizon scenario trees, pp. 177–201. Springer, Boston (2013)
 47. Woyczyński, W.A.: Lévy processes in the physical sciences, pp. 241–266. Birkhäuser, Boston (2001)
 48. Zhonghua, S., Egging, R., Huppmann, D., Tomasgard, A.: A multi-stage multi-horizon stochastic equilibrium model of multi-fuel energy markets. CenSES working paper 2/2016, NTNU Trondheim (2015). ISBN 9788293198154
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.