A Bluff-and-Fix Algorithm for Polynomial Chaos Methods
Abstract
Stochastic Galerkin methods can be used to approximate the solution to a differential equation in the presence of uncertainties represented as stochastic inputs or parameters. The strategy is to express the resulting stochastic solution using \(M + 1\) terms of a polynomial chaos expansion and then derive and solve a deterministic, coupled system of PDEs with standard numerical techniques. One of the critical advantages of this approach is its provable convergence as \(M\) increases. The challenge is that solving the M system cannot easily reuse an already-computed solution to the \(M-1\) system. We present a promising iterative strategy to address this issue. Numerical estimates of the accuracy and efficiency of the proposed algorithm (bluff-and-fix) demonstrate that it can be more effective than using monolithic methods to solve the whole system of \(M + 1\) equations directly.
Keywords
Polynomial chaos · Galerkin projections · Stochastic differential equations · Numerical PDE solvers · Spectral methods
1 Introduction
Uncertainty quantification (UQ) in physical models governed by systems of partial differential equations is important to build confidence in the resulting predictions. A common approach is to represent the sources of uncertainty as stochastic variables; in this context the solution to the original differential equations becomes random. Stochastic Galerkin schemes (SGS) are used to approximate the solution to parametrized differential equations. In particular, they utilize a functional basis on the parameter to express the solution and then derive and solve a deterministic system of PDEs with standard numerical techniques [2]. A Galerkin method projects the randomness in a solution onto a finite-dimensional basis, making deterministic computations possible. SGS are part of a broader class known as spectral methods.
The most common UQ strategies involve Monte Carlo (MC) algorithms, which suffer from a slow convergence rate proportional to the inverse square root of the number of samples [5]. If each sample evaluation is expensive — as is often the case for the solution of a PDE—this slow convergence can make obtaining tens of thousands of samples computationally infeasible. Initial spectral method applications to UQ problems showed orders-of-magnitude reductions in the cost needed to estimate statistics with comparable accuracy [3].
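The \(N^{-1/2}\) Monte Carlo rate is easy to observe numerically. The following sketch (an illustration, not from the paper) estimates the second moment of a standard normal variable by plain Monte Carlo and checks that quadrupling the sample count roughly halves the average error:

```python
import random

def mc_error(n_samples, n_trials=200, seed=0):
    """Average |estimate - truth| when estimating E[xi^2] = 1 for xi ~ N(0,1)
    from n_samples Monte Carlo draws, averaged over n_trials repetitions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        est = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_samples)) / n_samples
        total += abs(est - 1.0)
    return total / n_trials

# Error decays like n^{-1/2}: 4x the samples gives roughly half the error.
e_small, e_large = mc_error(250), mc_error(1000)
```

Since each "sample" here is cheap, the slow rate is harmless; when each sample requires a PDE solve, it is the dominant cost.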
In the present approach for SGS, the unknown quantities are expressed as an infinite series of orthogonal polynomials in the space of the random input variable. This representation has its roots in the work of Wiener [7], who expressed a Gaussian process as an infinite series of Hermite polynomials. Ghanem and Spanos [4] truncated Wiener’s representation and used the resulting finite series as a key ingredient in a stochastic finite element method. SGS based on polynomial expansions are often referred to as polynomial chaos approaches.
Regarding the statistics of \(\xi \), we require that both \(u(x, t; \xi )\) and \(f(x, \xi )\) have finite second moments — and in accordance with \(u(0, t;\xi ) = 0\), we assume \(f(0, \xi ) = 0\) as well.^{2} Note that these are the only restrictions; namely, even though f is chosen as sinusoidal in the example of Sect. 2, we do not require f to be periodic, bounded over the real line, zero on the whole boundary \(\partial \mathcal {D}\), etc.
2 Inviscid Burgers’ Equation
Our choice of orthogonal polynomials \(\Psi _k\) will rely on the distribution of the \(\xi \) random variable. Throughout this paper, we will choose \(\xi \sim \mathcal {N}(0,1)\) and the \(\Psi _k\) to be Hermite polynomials; however, many of the results apply almost identically to other distributions and their corresponding polynomials.
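For \(\xi \sim \mathcal {N}(0,1)\), the \(\Psi _k\) are the probabilists' Hermite polynomials, which satisfy \(\mathbb {E}[\Psi _j(\xi )\Psi _k(\xi )] = k!\,\delta _{jk}\). A quick numerical check of this orthogonality via the three-term recurrence and Gauss-Hermite quadrature (a sketch for illustration):

```python
import math
import numpy as np

def hermite_e(k, x):
    """Probabilists' Hermite polynomial He_k(x) via the recurrence
    He_{n+1}(x) = x*He_n(x) - n*He_{n-1}(x)."""
    if k == 0:
        return np.ones_like(x)
    h_prev, h = np.ones_like(x), x
    for n in range(1, k):
        h_prev, h = h, x * h - n * h_prev
    return h

# Gauss-Hermite_e nodes/weights for the weight exp(-x^2/2); rescaling by
# 1/sqrt(2*pi) turns quadrature sums into expectations under N(0,1).
nodes, weights = np.polynomial.hermite_e.hermegauss(30)
weights = weights / math.sqrt(2.0 * math.pi)

def expect_product(j, k):
    """E[He_j(xi) * He_k(xi)] for xi ~ N(0,1); exact for j + k < 60."""
    return float(np.sum(weights * hermite_e(j, nodes) * hermite_e(k, nodes)))
```

The \(k!\) normalization is the source of the \(\frac{1}{k!}\) factors appearing later in the Galerkin projections.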
Lemma 1
Proof
See Appendix.
Equations (6) and (7), along with the last term in the LHS of Eq. (8), will be used to inform an algorithm to solve the M system using the solution to the \(M-1\) system. This is described in the following section.
3 Bluff-and-Fix (BNF) Algorithm
We will use \(\mathbf {u}^{(M)}\) to denote the \((M+1) \times 1\) vector of functions that solves the M system. Similarly, \(u_k^{(M)}\) is the kth component function of \(\mathbf {u}^{(M)}\). When a component function \(u_k^{(M)}(x, t)\) is written without the superscript, i.e. as \(u_k(x, t)\), the value of M is considered to be a fixed but arbitrary positive integer.
3.1 One Step Bluff-and-Fix
From (12), we know there is a discrepancy between \(u_k^{(M-1)}\) and \(u_k^{(M)}\) for \(k < M.\) Regardless, we can bluff and use the solutions we already have, \(u_0^{(M-1)}, \ldots , u_{M-1}^{(M-1)}\), to solve for some approximation of \(u_M^{(M)}\), which we call \(\widehat{u}_M^{ (M)},\) and then back-substitute \(\widehat{u}_M^{ (M)}\) into the previous M equations of the M system to solve for \(\widehat{u}_0^{ (M)}, \ldots , \widehat{u}_{M-1}^{ (M)}\). This approach is potentially more efficient than computing the solution of the M system directly via classic monolithic methods. However, we will opt for an algorithm with even less computation time.
A workable strategy is based on a similar idea. Instead of re-computing all of \(\widehat{u}_0^{ (M)}, \ldots , \widehat{u}_{M-1}^{ (M)}\) after obtaining \(\widehat{u}_M^{ (M)}\), we re-solve for the least accurate \(\widehat{u}^{ (M)}_k\) at the same time as solving for \(\widehat{u}_M^{ (M)}.\) That is, we only correct the \(\widehat{u}_k^{ (M)}\) that we believe will be the worst approximations of their corresponding \(u_k^{(M)}\). The \(\mathcal {I}\) in Algorithm 1 is the collection of correction indices, i.e. the indices k denoting which \(u^{(M)}_k\) approximations are corrected.
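The control flow of Algorithm 1 can be sketched as below. Here `solve_coupled` is a hypothetical stand-in for the numerical solve of the coupled PDEs restricted to the unknown indices (with the bluffed components held fixed); it is not the paper's implementation.

```python
def one_step_bnf(u_prev, correction_indices, solve_coupled):
    """One step bluff-and-fix (sketch).

    u_prev             -- the M functions u_0, ..., u_{M-1} solving the (M-1) system
    correction_indices -- the set I of indices whose approximations are re-solved
    solve_coupled      -- callback solving the coupled equations for the given
                          unknown indices while the bluffed components stay fixed
    Returns the M+1 approximations u_hat_0, ..., u_hat_M of the M system.
    """
    M = len(u_prev)                             # previous system has components 0..M-1
    unknown = set(correction_indices) | {M}     # u_M itself is always solved for
    # Bluff: carry over every component we are not correcting.
    u_hat = {k: u_prev[k] for k in range(M) if k not in unknown}
    # Fix: solve the coupled equations for index M plus the correction set.
    u_hat.update(solve_coupled(sorted(unknown), u_hat))
    return [u_hat[k] for k in range(M + 1)]
```

With correction size \(c\), the set `correction_indices` together with the always-solved index \(M\) has \(c\) elements, matching \(|\mathcal {I}| = c\).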
Algorithm 1 poses two immediate questions.
- 1.
How large should we make the correction size \(c \in \{1, \ldots , M\}\)?
- 2.
How should we choose the correction indices \(\mathcal {I}\)? That is, how do we pick which \(\widehat{u}_k^{(M)}\) approximations should be fixed?
To address question 1, note that if \(c = M\), then the one step bluff-and-fix (BNF) is equivalent to solving the entire M system directly. In this case, we use \(\mathbf {u}^{(M-1)}\) to set none of the approximations \(\widehat{u}_k^{\,(M)}\) (no bluffing), and the entire coupled system \(\{ \widehat{u}_k^{ (M)}\}_{k=0}^M\) of \(M + 1\) equations is solved by standard numeric techniques.
For the other extreme, choosing \(c = 1\) corresponds to ignoring the difference between the \(M-1\) and M system solutions as much as possible; we set \(\{ \widehat{u}_k^{ (M)}\}_{k=0}^{M-1} \leftarrow \mathbf {u}^{(M-1)}\) and solve only a single PDE for \(\widehat{u}_M^{ (M)}.\) Heuristically, c can represent a trade-off between accuracy and efficiency. The hypothesis is that by choosing which \(\widehat{u}_k^{\,(M)}\) to correct judiciously, we can still well-approximate \(\mathbf {u}^{(M)}\) when \(c < M.\)
This brings us to the second posed question. To determine the correction indices \(\mathcal {I}\), we target the \(u^{(M)}_k\) for which \(u_k^{(M-1)}\) and \(u_k^{(M)}\) differ the most, so that the approximation \(\widehat{u}^{(M)}_k =u^{(M-1)}_k\) is a poor one. While we cannot examine \(\Vert \widehat{u}^{ (M)}_k - u_k^{(M)}\Vert \) directly, we can see from (12) that the difference between the numeric solutions \(u^{(M)}_k\) and \(u_k^{(M-1)}\) arises from the function \(G_k\). Ideally, then, \(\mathcal {I}\) should include \(k^* = {{\,\mathrm{arg\,max}\,}}_{k \in \{0, \ldots, M-1\}} \left\| G_k \right\| ,\) where \(\Vert \cdot \Vert \) denotes some function norm over \((x,t) \in \mathcal {D}.\) From the definition of \(G_k\), we do not know what values its input functions \((\mathbf {v}, w)\) or their derivatives will take over \((x,t) \in \mathcal {D}\); however, we do know the entries of the matrix \(L_M.\)
Now there is some choice. The function \(G_k\) is the difference between \(F_{M}\) and \(F_{M-1}\) from Eqs. (10) and (11), and all three of these equations are scaled by the \(\frac{1}{k!}\) factor. We can keep this \(\frac{1}{k!}\) factor and select the \(G_k\) that is large in an “absolute error” sense. Alternatively, we can ignore this scaling and choose the \(\widehat{u}_k^{(M)}\) to fix such that the difference function \(G_k\) is significant relative to its corresponding \(F_M\) and \(F_{M-1}\).
The former approach (call it the absolute version) targets \(k_1^* = {{\,\mathrm{arg\,max}\,}}_{0 \le k \le M-1} \frac{1}{k!} \Vert (\widetilde{L}_M)_{k \bullet } \Vert _\infty = {{\,\mathrm{arg\,max}\,}}_{0 \le k \le M - 1} \frac{1}{k!} \sum _{j=0}^M |(\widetilde{L}_M)_{k,j}|\). Equivalently, by Eq. (13), this is selecting \(k_1^* = {{\,\mathrm{arg\,max}\,}}_{0\le k\le M-1}\sum _{j=1}^{M}|(D_M \widetilde{L}_M)_{kj}| \) i.e. the \(k_1^*\) indexing the row of \(D_M \widetilde{L}_M\) with largest absolute row sum — where we recall that \(\widetilde{L}_M\) is the matrix of the first M rows of \(L_M.\) The latter approach (call it the relative version) simply picks the row indexing the largest row sum in \(\widetilde{L}_M\) itself (i.e. not in \(D_M \widetilde{L}_M\)).
Selecting the row of \(\widetilde{L}_M\) with maximal absolute row sum is simple; it is straightforward to verify that row \(M - 1\) attains the maximum (though it may not do so uniquely) and that the row sums of \(\widetilde{L}_M\) are non-decreasing as the row index k increases. The structure of the \(D_M \widetilde{L}_M\) matrix does not lend itself to as obvious a pattern.
We opt for the relative version when constructing our algorithm, since its numeric results are overwhelmingly promising (as discussed in Sect. 4) and its implementation avoids an additional row sorting step; however, this is a potential area for future investigation. Since we require \(|\mathcal {I}| = c\) for a given correction size parameter c, we simply pick the indices corresponding to the last c rows in the M system, i.e. \(\mathcal {I} = \{M-c + 1, \ldots , M \}\).
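Both selection rules are inexpensive. The sketch below uses a generic matrix in place of \(\widetilde{L}_M\); extending the absolute version from the single index \(k_1^*\) to a set of size \(c\) by taking the top-ranked rows is our assumption here, since the text defines only \(k_1^*\):

```python
import math

def correction_indices_relative(M, c):
    """Relative version: the last c indices of the M system, I = {M-c+1, ..., M}."""
    return set(range(M - c + 1, M + 1))

def correction_indices_absolute(L_rows, c):
    """Absolute version (sketch): rank rows k of D_M * L_tilde_M, i.e. rows of
    L_tilde_M scaled by 1/k!, by absolute row sum; keep the top c-1 of them
    (index M is always solved for, completing a set of size c)."""
    M = len(L_rows)                      # L_tilde_M holds rows k = 0, ..., M-1
    score = lambda k: sum(abs(v) for v in L_rows[k]) / math.factorial(k)
    ranked = sorted(range(M), key=score, reverse=True)
    return set(ranked[: c - 1]) | {M}
```

The relative version needs no ranking at all, which is the sorting step the algorithm avoids.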
3.2 Iterative Bluff-and-Fix
An assumption of the one step bluff-and-fix algorithm is that the fully coupled \(M - 1\) system has already been solved via some explicit time-stepping scheme (e.g. fourth-order Runge-Kutta). Realistically, we likely only want to solve a fully coupled \(M_0\) system when \(M_0\) is small (say 2 or 3). How can we then use this information to approximate \(\mathbf {u}^{(M)}\) for a larger \(M > M_0\)?
The iterative bluff-and-fix algorithm uses some baseline \(\mathbf {u}^{(M_0)}\) solution to get an approximation \(\widehat{\mathbf {u}}^{(M_0 +1)}\) of \(\mathbf {u}^{(M_0+1)}\). The approximation \(\widehat{\mathbf {u}}^{(M_0 +1)}\) is then re-fed into the one step bluff-and-fix algorithm, instead of the “true” \(\mathbf {u}^{(M_0 +1)}\), to determine \(\widehat{\mathbf {u}}^{(M_0 +2)},\) and the process continues.
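A minimal sketch of the iteration, with `one_step` a stand-in for a single application of the one step algorithm of Sect. 3.1 (treated here as a black box taking the current approximation and returning one of the next-higher order):

```python
def iterative_bnf(u_base, M_target, one_step):
    """Iterative bluff-and-fix (sketch): starting from the solved M_0 system,
    repeatedly apply the one step algorithm, feeding each approximation back
    in place of the exact solution, until the target order M is reached.
    Returns approximations for every system order m with M_0 <= m <= M."""
    u_hat = list(u_base)                          # solution of the M_0 system
    approximations = {len(u_hat) - 1: list(u_hat)}
    while len(u_hat) - 1 < M_target:
        u_hat = one_step(u_hat)                   # approximate the next system
        approximations[len(u_hat) - 1] = list(u_hat)
    return approximations
```

Note that every intermediate approximation is retained, which is the "solutions along the way" advantage discussed in Sect. 4.2.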
4 Numerical Results
We report solutions for Burgers’ equation with the uncertain initial condition \(u(x,0; \xi ) = \xi \sin (x)\) for \(\xi \sim \mathcal {N}(0,1).\) The equation is solved for \(x \in [0, 3]\) on a uniform grid with \(\varDelta _x = 0.05\). Time integration is based on the fourth-order Runge-Kutta (RK4) scheme with \(\varDelta _t = 0.001\).
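For reference, one explicit RK4 update for a semi-discretized system \(u_t = F(u)\) can be written as below; the spatial discretization of the Galerkin system on the \(\varDelta _x = 0.05\) grid is left abstract, so `F` is a placeholder for the discretized right-hand side:

```python
def rk4_step(F, u, dt):
    """One classic fourth-order Runge-Kutta step for u_t = F(u)."""
    k1 = F(u)
    k2 = F(u + 0.5 * dt * k1)
    k3 = F(u + 0.5 * dt * k2)
    k4 = F(u + dt * k3)
    return u + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
```

Applied with \(\varDelta _t = 0.001\) to, say, the scalar test problem \(u_t = -u\), the update reproduces the exact decay \(e^{-t}\) to high accuracy.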
4.1 Results for One Step Version
One step bluff-and-fix is tested for \(M = 3, \ldots , 8\) with three configurations:
- 1.
correction size \(c = 1\) (only \(\widehat{u}_M^{(M)}\) is computed per step),
- 2.
correction size \(c = 2\) (\(\widehat{u}_M^{\,(M)}\) is computed and \(\widehat{u}_{M-1}^{\,(M)}\) is fixed per step), and
- 3.
correction size \(c = 3\) (\(\widehat{u}_M^{\,(M)}\) is computed and both \(\widehat{u}_{M-2}^{\,(M)}\) and \(\widehat{u}_{M-1}^{\,(M)}\) are fixed per step).
The results are shown in Table 1.
Picking a small correction size (say one or two) can be sufficient for producing accurate solution approximations. For instance, using the solution \(\mathbf {u}^{(4)}\) to produce \(\widehat{\mathbf {u}}^{(5)}\) has average absolute error of under \(1\%\) for all c, around \(4\%\) average relative error for \(c = 1\), and under \(1\%\) average relative error for \(c = 2, 3\), as shown in the \(M = 5\) row of Table 1. Figure 1 displays how the approximate solution \(\widehat{u}_5^{\, (5)}\) in this system converges to the RK4 solution as the correction size c increases from 1 to 2. Similarly, using the solution \(\mathbf {u}^{(6)}\) to produce \(\widehat{\mathbf {u}}^{(7)}\) has average absolute error of under \(1\%\) for all c values, around \(6\%\) average relative error for \(c = 1\), and about \(2.5\%\) average relative error for \(c = 2\), as shown in the \(M = 7\) row of Table 1.
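The error metrics of Table 1 presuppose a reference solution and an averaging convention; the section does not spell the formulas out, so the following is one plausible reading (averaging pointwise errors over all grid points and components), offered only as a sketch:

```python
import numpy as np

def avg_abs_error(u_hat, u_ref):
    """Mean absolute deviation over all grid points and components (assumed metric)."""
    return float(np.mean(np.abs(np.asarray(u_hat) - np.asarray(u_ref))))

def avg_rel_error(u_hat, u_ref, eps=1e-12):
    """Mean relative deviation; eps guards against division by zero (assumed metric)."""
    u_hat, u_ref = np.asarray(u_hat), np.asarray(u_ref)
    return float(np.mean(np.abs(u_hat - u_ref) / (np.abs(u_ref) + eps)))
```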
Average absolute and relative errors for correction sizes \(c \in \{1,2, 3\} \) in one step bluff-and-fix for \(M = 3, \ldots , 8\).
| M | Avg. Abs. Error (\(c = 1\)) | Avg. Abs. Error (\(c = 2\)) | Avg. Abs. Error (\(c = 3\)) | Avg. Rel. Error (\(c = 1\)) | Avg. Rel. Error (\(c = 2\)) | Avg. Rel. Error (\(c = 3\)) |
|---|---|---|---|---|---|---|
| 3 | 0.1515 | 0.05593 | 0.01114 | 0.03196 | 0.004364 | 0.001683 |
| 4 | 0.03481 | 0.01823 | 0.007423 | 0.03711 | 0.005768 | 0.0008361 |
| 5 | 0.009536 | 0.006371 | 0.003599 | 0.04300 | 0.009232 | 0.001656 |
| 6 | 0.003279 | 0.002619 | 0.001859 | 0.05112 | 0.01530 | 0.003765 |
| 7 | 0.001330 | 0.001195 | 0.0009746 | 0.05986 | 0.02541 | 0.008166 |
| 8 | 0.0006404 | 0.0006118 | 0.0005623 | 0.06080 | 0.02630 | 0.01116 |
Selection of correction indices \(\mathcal {I}\) ranked by priority in one step bluff-and-fix against the ideal \(\widehat{u}_k^{\, (M)}\) to correct (in this numeric example) for \(M \in \{3, \ldots , 8\}\). Matching indices are shown in blue.
We see from Table 2 that one step bluff-and-fix is often spot-on for guessing which approximations \(\widehat{u}_k^{(M)}\) are least accurate. When defining the “worst” approximation by relative error, BNF always selects correctly for \(c < M-1\). While the ideal indices when \(c = M - 1\) or \(c = M\) are not all chosen, this is likely not an issue, as in practice a correction size that large would not be used. (Recall that \(c = M\) is equivalent to using regular RK4.) Also, it should be noted that bluff-and-fix can still produce an accurate solution approximation when an “incorrect” index is chosen — it just might not be the best approximation possible given that value of c.
In addition, we test the computational cost of one step and iterative bluff-and-fix for different correction sizes and M values. The runtimes are measured using the %timeit command in IPython with parameters \(\texttt {-r 10 -n 10}\) to obtain an average with standard deviation over 100 realizations. All computations are performed on a machine with a 1.8 GHz Intel Dual-Core processor. The results are displayed in Table 3.
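Outside IPython, the same measurement protocol can be mirrored with the standard `timeit` module (a sketch of `%timeit -r 10 -n 10`: ten repeats of ten loops each):

```python
import timeit

def time_solver(fn, repeats=10, loops=10):
    """Mimic `%timeit -r 10 -n 10 fn()`: run `loops` calls per repeat and
    report the per-call mean and standard deviation across repeats (seconds)."""
    per_call = [t / loops for t in timeit.repeat(fn, repeat=repeats, number=loops)]
    mean = sum(per_call) / repeats
    std = (sum((t - mean) ** 2 for t in per_call) / repeats) ** 0.5
    return mean, std
```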
Runtimes of one step bluff-and-fix to approximate \(\mathbf {u}^{(M)}\) when given \(\mathbf {u}^{(M-1)}\) compared with runtimes of solving full M systems via RK4. Tested over \(M = 3, \ldots , 8\) with correction sizes \(c = 2, 3\). Each time measurement is averaged over 100 loops to provide a confidence interval.
| M | Avg. One Step BNF Runtime (\(c = 2\)) | Avg. One Step BNF Runtime (\(c = 3\)) | Avg. RK4 Runtime |
|---|---|---|---|
| 3 | 146 ms \(\pm \, 7.81\) ms | 175 ms \(\pm \, 14.1\) ms | 181 ms \(\pm \, 7.81\) ms |
| 4 | 162 ms \(\pm \, 13.2\) ms | 183 ms \(\pm \, 22.5\) ms | 222 ms \(\pm \, 22.4\) ms |
| 5 | 167 ms \(\pm \, 21.3\) ms | 184 ms \(\pm \, 4.42\) ms | 260 ms \(\pm \, 15.2\) ms |
| 6 | 167 ms \(\pm \, 5.42\) ms | 189 ms \(\pm \, 3.96\) ms | 315 ms \(\pm \, 37.6\) ms |
| 7 | 191 ms \(\pm \, 21.1\) ms | 202 ms \(\pm \, 6.01\) ms | 336 ms \(\pm \, 13.1\) ms |
| 8 | 216 ms \(\pm \, 31.5\) ms | 217 ms \(\pm \, 5.38\) ms | 377 ms \(\pm \, 3.77\) ms |
4.2 Results for Iterative Version
Now we present results for Algorithm 2 when using the baseline solution \(\mathbf {u}^{(M_0)}\) for \(M_0 = 2\) to approximate solutions to the M systems for \(M = 3, \ldots , 8\). Correction sizes \(c = 1, 2, 3\) are all tested. The results are displayed in Table 4.
We observe how quickly the solution approximation from iterative bluff-and-fix converges to the “true” RK4 solution as the correction size is increased (Fig. 2). For example, the average relative error in \(\widehat{\mathbf {u}}^{(7)}\) drops from \({\sim }27\%\) (\(c = 1\)) to \({\sim }5.8\%\) (\(c = 2\)) to \({\sim }1.4\%\) (\(c = 3\)), and the average absolute error falls from \({\sim }8.7\%\) to \({\sim }3.5\%\) to \({\sim }1\%\) (Table 4).
Average absolute and relative errors in iterative bluff-and-fix when using the baseline solution \(M_0 = 2\) to approximate solutions to the M systems for \(M = 3, \ldots , 8\). Results for correction sizes \(c =1,2, 3\) are shown.
| M | Avg. Abs. Error (\(c = 1\)) | Avg. Abs. Error (\(c = 2\)) | Avg. Abs. Error (\(c = 3\)) | Avg. Rel. Error (\(c = 1\)) | Avg. Rel. Error (\(c = 2\)) | Avg. Rel. Error (\(c = 3\)) |
|---|---|---|---|---|---|---|
| 3 | 0.1515 | 0.05593 | 0.01115 | 0.03196 | 0.004363 | 0.001682 |
| 4 | 0.1337 | 0.05301 | 0.01420 | 0.07467 | 0.01019 | 0.001924 |
| 5 | 0.1145 | 0.04634 | 0.01357 | 0.1313 | 0.02034 | 0.003412 |
| 6 | 0.09894 | 0.04033 | 0.01227 | 0.2013 | 0.03642 | 0.006904 |
| 7 | 0.08676 | 0.03547 | 0.01096 | 0.2740 | 0.05793 | 0.01400 |
| 8 | 0.07711 | 0.03147 | 0.009704 | 0.3586 | 0.08464 | 0.02020 |
As before, we test the computational cost via %timeit in IPython for different correction sizes and M values, averaging each measurement over 100 realizations to provide a confidence interval. For solving the \(M = 8\) system with iterative bluff-and-fix with \(M_0 = 2\), the computation time is 853 ms \(\pm \, 39.5\) ms, 1.12 s \(\pm \, 121\) ms, and 1.28 s \(\pm \, 63.5\) ms for \(c = 1, 2, 3,\) respectively. The added cost from correcting \(\widehat{u}_{M-1}^{\,(M)}\), as opposed to just solving for \(\widehat{u}_M^{(M)}\), is thus on average only 267 ms. This suggests that the reduction in error between \(c = 1\) and \(c = 2\) comes cheaply, especially given that the average absolute and relative errors are under \(6\%\) and \(8.5\%\), respectively, after this transition.
Using RK4 to only solve the \(M = 8\) system is faster than using iterative bluff-and-fix from the baseline \(M_0 = 2\) solution: the former spends 426 ms \(\pm \, 48.6\) ms per loop. However, it is possible that for larger M values, the iterative bluff-and-fix algorithm will eventually out-perform RK4 in terms of runtime. This is an area for future investigation.
Furthermore, iterative bluff-and-fix has the advantage of producing approximate solutions to all of the M systems for \(M = 3, \ldots , 8\) along the way. When repeatedly solving the full M system via RK4 for \(M = 3, \ldots , 8\), the average runtime is 2.05 s \(\pm \, 145\) ms per loop — meaning bluff-and-fix with correction sizes \(c = 1\) (averaging 853 ms), \(c = 2\) (averaging 1.12 s), and \(c = 3\) (averaging 1.28 s) is far more efficient for this type of goal.
5 Conclusion
Polynomial chaos (PC) methods are effective for incorporating and quantifying uncertainties in problems governed by partial differential equations. In this paper, we present a promising algorithm (one step bluff-and-fix) for utilizing the solution to a polynomial chaos \(M - 1\) system arising from inviscid Burgers’ equation to approximate the solution to the corresponding M system. We expand the algorithm to an iterative version, which utilizes the solution to an \(M_0\) system to approximate the solution to an M system for a general \(M > M_0.\) Bluff-and-fix is shown to be effective in producing accurate approximations, even when its correction size parameter is small, for both its one step and iterative versions. In the one step version, these approximations are produced more efficiently than doing so with a standard monolithic numeric scheme. While iterative bluff-and-fix initialized from some baseline \(M_0\) can be less efficient than solving the full M system directly, it has the advantage of producing approximations to all of the m systems along the way for \(M_0 \le m \le M\) — and does so faster than the monolithic method solves all of the full m systems one by one. In general, it could be beneficial to know the solution to an M system for a consecutive range of M values, because then one could observe when the difference between consecutive system solutions is small, which provides a rough sense of the M value sufficient for solution convergence.
Future work will focus on generalizing and testing the algorithm on other nonlinear PDEs with uncertain initial conditions. We also plan to investigate different choices of the uncertainty representation \(\xi \). It is expected that the priority ordering for which solutions to “fix” first — that is, the choice of the correction indices — will change, since that ordering depends on ranking the absolute row sums of \(L_M\), and the precise structure of \(L_M\) depends on the choice of orthogonal polynomial.
Footnotes
- 1.
For convenience we set \(\mathcal {D} = [0,1] \times [0, T]\), though all of the presented results follow immediately when \(\mathcal {D} = [a,b] \times \mathcal {T}\) for some arbitrary interval \([a,b] \subset \mathbb {R}_x\) and bounded \(\mathcal {T} \subset \mathbb {R}_{t \ge 0}\) such that \(0 \in \mathcal {T}.\)
- 2.
These assumptions ensure that the needed conditions for applying the Cameron-Martin theorem [1] are met.
References
- 1. Cameron, R., Martin, W.: The orthogonal development of non-linear functionals in series of Fourier-Hermite functionals. Ann. Math. 48, 385–392 (1947)
- 2. Constantine, P.: A primer on stochastic Galerkin methods. Academic homepage. https://www.cs.colorado.edu/~paco3637/docs/constantine2007primer.pdf. Accessed 6 Feb 2020
- 3. Constantine, P.: Spectral methods for parametrized matrix equations. Ph.D. thesis, Stanford University (2009)
- 4. Ghanem, R., Spanos, P.: Stochastic Finite Elements: A Spectral Approach. Springer, New York (1991). https://doi.org/10.1007/978-1-4612-3094-6
- 5. Owen, A.: Monte Carlo theory, methods and examples, Ch. 1. Academic homepage. https://statweb.stanford.edu/~owen/mc/. Accessed 15 April 2020
- 6. Szegö, G.: Orthogonal Polynomials, 2nd edn. American Mathematical Society, New York (1959)
- 7. Wiener, N.: The homogenous chaos. Amer. J. Math. 60(4), 897–936 (1938)
- 8. Xiu, D., Karniadakis, G.: The Wiener-Askey polynomial chaos for stochastic differential equations. SIAM J. Sci. Comput. 24, 619–644 (2002)