The danger of combining block red–black ordering with modified incomplete factorizations and its remedy by perturbation or relaxation

Shioya, Akemi; Yamamoto, Yusaku

doi:10.1007/s13160-017-0277-5

Original Paper
Area 2
Published: 11 October 2017

Japan Journal of Industrial and Applied Mathematics, Volume 35, pages 195–216 (2018)

Abstract

Modified incomplete LU/Cholesky factorizations without fill-ins are popular preconditioners for Krylov subspace methods, because they require no extra memory and have greater potential to accelerate convergence than simple ILU/IC preconditioners. For parallelizing preconditioners, the block red–black ordering is attractive due to its highly parallel nature and small number of synchronization points. Hence, their combination seems to produce powerful and parallelizable preconditioners. In fact, however, this combination can cause breakdown of the factorization due to the occurrence of zero pivots. We analyze this phenomenon and give necessary and sufficient conditions for the occurrence of zero pivots in the case of a regular grid. We also show both theoretically and experimentally that adding a perturbation to the diagonal elements or relaxing the compensation of dropped fill-ins helps to alleviate the problem. Numerical tests show that the resulting preconditioners are highly effective and are applicable to parallelism levels of up to $10^3$.
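The breakdown described above can be reproduced on a very small problem. The following is a minimal sketch, not the authors' code: it assumes the 5-point Laplacian on a 5 × 5 grid with Dirichlet boundaries, uses the simplest red–black ordering (blocks of a single grid point), and works on dense arrays for readability. The function names, the relaxation parameter `alpha` (1 means full compensation of dropped fill-ins) and the absolute diagonal `shift` are illustrative assumptions.

```python
import numpy as np

def laplacian_2d(n):
    """Dense 5-point Laplacian on an n x n grid with Dirichlet boundaries."""
    A = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            k = i * n + j
            A[k, k] = 4.0
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < n and 0 <= jj < n:
                    A[k, ii * n + jj] = -1.0
    return A

def red_black_permutation(n):
    """Grid points with odd i + j come first, then those with even i + j."""
    pts = [(i, j) for i in range(n) for j in range(n)]
    first = [i * n + j for i, j in pts if (i + j) % 2 == 1]
    second = [i * n + j for i, j in pts if (i + j) % 2 == 0]
    return np.array(first + second)

def milu0_pivots(A, alpha=1.0, shift=0.0):
    """Pivots of a relaxed modified ILU(0) factorization (dense sketch).

    Fill-ins outside the nonzero pattern of A are dropped and alpha times
    their value is added to the diagonal of the row being eliminated.
    """
    N = A.shape[0]
    pattern = A != 0
    U = A.astype(float).copy()
    U[np.diag_indices(N)] += shift          # optional diagonal perturbation
    for k in range(N):
        for l in range(k):
            if not pattern[k, l]:
                continue
            if U[l, l] == 0.0:
                raise ZeroDivisionError(f"zero pivot in row {l}")
            m = U[k, l] / U[l, l]           # elimination multiplier
            for j in range(l + 1, N):
                if not pattern[l, j]:
                    continue
                update = -m * U[l, j]
                if pattern[k, j]:
                    U[k, j] += update       # entry inside the pattern
                else:
                    U[k, k] += alpha * update  # compensate dropped fill-in
            U[k, l] = m                     # store the multiplier (L part)
    return np.diag(U).copy()

n = 5
perm = red_black_permutation(n)
A = laplacian_2d(n)[np.ix_(perm, perm)]
center = int(np.where(perm == 2 * n + 2)[0][0])   # grid point (2, 2)
for alpha, shift in ((1.0, 0.0), (0.9, 0.0), (1.0, 0.01)):
    print(alpha, shift, milu0_pivots(A, alpha, shift)[center])
```

In this sketch the centre point and its four neighbours all have zero row sum, and the centre point is eliminated after all of its neighbours. With `alpha = 1` and no shift its pivot is exactly zero, whereas relaxation (`alpha < 1`) or a positive shift makes it positive.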

Notes

Strictly speaking, a breakdown can occur in the elimination of a row $i^{\prime }<i$ not belonging to $\mathcal {I}(i)$. But this can be avoided by changing the order of elimination so that the rows in $\mathcal {I}(i)\cup \{i\}$ are eliminated first.

Acknowledgements

We are grateful to the anonymous reviewer, whose comments helped us to improve the quality of this paper. We thank Prof. Takeshi Iwashita of Hokkaido University for valuable comments on our work. The present study is supported in part by the Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Scientific Research (nos. 26286087, 15H02708, 15H02709, 16KT0016, 17H02828, 17K19966). The computational experiments in this paper were performed using the FX10 parallel computer at the Education Center on Computational Science and Engineering of Kobe University.

Author information

Authors and Affiliations

The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo, 182-8585, Japan
Akemi Shioya & Yusaku Yamamoto

Authors

Akemi Shioya
Yusaku Yamamoto

Corresponding author

Correspondence to Akemi Shioya.

Appendix: Proof of Theorem 4

Proof

As we showed in the proof of Theorem 3, the off-diagonal nonzero elements of A are unchanged throughout the factorization and remain negative. Note also that if $k\in \mathcal {I}(i)\cup \{i\}$, $\sum _{l\ne k}(-a_{kl})=a_{kk}=1$ from the zero row sum property. Hence, $0<\xi \le \eta \le 1$.

Before going into the proof, we introduce some notation. Let $\mathcal {L}(k)$ and $\mathcal {R}(k)$ be the sets of column indices of nonzero elements to the left and right of the diagonal, respectively, in the k-th row. Then, $\mathcal {L}(k)$ is the set of indices of the rows that update the k-th row. Also, since A has a symmetric nonzero pattern, $\mathcal {R}(k)$ is the set of indices of the rows updated by the k-th row. We also denote by $r_k$ the row sum of the k-th row after elimination. Since there are no nonzero elements to the left of the diagonal after elimination, we have

$$\begin{aligned} r_k = u_{kk}+\sum _{l\in \mathcal {R}(k)} a_{kl}. \end{aligned}$$

(9)

If $k\in \mathcal {I}(i)\cup \{i\}$, we have

$$\begin{aligned} \sum _{l\in \mathcal {L}(k)} a_{kl} + a_{kk} + \sum _{l\in \mathcal {R}(k)} a_{kl} = \sum _{l\in \mathcal {L}(k)} a_{kl} + 1 + \sum _{l\in \mathcal {R}(k)} a_{kl} =0 \end{aligned}$$

(10)

from the zero row sum property. Combining (9) with (10) gives

$$\begin{aligned} u_{kk} = a_{kk} + \sum _{l\in \mathcal {L}(k)} a_{kl} + r_k. \end{aligned}$$

(11)

Now, suppose that $k\in \mathcal {I}(i)\cup \{i\}$ and $\mathcal {L}(k)\ne \emptyset $. We consider the change of the row sum of row k due to elimination by row $l\in \mathcal {L}(k)$. By this elimination, $a_{kl}$ is annihilated and other off-diagonal elements remain intact. Hence, the contribution to the row sum from the off-diagonal elements increases by $-a_{kl}$. The diagonal element $a_{kk}$ increases by $-a_{kl}a_{lk}/u_{ll}$ by the elimination. The sum of fill-ins to be added to the diagonal is $-\alpha (a_{kl}/u_{ll})\sum _{j\in \mathcal {R}(l)\backslash \{k\}}a_{lj}$. Combining all of these contributions, we can write the change in the row sum as

$$\begin{aligned}&-a_{kl}-\frac{a_{kl}a_{lk}}{u_{ll}}-\alpha \frac{a_{kl}}{u_{ll}}\sum _{j\in \mathcal {R}(l)\backslash \{k\}}a_{lj} \nonumber \\&\quad =-\frac{a_{kl}}{u_{ll}}\left\{ u_{ll}+a_{lk}+\sum _{j\in \mathcal {R}(l)\backslash \{k\}}a_{lj}+(1-\alpha )\sum _{j\in \mathcal {R}(l)\backslash \{k\}}(-a_{lj})\right\} \nonumber \\&\quad =-\frac{a_{kl}}{u_{ll}}\left\{ r_l+(1-\alpha )\sum _{j\in \mathcal {R}(l)\backslash \{k\}}(-a_{lj})\right\} , \end{aligned}$$

(12)

where we used (9) in the last equality. We define $s_{lk}$ by

$$\begin{aligned} s_{lk}=\frac{1}{u_{ll}} \left\{ r_l+(1-\alpha )\sum _{j\in \mathcal {R}(l)\backslash \{k\}}(-a_{lj})\right\} . \end{aligned}$$

(13)

Then, since the initial row sum of row k is zero, the final row sum that takes into account the contributions from all $l\in \mathcal {L}(k)$ can be written as

$$\begin{aligned} r_k=\sum _{l\in \mathcal {L}(k)}(-a_{kl})s_{lk}. \end{aligned}$$

(14)

Equations (11), (13) and (14) describe the propagation of the change in the row sums during the process of relaxed modified incomplete factorization.
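As a consistency check (a worked special case added here, not part of the original proof), consider the unrelaxed factorization, $\alpha =1$. Then the second term in (13) vanishes, and since a row l with $\mathcal {L}(l)=\emptyset $ keeps its initial row sum $r_l=0$, Eq. (14) propagates zero row sums through all rows of $\mathcal {I}(i)\cup \{i\}$ by induction on the depth:

$$\begin{aligned} s_{lk}=\frac{r_l}{u_{ll}}, \qquad r_l=0 \ \ \text {if} \ \ \mathcal {L}(l)=\emptyset \quad \Longrightarrow \quad r_k=\sum _{l\in \mathcal {L}(k)}(-a_{kl})\frac{r_l}{u_{ll}}=0. \end{aligned}$$

For $k\in \mathcal {I}(i)$, Eq. (9) then gives $u_{kk}=-\sum _{l\in \mathcal {R}(k)}a_{kl}>0$ because $\mathcal {R}(k)\ne \emptyset $, so the intermediate pivots are well defined. If row i has no nonzero element to the right of the diagonal, Eq. (9) gives $u_{ii}=r_i=0$, which is the zero pivot that relaxation is meant to prevent.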

To prove the theorem, we show that for any $k\in \mathcal {I}(i)$ and $m\in \mathcal {R}(k)$,

$$\begin{aligned} s_{km} \ge (1-\alpha )(p-1)\xi . \end{aligned}$$

(15)

We proceed by induction. First consider the case of $\mathcal {D}(k)=0$. In this case, row k is not altered by any row since $\mathcal {I}(k)=\emptyset $. Hence, $u_{kk}=a_{kk}=1$ and $r_k=0$ from the condition (ii) of Theorem 1. Also, since row k is not altered by any row, it has no nonzero elements to the left of the diagonal and all of its p off-diagonal nonzero elements lie to the right of the diagonal. Thus, from Eq. (13), $s_{km}$ can be bounded from below as

$$\begin{aligned} s_{km} =(1-\alpha ) \sum _{j\in \mathcal {R}(k)\backslash \{m\}}(-a_{kj}) \ge (1-\alpha )(p-1)\xi \end{aligned}$$

(16)

and the claim holds. Next, assume that Eq. (15) holds for all rows in $\mathcal {I}(i)$ with depth smaller than d, where d is some positive integer. Now, pick a row $k\in \mathcal {I}(i)$ with $\mathcal {D}(k)=d$. Then, every row $l\in \mathcal {L}(k)$ used in the elimination of row k satisfies $\mathcal {D}(l)<d$ and hence $s_{lk} \ge (1-\alpha )(p-1)\xi $ from the hypothesis. Thus, the row sum of row k after elimination can be evaluated as

$$\begin{aligned} r_k = \sum _{l\in \mathcal {L}(k)}(-a_{kl})s_{lk} \ge (1-\alpha )(p-1)\xi \sum _{l\in \mathcal {L}(k)}(-a_{kl}). \end{aligned}$$

(17)

On the other hand, putting (11) into the definition of $s_{km}$, Eq. (13), yields

$$\begin{aligned} s_{km}=\frac{(1-\alpha )\sum _{j\in \mathcal {R}(k) \backslash \{m\}}(-a_{kj})+r_k}{a_{kk} + \sum _{l\in \mathcal {L}(k)} a_{kl} + r_k}. \end{aligned}$$

(18)

We bound the right-hand side from below. Let $|\mathcal {L}(k)|=q$ and $|\mathcal {R} (k)|=p-q$. Since row k is updated by at least one row, $q\ge 1$. Also, $q\le p-1$, since otherwise $\mathcal {R} (k)=\emptyset $, which contradicts the assumption that $k\in \mathcal {I}(i)$. We note that the first term in the numerator of (18) can be bounded by the sum of the first and the second terms in the denominator as

$$\begin{aligned} (1-\alpha )\sum _{j\in \mathcal {R}(k) \backslash \{m\}}(-a_{kj}) \le \sum _{j\in \mathcal {R}(k)}(-a_{kj}) = a_{kk}+\sum _{l\in \mathcal {L}(k)}a_{kl}, \end{aligned}$$

(19)

where we used the zero row sum property in the last equality. Then, by applying the general inequality $\frac{b+\epsilon }{a+\epsilon }\ge \frac{b+\delta }{a+\delta }$ that holds for four positive numbers $a, b, \epsilon $ and $\delta $ with $a\ge b$ and $\epsilon \ge \delta $ to Eq. (18) (with $\epsilon =r_k$ and $\delta $ being the right-hand side of (17)), we have

$$\begin{aligned} s_{km}\ge \frac{(1-\alpha )\sum _{j\in \mathcal {R}(k)\backslash \{m\}}(-a_{kj})+(1-\alpha )(p-1)\xi \sum _ {l\in \mathcal {L}(k)}(-a_{kl})}{a_{kk} + \sum _{l\in \mathcal {L}(k)} a_{kl} + (1-\alpha )(p-1)\xi \sum _{l\in \mathcal {L}(k)}(-a_{kl})}. \end{aligned}$$

(20)
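The general inequality invoked in this step can be checked directly. The following one-line verification is added here for convenience:

$$\begin{aligned} \frac{b+\epsilon }{a+\epsilon }-\frac{b+\delta }{a+\delta } =\frac{(b+\epsilon )(a+\delta )-(b+\delta )(a+\epsilon )}{(a+\epsilon )(a+\delta )} =\frac{(a-b)(\epsilon -\delta )}{(a+\epsilon )(a+\delta )}\ge 0. \end{aligned}$$

In the application to (18), $a=a_{kk}+\sum _{l\in \mathcal {L}(k)}a_{kl}$ and $b=(1-\alpha )\sum _{j\in \mathcal {R}(k)\backslash \{m\}}(-a_{kj})$, so $a\ge b$ is exactly (19) and $\epsilon \ge \delta $ is exactly (17).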

The numerator can be rewritten as

$$\begin{aligned}&(1-\alpha )(p-1)\xi \Biggl [\sum _{j\in \mathcal {L}(k)}(-a_{kj})+\sum _{j\in \mathcal {R}(k)\backslash \{m\}}(-a_{kj}) \nonumber \\&\qquad +\,\left\{ \frac{1}{(p-1)\xi }-1\right\} \sum _{j\in \mathcal {R}(k)\backslash \{m\}}(-a_{kj})\Biggr ] \nonumber \\&\quad = (1-\alpha )(p-1)\xi \Biggl [1+a_{km}+\left\{ \frac{1}{(p-1)\xi }-1\right\} \sum _{j\in \mathcal {R}(k)\backslash \{m\}}(-a_{kj})\Biggr ] \nonumber \\&\quad \ge (1-\alpha )(p-1)\xi \left[ 1-\eta +\left\{ \frac{1}{(p-1)\xi }-1\right\} (p-q-1)\xi \right] , \end{aligned}$$

(21)

where we used the zero row sum property in the first equality. The denominator can be bounded as

$$\begin{aligned} a_{kk} + \sum _{l\in \mathcal {L}(k)} a_{kl} + (1-\alpha )(p-1)\xi \sum _{l\in \mathcal {L}(k)}(-a_{kl}) \le 1-q\xi +(1-\alpha ), \end{aligned}$$

(22)

where we used $(p-1)\xi \le 1$ and $\sum _{l\in \mathcal {L}(k)}(-a_{kl})\le 1$. Inserting (21) and (22) into (20) gives

$$\begin{aligned} s_{km}\ge (1-\alpha )(p-1)\xi \cdot \frac{1-\eta +\left\{ \frac{1}{(p-1)\xi }-1\right\} (p-q-1)\xi }{1-q\xi +(1-\alpha )}. \end{aligned}$$

(23)

Noting that the denominator of the right-hand side is positive, we know that the claim (15) holds if

$$\begin{aligned} -\eta +\left\{ \frac{1}{(p-1)\xi }-1\right\} (p-q-1)\xi \ge -q\xi +(1-\alpha ), \end{aligned}$$

(24)

or

$$\begin{aligned} \eta \le \left( 2\xi -\frac{1}{p-1}\right) q+1-(p-1)\xi -(1-\alpha ). \end{aligned}$$

(25)

The right-hand side is a linear function of q, and its minimum in the interval $1\le q\le p-1$ is

$$\begin{aligned} \left\{ \begin{array}{lll} (p-1)\xi -(1-\alpha ) &{} \mathrm{at} \quad q=p-1 &{} \mathrm{when} \quad \xi \le \frac{1}{2(p-1)}, \\ \frac{p-2}{p-1}-(p-3)\xi -(1-\alpha ) &{} \mathrm{at} \quad q=1 &{} \mathrm{when} \quad \xi >\frac{1}{2(p-1)}. \end{array}\right. \end{aligned}$$

(26)

In either case, the minimum is larger than or equal to $\eta $ from the condition (7) and thus (15) holds also for k. Hence, the induction is complete and (15) holds for any $k\in \mathcal {I}(i)$ and $m\in \mathcal {R}(k)$.
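The case distinction in (26) can be cross-checked numerically. The sketch below evaluates the right-hand side of (25) at every admissible integer q and compares the smallest value with the closed form in (26). The function names are illustrative, and the parameter values are arbitrary samples rather than values taken from the paper.

```python
def rhs_25(q, p, xi, alpha):
    """Right-hand side of (25) as a function of q."""
    return (2.0 * xi - 1.0 / (p - 1)) * q + 1.0 - (p - 1) * xi - (1.0 - alpha)

def min_26(p, xi, alpha):
    """Closed-form minimum over 1 <= q <= p - 1 as stated in (26)."""
    if xi <= 1.0 / (2.0 * (p - 1)):
        return (p - 1) * xi - (1.0 - alpha)                    # attained at q = p - 1
    return (p - 2) / (p - 1) - (p - 3) * xi - (1.0 - alpha)    # attained at q = 1

def formulas_agree(p, xi, alpha, tol=1e-12):
    """Compare a brute-force minimum over q with the closed form."""
    brute = min(rhs_25(q, p, xi, alpha) for q in range(1, p))
    return abs(brute - min_26(p, xi, alpha)) < tol

for p in (2, 4, 6):
    for xi in (0.05, 0.1, 0.2):
        for alpha in (0.9, 1.0):
            print(p, xi, alpha, formulas_agree(p, xi, alpha))
```

Because the right-hand side of (25) is linear in q, checking the integer values of q is enough to capture the minimum over the whole interval.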

Now that (15) has been established, we bound the pivots $u_{kk}$ for $k\in \mathcal {I}(i)\cup \{i\}$ from below. If $\mathcal {D}(k)=0$, row k is not eliminated by any row, so, from the scaling assumption, $u_{kk}=1> (1-\alpha )(p-1)\xi ^2$. Next, assume that $k\in \mathcal {I}(i)$ and $\mathcal {D}(k)>0$. Then, there is at least one nonzero element to the left of the diagonal. Thus, from (14) and (15), it follows that

$$\begin{aligned} r_k= \sum _{l\in \mathcal {L}(k)}(-a_{kl})s_{lk}\ge \xi \cdot (1-\alpha )(p-1)\xi \ge (1-\alpha )(p-1)\xi ^2. \end{aligned}$$

(27)

Combining this with (9), we have

$$\begin{aligned} u_{kk}=r_k-\sum _{l\in \mathcal {R}(k)} a_{kl} \ge r_k \ge (1-\alpha )(p-1)\xi ^2. \end{aligned}$$

(28)

Finally, consider the case of $k=i$. In this case, there is no nonzero element to the right of the diagonal and therefore $\sum _{l\in \mathcal {L}(i)}(-a_{il})=a_{ii}=1$. Hence,

$$\begin{aligned} r_i=\sum _{l\in \mathcal {L}(i)}(-a_{il})s_{li} \ge \left\{ \sum _{l\in \mathcal {L}(i)}(-a_{il})\right\} \cdot (1-\alpha )(p-1)\xi = (1-\alpha )(p-1)\xi \end{aligned}$$

(29)

and we have from (9),

$$\begin{aligned} u_{ii}=r_i\ge (1-\alpha )(p-1)\xi \ge (1-\alpha )(p-1)\xi ^2. \end{aligned}$$

(30)

This completes the proof. $\square $
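To see the bound in a concrete case, the sketch below evaluates the recurrences (13), (14) and (9) by hand for the simplest configuration: a row i whose p neighbours all have depth 0, with every off-diagonal element equal to $-1/p$ (so that $\xi =\eta =1/p$). This uniform configuration and the function names are assumptions made for illustration. Only the base case of the induction is exercised, so condition (7) plays no role here.

```python
def depth_one_pivot(p, alpha):
    """Pivot u_ii when all p neighbours of row i have depth 0.

    Assumes unit diagonal and uniform off-diagonal entries -1/p.
    """
    xi = 1.0 / p
    # Depth-0 rows: u_ll = 1 and r_l = 0, so Eq. (13) reduces to
    s = (1.0 - alpha) * (p - 1) * xi
    # Eq. (14): row i receives contributions from its p left neighbours.
    r_i = p * xi * s
    # Eq. (9) with no nonzero elements to the right of the diagonal.
    return r_i

def theorem_bound(p, alpha):
    """Lower bound (1 - alpha)(p - 1) xi^2 for the same configuration."""
    xi = 1.0 / p
    return (1.0 - alpha) * (p - 1) * xi ** 2

for alpha in (1.0, 0.95, 0.9):
    print(alpha, depth_one_pivot(4, alpha), theorem_bound(4, alpha))
```

For $\alpha =1$ both quantities vanish, which corresponds to the breakdown case, and for $\alpha <1$ the pivot stays above the bound.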

About this article

Shioya, A., Yamamoto, Y. The danger of combining block red–black ordering with modified incomplete factorizations and its remedy by perturbation or relaxation. Japan J. Indust. Appl. Math. 35, 195–216 (2018). https://doi.org/10.1007/s13160-017-0277-5

Received: 14 March 2017
Revised: 16 September 2017
Published: 11 October 2017
Issue Date: March 2018
DOI: https://doi.org/10.1007/s13160-017-0277-5

Mathematics Subject Classification

65F08
