Abstract
Stencil computations on low-dimensional grids are kernels of many scientific applications, including the finite difference methods used to solve partial differential equations. On typical modern computer architectures, such stencil computations are limited by the performance of the memory subsystem, namely by the bandwidth between main memory and the cache. This work considers the computation of star stencils, such as the 5-point and 7-point stencils, in the external memory model. The analysis focuses on the constant of the leading term of the non-compulsory I/Os. Optimizing stencil computations is an active field of research, but so far there has been a significant gap between the lower bounds and the performance of the algorithms. In two dimensions, matching constants for the lower and upper bounds are provided, closing a gap of 4. In three dimensions, the bounds match up to a factor of \(\sqrt{2}\), improving the known results by a factor of \(2\sqrt{3}\sqrt{B}\), where \(B\) is the block (cache line) size of the external memory model. For higher dimensions \(n\), the presented lower bounds improve on the previously known bounds by a factor between 4 and 6, leaving a gap of \(\sqrt[n-1]{n!} \approx \frac{n}{e}\).
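To make the term "5-point star stencil" concrete, the following minimal Python sketch applies one sweep of such a stencil to a two-dimensional grid. It is an illustration only and not the algorithm analysed in the paper: the function name `five_point_sweep`, the averaging weights, and the choice to copy boundary cells unchanged are all assumptions made for this example.

```python
def five_point_sweep(grid):
    """Apply one sweep of a 5-point star stencil to a 2D grid (list of lists).

    Each interior cell is replaced by the average of itself and its four
    axis-aligned neighbours. The equal weights are an illustrative
    assumption; boundary cells are copied unchanged.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # copy, so reads always see the old values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = (
                grid[i][j]        # centre
                + grid[i - 1][j]  # north
                + grid[i + 1][j]  # south
                + grid[i][j - 1]  # west
                + grid[i][j + 1]  # east
            ) / 5.0
    return out


if __name__ == "__main__":
    g = [[0.0, 0.0, 0.0],
         [0.0, 5.0, 0.0],
         [0.0, 0.0, 0.0]]
    print(five_point_sweep(g)[1][1])
```

The sweep visits cells row by row, so each update reads values from three adjacent rows. When those rows do not fit in the cache together, the same data may have to be fetched from main memory more than once. These repeated transfers are the kind of non-compulsory I/O whose leading constant the abstract refers to.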
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hupp, P., Jacob, R. (2013). Tight Bounds for Low Dimensional Star Stencils in the External Memory Model. In: Dehne, F., Solis-Oba, R., Sack, JR. (eds) Algorithms and Data Structures. WADS 2013. Lecture Notes in Computer Science, vol 8037. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40104-6_36
DOI: https://doi.org/10.1007/978-3-642-40104-6_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40103-9
Online ISBN: 978-3-642-40104-6
eBook Packages: Computer Science, Computer Science (R0)