Abstract
In the past, parallel algorithms were developed, for the most part, under the assumption that the number of processors is Θ(n) (where n is the size of the input) and that, if the actual number was smaller in practice, this could be resolved using Brent's Lemma to simulate the highly parallel solution on a lower-degree parallel architecture. In this paper, however, we argue that the design and implementation issues of algorithms and architectures differ significantly, both in theory and in practice, between computational models with high and low degrees of parallelism. We report an observed gap in the behavior of a parallel architecture depending on the number of processors. This gap appears repeatedly, both in empirical cases, when studying practical aspects of architecture design and program implementation, and in theoretical instances, when studying the behavior of various parallel algorithms. It separates the performance, design, and analysis of systems with a sublinear number of processors from those of systems with linearly many processors. More specifically, we observe that systems with either logarithmically many cores or with O(n^α) cores (for α < 1) exhibit qualitatively different behavior than a system whose number of cores is linear in the size of the input, i.e., Θ(n). The evidence we present suggests the existence of a sharp theoretical gap between the classes of problems that can be efficiently parallelized with o(n) processors and with Θ(n) processors, unless P = NC.
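The simulation argument the abstract alludes to can be illustrated concretely. Brent's Lemma says that a parallel computation with total work W and critical-path length (span) T_∞ can be greedily scheduled on p processors in at most W/p + T_∞ steps. The sketch below, with a hypothetical level-by-level DAG profile chosen purely for illustration, checks this bound for several processor counts:

```python
import math

def greedy_schedule_steps(level_sizes, p):
    """Steps to run a level-synchronous task DAG on p processors.

    level_sizes[i] is the number of unit tasks at depth i; tasks within
    a level are independent, so a level of m tasks takes ceil(m/p) steps.
    """
    return sum(math.ceil(m / p) for m in level_sizes)

# Hypothetical DAG profile (an illustrative assumption, not from the paper).
levels = [1, 4, 16, 64, 16, 4, 1]
work = sum(levels)    # W: total number of operations (106 here)
span = len(levels)    # T_inf: critical-path length (7 here)

for p in (2, 8, 64):
    steps = greedy_schedule_steps(levels, p)
    # Brent's bound: T_p <= W/p + T_inf
    assert steps <= work / p + span
```

The bound guarantees asymptotic optimality of the simulation, but, as the paper argues, it says nothing about the qualitative design differences between architectures with o(n) and Θ(n) processors.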
References
Ajwani, D., Sitchinava, N., Zeh, N.: Geometric algorithms for private-cache chip multiprocessors. In: de Berg, M., Meyer, U. (eds.) ESA 2010, Part II. LNCS, vol. 6347, pp. 75–86. Springer, Heidelberg (2010)
Ajwani, D., Sitchinava, N., Zeh, N.: I/O-optimal distribution sweeping on private-cache chip multiprocessors. In: IPDPS, pp. 1114–1123. IEEE (2011)
Arge, L., Goodrich, M.T., Nelson, M.J., Sitchinava, N.: Fundamental parallel algorithms for private-cache chip multiprocessors. In: SPAA, pp. 197–206 (2008)
Arge, L., Goodrich, M.T., Sitchinava, N.: Parallel external memory graph algorithms. In: IPDPS, pp. 1–11. IEEE (2010)
Arlazarov, V., Dinic, E., Kronrod, M., Faradzev, I.: On economic construction of the transitive closure of a directed graph. Dokl. Akad. Nauk SSSR 194, 487–488 (1970) (in Russian); English translation in Soviet Math. Dokl. 11, 1209–1210 (1975)
Bender, M.A., Phillips, C.A.: Scheduling DAGs on asynchronous processors. In: SPAA, pp. 35–45. ACM (2007)
Blelloch, G.E., Chowdhury, R.A., Gibbons, P.B., Ramachandran, V., Chen, S., Kozuch, M.: Provably good multicore cache performance for divide-and-conquer algorithms. In: SODA. ACM (2008)
Blelloch, G.E., Gibbons, P.B.: Effectively sharing a cache among threads. In: SPAA, pp. 235–244. ACM (2004)
Blelloch, G.E., Gibbons, P.B., Matias, Y.: Provably efficient scheduling for languages with fine-grained parallelism. J. ACM 46, 281–321 (1999)
Bose, P., Chen, E.Y., He, M., Maheshwari, A., Morin, P.: Succinct geometric indexes supporting point location queries. In: SODA, pp. 635–644. SIAM (2009)
Brent, R.P.: The parallel evaluation of general arithmetic expressions. J. ACM 21(2), 201–206 (1974)
Burton, F.W., Sleep, M.R.: Executing functional programs on a virtual tree of processors. In: FPCA, pp. 187–194. ACM (1981)
Chowdhury, R.A., Ramachandran, V.: Cache-efficient dynamic programming algorithms for multicores. In: SPAA, pp. 207–216. ACM (2008)
Dorrigiv, R., López-Ortiz, A., Salinger, A.: Optimal speedup on a low-degree multi-core parallel architecture (LoPRAM). In: SPAA, pp. 185–187. ACM (2008)
Dymond, P.W., Tompa, M.: Speedups of deterministic machines by synchronous parallel machines. In: STOC, pp. 336–343. ACM (1983)
Feller, W.: An Introduction to Probability Theory and Its Applications, vol. 1. Wiley (1968)
Fujiwara, A., Inoue, M., Masuzawa, T.: Parallelizability of some P-complete problems. In: Rolim, J.D.P. (ed.) IPDPS 2000 Workshops. LNCS, vol. 1800, pp. 116–122. Springer, Heidelberg (2000)
Greenlaw, R., Hoover, H.J., Ruzzo, W.L.: Limits to parallel computation: P-completeness theory. Oxford University Press, Inc., New York (1995)
Hopcroft, J.E., Paul, W.J., Valiant, L.G.: On time versus space and related problems. In: FOCS, pp. 57–64. IEEE (1975)
Kruskal, C.P., Rudolph, L., Snir, M.: A complexity theory of efficient parallel algorithms. Theor. Comput. Sci. 71(1), 95–132 (1990)
Munro, J.I.: Tables. In: Chandru, V., Vinay, V. (eds.) FSTTCS 1996. LNCS, vol. 1180, pp. 37–42. Springer, Heidelberg (1996)
Raab, M., Steger, A.: “Balls into Bins” - a simple and tight analysis. In: Rolim, J.D.P., Serna, M., Luby, M. (eds.) RANDOM 1998. LNCS, vol. 1518, pp. 159–170. Springer, Heidelberg (1998)
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
López-Ortiz, A., Salinger, A. (2013). On the Sublinear Processor Gap for Parallel Architectures. In: Chan, TH.H., Lau, L.C., Trevisan, L. (eds) Theory and Applications of Models of Computation. TAMC 2013. Lecture Notes in Computer Science, vol 7876. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38236-9_18
DOI: https://doi.org/10.1007/978-3-642-38236-9_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38235-2
Online ISBN: 978-3-642-38236-9