Abstract
Streaming is an important paradigm for handling massive graphs that are too large to fit in main memory. In the streaming computational model, algorithms are restricted to use much less space than they would need to store the input. Furthermore, the input is accessed sequentially and can therefore be viewed as a stream of data elements. Although these restrictions limit the model, algorithms exist for many graph problems in it. We survey a set of algorithms that compute graph statistics, matchings and distances in a graph, and random walks. These are basic graph problems, and the algorithms that solve them may be used as building blocks in graph-data management and mining.
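To make the model concrete, the following is a minimal sketch of a one-pass greedy maximal matching over an edge stream. It is a standard textbook technique and is offered only as an illustration, not as the specific algorithm surveyed in the chapter. The names `greedy_matching` and `path_edges` are hypothetical, and the sketch assumes vertices are integers and that memory proportional to the number of vertices (but not the number of edges) is available.

```python
from typing import Iterable, Iterator, List, Set, Tuple

Edge = Tuple[int, int]


def greedy_matching(edge_stream: Iterable[Edge]) -> List[Edge]:
    """Scan the edges once and keep an edge if both endpoints are still free.

    Only the matched vertices and the chosen edges are stored, so the memory
    used grows with the number of vertices rather than the number of edges.
    """
    matched: Set[int] = set()    # vertices already covered by the matching
    matching: List[Edge] = []    # edges selected so far
    for u, v in edge_stream:     # sequential access: each edge is seen once
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.add(u)
            matched.add(v)
    return matching


def path_edges() -> Iterator[Edge]:
    """Hypothetical stream: the edges of the path 1-2-3-4-5, in order."""
    yield from [(1, 2), (2, 3), (3, 4), (4, 5)]


if __name__ == "__main__":
    print(greedy_matching(path_edges()))  # prints [(1, 2), (3, 4)]
```

The result is a maximal matching, which always contains at least half as many edges as a maximum matching. The outcome depends on the order in which edges arrive: if the stream delivered (2, 3) first, the same path would yield a smaller matching.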
Copyright information
© 2010 Springer-Verlag US
About this chapter
Cite this chapter
Zhang, J. (2010). A Survey on Streaming Algorithms for Massive Graphs. In: Aggarwal, C., Wang, H. (eds) Managing and Mining Graph Data. Advances in Database Systems, vol 40. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-6045-0_13
DOI: https://doi.org/10.1007/978-1-4419-6045-0_13
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-6044-3
Online ISBN: 978-1-4419-6045-0
eBook Packages: Computer Science (R0)