Abstract
With the proliferation of data-intensive applications, it has become necessary to develop new techniques to handle massive data sets. Traditional algorithmic techniques and data structures are not always suitable for the volume of data involved, or for the fact that the data often streams by and cannot be accessed again. A field of research established over the past decade is that of handling massive data sets using data synopses and developing algorithmic techniques for data stream models. We will discuss some of the research work that has been done in the field, and provide a decade's perspective on data synopses and data streams.
References
Alon, N., Matias, Y., Szegedy, M.: The space complexity of approximating the frequency moments. J. of Computer and System Sciences 58, 137–147 (1999); STOC 1996 Special Issue
Babcock, B., Babu, S., Datar, M., Motwani, R., Widom, J.: Models and issues in data stream systems. In: Proc. Symposium on Principles of Database Systems, pp. 1–16 (2002)
Gibbons, P.B., Matias, Y.: Synopses data structures for massive data sets. External memory algorithms, DIMACS Series Discrete Math. & TCS, AMS 50 (1999); also SODA 1999
Matias, Y.: Data streams and data synopses for massive data sets, http://www.cs.tau.ac.il/~matias/streams/
Muthukrishnan, S.: Data streams: Algorithms and applications, http://www.cs.rutgers.edu/~muthu/stream-1-1.ps
Vitter, J.S.: External memory algorithms and data structures. ACM Comput Surv. 33(2), 209–271 (2001)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Matias, Y. (2005). Data Streams and Data Synopses for Massive Data Sets (Invited Talk). In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds) Machine Learning: ECML 2005. ECML 2005. Lecture Notes in Computer Science, vol 3720. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564096_6
DOI: https://doi.org/10.1007/11564096_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29243-2
Online ISBN: 978-3-540-31692-3
eBook Packages: Computer Science, Computer Science (R0)