A synopsis of a dataset D is an abstract of D. A sketch also refers to an abstract of D, but the term is usually reserved for an abstract built by a sampling method.
Sketch/synopsis techniques have many applications. They are mainly used for statistics estimation in query optimization and for supporting on-line data analysis via approximate query processing. The goal is to develop effective and efficient techniques that build a synopsis in a small amount of space while achieving high precision. For instance, a key component of query optimization is estimating the result sizes of queries. Many techniques [1, 2] have been developed for this purpose, including histograms, wavelets, and join synopses.
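As an illustration of result-size estimation with a histogram synopsis, the following is a minimal sketch in Python. It is not taken from [1, 2]: it assumes an equi-width histogram over a single numeric attribute with values in [lo, hi], and it assumes values are spread uniformly within each bucket. The function names `build_histogram` and `estimate_range_count` are hypothetical.

```python
def build_histogram(values, num_buckets, lo, hi):
    """Build an equi-width histogram synopsis over the range [lo, hi].

    Assumption: all values lie in [lo, hi]; out-of-range values are
    clamped into the first or last bucket.
    """
    width = (hi - lo) / num_buckets
    counts = [0] * num_buckets
    for v in values:
        idx = int((v - lo) / width)
        idx = max(0, min(idx, num_buckets - 1))  # clamp to a valid bucket
        counts[idx] += 1
    return counts, lo, width


def estimate_range_count(hist, a, b):
    """Estimate how many values v satisfy a <= v < b.

    Assumption: values are uniformly distributed inside each bucket, so a
    bucket contributes its count scaled by the fraction of it that the
    query range overlaps.
    """
    counts, lo, width = hist
    total = 0.0
    for i, count in enumerate(counts):
        bucket_lo = lo + i * width
        bucket_hi = bucket_lo + width
        overlap = max(0.0, min(b, bucket_hi) - max(a, bucket_lo))
        total += count * overlap / width
    return total


if __name__ == "__main__":
    data = list(range(100))  # toy dataset D
    hist = build_histogram(data, num_buckets=10, lo=0, hi=100)
    print(estimate_range_count(hist, 25, 35))
```

The synopsis stores only `num_buckets` counters regardless of the size of D, which is the space/precision trade-off described above: more buckets cost more space but reduce the error introduced by the uniformity assumption.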
In data stream applications, the space requirements of synopses/sketches are critical, because they must be kept in memory for on-line query processing. Streams are usually massive in size and arrive at high rates; consequently it may be infeasible to keep a whole...
- 1. Alon N, Gibbons PB, Matias Y, Szegedy M. Tracking join and self-join sizes in limited storage. In: Proceedings of the 18th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems; 1999.
- 2. Gibbons PB, Matias Y. Synopsis data structures for massive data sets. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms; 1999.
- 3. Zhang Y, Lin X, Xu J, Korn F, Wang W. Space-efficient relative error order sketch over data streams. In: Proceedings of the 22nd International Conference on Data Engineering; 2006.