Abstract
Advances in hardware technology have led to new ways of collecting data at a more rapid rate than before. For example, many transactions of everyday life, such as using a credit card or a phone, lead to automated data collection. Similarly, new ways of collecting data, such as wearable sensors and mobile devices, have added to the deluge of dynamically available data. An important assumption in these forms of data collection is that the data continuously accumulate over time at a rapid rate. These dynamic data sets are referred to as data streams.
Notes
1. This value is always at most 1, because \(k < 1/\lambda\).
2. In the event that each distinct element is associated with a nonnegative frequency, the count-min sketch can be updated with the frequency value. Only the simple case of unit updates is discussed here.
3. It is exactly equal to \(n_s/m\), where \(n_s\) is the frequency of all items other than \(Y\). However, \(n_s\) is less than \(n_f\) by the frequency of \(Y\).
4. The position of the least significant bit is 0, the next most significant bit is 1, and so on.
5. This terminology is different from the \(k\)-medians approach introduced in Chap. 6. The relevant subroutines in the STREAM algorithm are more similar to a \(k\)-medoids algorithm. Nevertheless, the “\(k\)-medians” terminology is used here to ensure consistency with the original research paper describing STREAM [240].
6. The argument also applies to general attributes by first transforming them to binary data with discretization and binarization.
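The second note says that a count-min sketch can be updated with a nonnegative frequency value instead of a unit increment. The following is a minimal sketch of that idea, not the chapter's own implementation. The class name `CountMinSketch`, the parameters `width` and `depth`, and the use of salted SHA-256 digests as the row hash functions are all assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Illustrative count-min sketch: depth rows of width counters each."""

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Assumption: one hash function per row, simulated by salting
        # SHA-256 with the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def update(self, item, frequency=1):
        # Unit updates use frequency=1; a nonnegative frequency is added
        # to one counter in every row.
        if frequency < 0:
            raise ValueError("frequency must be nonnegative")
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += frequency

    def estimate(self, item):
        # The minimum over the rows never underestimates the true
        # frequency, because collisions can only add to a counter.
        return min(
            self.table[row][self._index(item, row)]
            for row in range(self.depth)
        )


sketch = CountMinSketch(width=64, depth=4)
sketch.update("a")        # unit update
sketch.update("a", 5)     # frequency-weighted update
sketch.update("b", 3)
print(sketch.estimate("a"), sketch.estimate("b"))
```

Because every update adds the same amount to exactly one counter per row, each row always sums to the total frequency seen so far, and an estimate can exceed the true frequency only through hash collisions.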
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Aggarwal, C. (2015). Mining Data Streams. In: Data Mining. Springer, Cham. https://doi.org/10.1007/978-3-319-14142-8_12
DOI: https://doi.org/10.1007/978-3-319-14142-8_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14141-1
Online ISBN: 978-3-319-14142-8
eBook Packages: Computer Science, Computer Science (R0)