Mining Data Streams

Aggarwal, Charu C.

doi:10.1007/978-3-319-14142-8_12

Charu C. Aggarwal

Abstract

Advances in hardware technology have led to new ways of collecting data at a more rapid rate than before. For example, many transactions of everyday life, such as using a credit card or a phone, lead to automated data collection. Similarly, new ways of collecting data, such as wearable sensors and mobile devices, have added to the deluge of dynamically available data. An important assumption in these forms of data collection is that the data continuously accumulate over time at a rapid rate. These dynamic data sets are referred to as data streams.

Notes

1. This value is always at most 1, because \(k <1/\lambda \).
2. In the event that each distinct element is associated with a nonnegative frequency, the count-min sketch can be updated with the frequency value. Only the simple case of unit updates is discussed here.
3. It is exactly equal to \(n_s/m\), where \(n_s\) is the frequency of all items other than \(Y\). However, \(n_s\) is less than \(n_f\) by the frequency of \(Y\).
4. The position of the least significant bit is 0, the next most significant bit is 1, and so on.
5. This terminology is different from the \(k\)-medians approach introduced in Chap. 6. The relevant subroutines in the STREAM algorithm are more similar to a \(k\)-medoids algorithm. Nevertheless, the “\(k\)-medians” terminology is used here to ensure consistency with the original research paper describing STREAM [240].
6. The argument also applies to general attributes by first transforming them to binary data with discretization and binarization.
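
The frequency-valued update mentioned in note 2 can be illustrated with a minimal sketch. This is not the chapter's own implementation: the class name `CountMinSketch`, the parameters `width` and `depth`, and the use of salted BLAKE2b hashes are all assumptions made for illustration. Here `width` is assumed to play the role of the hash range \(m\) in note 3.

```python
import hashlib


class CountMinSketch:
    """Illustrative count-min sketch; all names and parameters are assumptions."""

    def __init__(self, width, depth):
        self.width = width  # cells per row (assumed to correspond to m)
        self.depth = depth  # number of hash functions, one per row
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One hash function per row, obtained by salting BLAKE2b with the row id.
        digest = hashlib.blake2b(
            str(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, item, count=1):
        # count=1 is the unit update; a larger nonnegative count is the
        # frequency-valued update described in note 2.
        if count < 0:
            raise ValueError("count must be nonnegative")
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only add to a cell, so the minimum over the rows
        # never falls below the true frequency of the item.
        return min(
            self.table[row][self._index(item, row)]
            for row in range(self.depth)
        )


sketch = CountMinSketch(width=64, depth=4)
sketch.update("a", 5)  # frequency-valued update
sketch.update("b")     # unit update
sketch.update("a", 2)
print(sketch.estimate("a"))  # at least 7; larger only if "b" collides in every row
```

Because every update adds a nonnegative amount to one cell per row, the estimate can overstate a frequency but never understate it, which is why nonnegative frequencies are required for this form of update.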

Author information

Authors and Affiliations

IBM T.J. Watson Research Center, Yorktown Heights, New York, USA
Charu C. Aggarwal

Authors

Charu C. Aggarwal

Corresponding author

Correspondence to Charu C. Aggarwal.

About this chapter

Cite this chapter

Aggarwal, C. (2015). Mining Data Streams. In: Data Mining. Springer, Cham. https://doi.org/10.1007/978-3-319-14142-8_12

DOI: https://doi.org/10.1007/978-3-319-14142-8_12
Published: 14 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14141-1
Online ISBN: 978-3-319-14142-8
eBook Packages: Computer Science, Computer Science (R0)
