Mining Data Streams

  • Chapter
Data Mining

Abstract

Advances in hardware technology have led to new ways of collecting data at a more rapid rate than before. For example, many transactions of everyday life, such as using a credit card or a phone, lead to automated data collection. Similarly, new ways of collecting data, such as wearable sensors and mobile devices, have added to the deluge of dynamically available data. An important assumption in these forms of data collection is that the data continuously accumulate over time at a rapid rate. These dynamic data sets are referred to as data streams.

Notes

  1. This value is always at most 1, because \(k < 1/\lambda\).

  2. In the event that each distinct element is associated with a nonnegative frequency, the count-min sketch can be updated with the frequency value. Only the simple case of unit updates is discussed here.

  3. It is exactly equal to \(n_s/m\), where \(n_s\) is the frequency of all items other than \(Y\). However, \(n_s\) is less than \(n_f\) by the frequency of \(Y\).

  4. The position of the least significant bit is 0, the next most significant bit is 1, and so on.

  5. This terminology is different from the \(k\)-medians approach introduced in Chap. 6. The relevant subroutines in the STREAM algorithm are more similar to a \(k\)-medoids algorithm. Nevertheless, the “\(k\)-medians” terminology is used here to ensure consistency with the original research paper describing STREAM [240].

  6. The argument also applies to general attributes by first transforming them to binary data with discretization and binarization.
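
The note on frequency-weighted updates to the count-min sketch can be illustrated with a minimal sketch. It is not taken from the chapter: the class name `CountMinSketch`, the parameters `w` (number of hash functions) and `m` (cells per hash function), and the use of salted BLAKE2b as the hash family are all assumptions made for illustration. A unit update is simply the case `count=1`.

```python
import hashlib


class CountMinSketch:
    """Illustrative count-min sketch; names and hash choice are assumptions."""

    def __init__(self, w=4, m=64):
        self.w = w  # number of hash functions (rows)
        self.m = m  # number of cells per hash function (columns)
        self.table = [[0] * m for _ in range(w)]

    def _cell(self, row, item):
        # Salting BLAKE2b with the row index gives one hash function per row.
        digest = hashlib.blake2b(
            repr(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.m

    def update(self, item, count=1):
        # A nonnegative frequency is added to one cell in every row.
        if count < 0:
            raise ValueError("count must be nonnegative")
        for row in range(self.w):
            self.table[row][self._cell(row, item)] += count

    def estimate(self, item):
        # Collisions can only inflate a cell, so the minimum over the rows
        # is an upper bound on the true frequency of the item.
        return min(
            self.table[row][self._cell(row, item)] for row in range(self.w)
        )


sketch = CountMinSketch(w=4, m=64)
sketch.update("a", 5)  # frequency-weighted update
sketch.update("b", 3)
sketch.update("a", 2)
print(sketch.estimate("a"))  # never below the true frequency of 7
```

Because every update adds a nonnegative amount, the estimate can overstate a frequency when items collide but can never understate it.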

Author information

Correspondence to Charu C. Aggarwal.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Aggarwal, C. (2015). Mining Data Streams. In: Data Mining. Springer, Cham. https://doi.org/10.1007/978-3-319-14142-8_12

  • DOI: https://doi.org/10.1007/978-3-319-14142-8_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-14141-1

  • Online ISBN: 978-3-319-14142-8

  • eBook Packages: Computer Science, Computer Science (R0)