Data Streams: An Overview and Scientific Applications

Aggarwal, Charu C.

doi:10.1007/978-3-642-02788-8_14

Charu C. Aggarwal²

3006 Accesses
6 Citations

Abstract

In recent years, advances in hardware technology have facilitated the ability to collect data continuously. Simple transactions of everyday life such as using a credit card, a phone, or browsing the web lead to automated data storage. Similarly, advances in information technology have lead to large flows of data across IP networks. In many cases, these large volumes of data can be mined for interesting and relevant information in a wide variety of applications. When the volume of the underlying data is very large, it leads to a number of computational and mining challenges: With increasing volume of the data, it is no longer possible to process the data efficiently by using multiple passes. Rather, one can process a data item at most once. This leads to constraints on the implementation of the underlying algorithms. Therefore, stream mining algorithms typically need to be designed so that the algorithms work with one pass of the data. In most cases, there is an inherent temporal component to the stream mining process. This is because the data may evolve over time. This behavior of data streams is referred to as temporal locality. Therefore, a straightforward adaptation of one-pass mining algorithms may not be an effective solution to the task. Stream mining algorithms need to be carefully designed with a clear focus on the evolution of the underlying data. Another important characteristic of data streams is that they are often mined in a distributed fashion. Furthermore, the individual processors may have limited processing and memory. Examples of such cases include sensor networks, in which it may be desirable to perform in-network processing of data stream with limited processing and memory [1, 2]. This chapter will provide an overview of the key challenges in stream mining algorithms which arise from the unique setup in which these problems are encountered. This chapter is organized as follows. In the next section, we will discuss the generic challenges that stream mining poses to a variety of data management and data mining problems. The next section also deals with several issues which arise in the context of data stream management. In Sect. 3, we discuss several mining algorithms on the data stream model. Section 4 discusses various scientific applications of data streams. Section 5 discusses the research directions and conclusions.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Hardcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Author information

Authors and Affiliations

IBM T. J. Watson Research Center, New York, NY, USA
Charu C. Aggarwal

Authors

Charu C. Aggarwal
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Charu C. Aggarwal .

Editor information

Editors and Affiliations

Caulfield School of, Information Technology, Monash University, Caulfield East, Australia
Mohamed Medhat Gaber

Rights and permissions

Reprints and permissions

Copyright information

About this chapter

Cite this chapter

Aggarwal, C.C. (2009). Data Streams: An Overview and Scientific Applications. In: Gaber, M. (eds) Scientific Data Mining and Knowledge Discovery. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02788-8_14

Download citation

DOI: https://doi.org/10.1007/978-3-642-02788-8_14
Published: 31 July 2009
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02787-1
Online ISBN: 978-3-642-02788-8
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics