Abstract
In recent years, advances in hardware technology have made data streams ubiquitous in a variety of applications. Because data streams are often generated by applications that change over time, it is often desirable to explore the underlying trends in the data as they evolve. In this paper, we survey some of our recent methods for change detection. In particular, we study methods that use clustering to provide a concise understanding of the underlying trends. We discuss our recent techniques that use micro-clustering to diagnose changes in the underlying data. We also discuss the extension of this method to text and categorical data sets, as well as to community detection in graph data streams.
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Aggarwal, C.C., Yu, P.S. (2006). On Clustering Techniques for Change Diagnosis in Data Streams. In: Nasraoui, O., Zaïane, O., Spiliopoulou, M., Mobasher, B., Masand, B., Yu, P.S. (eds) Advances in Web Mining and Web Usage Analysis. WebKDD 2005. Lecture Notes in Computer Science, vol 4198. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11891321_8
DOI: https://doi.org/10.1007/11891321_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46346-7
Online ISBN: 978-3-540-46348-1
eBook Packages: Computer Science (R0)