On Clustering Techniques for Change Diagnosis in Data Streams

  • Conference paper
Advances in Web Mining and Web Usage Analysis (WebKDD 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4198)

Abstract

In recent years, advances in hardware technology have made data streams ubiquitous across a variety of applications. Because the applications that generate these streams often change over time, it is often desirable to explore the underlying trends in the data as they evolve. In this paper, we survey some of our recent methods for change detection. In particular, we study methods that use clustering to provide a concise understanding of the underlying trends. We discuss our recent techniques, which use micro-clustering to diagnose changes in the underlying data. We also discuss the extension of this method to text and categorical data sets, as well as to community detection in graph data streams.


Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Aggarwal, C.C., Yu, P.S. (2006). On Clustering Techniques for Change Diagnosis in Data Streams. In: Nasraoui, O., Zaïane, O., Spiliopoulou, M., Mobasher, B., Masand, B., Yu, P.S. (eds) Advances in Web Mining and Web Usage Analysis. WebKDD 2005. Lecture Notes in Computer Science, vol 4198. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11891321_8

  • DOI: https://doi.org/10.1007/11891321_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-46346-7

  • Online ISBN: 978-3-540-46348-1

  • eBook Packages: Computer Science, Computer Science (R0)
