Data Stream Mining Using Granularity-Based Approach

  • Chapter

Part of the book series: Studies in Computational Intelligence ((SCI,volume 206))

Summary

Many significant applications require data stream mining algorithms to run in resource-constrained environments, so adaptation is key to keeping the running algorithms consistent and continuous. This chapter provides a theoretical framework for applying the granularity-based approach to mining data streams. Our Algorithm Output Granularity (AOG) approach is explained in detail, giving practitioners the means to make their algorithms resource-aware and adaptive. AOG is formalized using the Probably Approximately Correct (PAC) learning model, which allows researchers to formalize the adaptability of their techniques. Finally, the integration of AOG with other adaptation strategies is discussed.


Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Gaber, M.M. (2009). Data Stream Mining Using Granularity-Based Approach. In: Abraham, A., Hassanien, AE., de Leon F. de Carvalho, A.P., Snášel, V. (eds) Foundations of Computational Intelligence Volume 6. Studies in Computational Intelligence, vol 206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01091-0_3

  • DOI: https://doi.org/10.1007/978-3-642-01091-0_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01090-3

  • Online ISBN: 978-3-642-01091-0

  • eBook Packages: Engineering, Engineering (R0)
