
Part of the book series: Lecture Notes in Computer Science (TLDKS, volume 7790)


Abstract

A burst, i.e., an unusually high frequency of occurrence of an event in a time window, is of interest in many monitoring applications that give rise to temporal data, as it often indicates abnormal activity. While the problem of detecting bursts in time-series data has been well addressed, the question of which choice of thresholds, on the number of events as well as on the window size, makes a window “unusually bursty” remains relevant. We consider the problem of finding critical values of both these thresholds. Since for most applications we have hardly any a priori idea of which combination of thresholds is critical, the range of possible values for either threshold can be very large. We formulate finding the combination of critical thresholds as a two-dimensional search problem and design efficient deterministic and randomized divide-and-conquer heuristics. For the deterministic heuristic, we show that under some weak assumptions, the computational overhead is logarithmic in the sizes of the ranges. Under identical assumptions, the expected computational overhead of the randomized heuristic in the worst case is also logarithmic. Using data obtained from logs of medical equipment, we conduct extensive simulations that reinforce our theoretical results and show that, on average, the randomized heuristic beats its deterministic counterpart in practice.
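
To make the search setting concrete, the following is a minimal one-dimensional sketch and not the chapter's two-dimensional heuristic, which the abstract does not spell out. It assumes a sorted list of event timestamps, a fixed window size `w`, and a hypothetical criterion `max_bursty` (the number of windows one is willing to label bursty). Under those assumptions the number of bursty windows never increases as the event threshold `k` grows, so a binary search over the range of `k` needs only a logarithmic number of evaluations. The function names and the stopping criterion are illustrative assumptions.

```python
from bisect import bisect_left


def count_bursty_windows(timestamps, k, w):
    """Count events t whose window [t, t + w) holds at least k events.

    Assumes `timestamps` is sorted in ascending order.
    """
    bursty = 0
    for i, t in enumerate(timestamps):
        # Index of the first event at or after the end of the window.
        end = bisect_left(timestamps, t + w)
        if end - i >= k:
            bursty += 1
    return bursty


def smallest_threshold(timestamps, w, k_lo, k_hi, max_bursty):
    """Binary search for the smallest k in [k_lo, k_hi] that leaves at most
    `max_bursty` bursty windows (hypothetical criterion, see lead-in).

    Returns None if even k_hi leaves too many bursty windows.
    """
    if count_bursty_windows(timestamps, k_hi, w) > max_bursty:
        return None
    lo, hi = k_lo, k_hi
    while lo < hi:
        mid = (lo + hi) // 2
        # The count is non-increasing in k, so the predicate is monotone.
        if count_bursty_windows(timestamps, mid, w) <= max_bursty:
            hi = mid
        else:
            lo = mid + 1
    return lo


events = [0, 1, 2, 3, 10, 20, 21, 30]
critical_k = smallest_threshold(events, w=5, k_lo=1, k_hi=10, max_bursty=1)
```

Here the search evaluates only a handful of candidate thresholds out of the ten in the range. Extending the idea to the window-size dimension as well is what turns it into the two-dimensional problem the abstract describes.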




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Lahiri, B., Akrotirianakis, I., Moerchen, F. (2013). Finding Critical Thresholds for Defining Bursts in Event Logs. In: Hameurlain, A., Küng, J., Wagner, R., Cuzzocrea, A., Dayal, U. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems VIII. Lecture Notes in Computer Science, vol 7790. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37574-3_4


  • DOI: https://doi.org/10.1007/978-3-642-37574-3_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37573-6

  • Online ISBN: 978-3-642-37574-3

  • eBook Packages: Computer Science, Computer Science (R0)
