Big Continuous Data: Dealing with Velocity by Composing Event Streams

Chapter in: Big Data Concepts, Theories, and Applications

Abstract

The rate at which we produce data is growing steadily, creating ever larger streams of continuously evolving data. Online news, micro-blogs, and search queries are just a few examples of these continuous streams of user activities. The value of these streams lies in their freshness and their relatedness to ongoing events. Modern applications consuming these streams need to extract behaviour patterns, which can be obtained by aggregating and mining huge event histories both statically and dynamically. An event is the notification that a happening of interest has occurred. Event streams must be combined or aggregated to produce more meaningful information. By combining and aggregating them, either from multiple producers or from a single one during a given period of time, a limited set of events describing meaningful situations can be notified to consumers. Because of their volume and continuous production, event streams relate mainly to two of the characteristics attributed to Big Data by the 5V’s model: volume and velocity. Techniques such as complex pattern detection, event correlation, event aggregation, event mining, and stream processing have been used for composing events. Nevertheless, to the best of our knowledge, few approaches integrate different composition techniques (online and post-mortem) for dealing with Big Data velocity. This chapter gives an analytical overview of event stream processing and composition approaches: complex event languages, services, and event querying systems on distributed logs. Our analysis underlines the challenges introduced by Big Data velocity and volume and uses them as a reference for identifying the scope and limitations of results stemming from different disciplines: networks, distributed systems, stream databases, event composition services, and data mining on traces.

Notes

  1. http://www.streambase.com.

  2. http://www.dbms2.com/category/products-and-vendors/truviso/.

  3. https://storm.apache.org.

  4. E1 | E2 matches either events of type E1 or E2.

  5. E* is the concatenation of zero or more events of type E.

References

  1. Jagadish HV, Gehrke J, Labrinidis A, Papakonstantinou Y, Patel JM, Ramakrishnan R, Shahabi C (2014) Big Data and its technical challenges. Commun ACM 57:86–94

  2. Terry D, Goldberg D, Nichols D, Oki B (1992) Continuous queries over append-only databases. ACM SIGMOD Record

  3. Zheng B, Lee DL (2001) Semantic caching in location-dependent query processing. In: Proceedings of the 7th international symposium on advances in spatial and temporal databases (SSTD), Redondo Beach, CA, USA

  4. Urhan T, Franklin MJ (2000) Xjoin: a reactively-scheduled pipelined join operator. IEEE Data Eng Bull 23:27–33

  5. De Francisci Morales G (2013) SAMOA: a platform for mining Big Data streams. In: Proceedings of the 22nd international conference on World Wide Web Companion, Geneva, Switzerland, pp 777–778

  6. Adiba M, Castrejón JC, Espinosa-Oviedo JA, Vargas-Solar G, Zechinelli-Martini JL (2015) Big data management: challenges, approaches, tools and their limitations. Networking for big data

  7. Abiteboul S, Manolescu I, Benjelloun O, Milo T, Cautis B, Preda N (2004) Lazy query evaluation for active xml. In: Proceedings of the SIGMOD international conference

  8. Luckham D (2002) The power of events: an introduction to complex event processing in distributed systems. Addison Wesley Professional

  9. Carney D, Çetintemel U, Cherniack M, Convey C, Lee S, Seidman G, Stonebraker M, Tatbul N, Zdonik SB (2002) Monitoring streams: a new class of data management applications. In: Proceedings of the 28th international conference on very large data bases (VLDB), Hong Kong, China

  10. Babu S, Widom J (2001) Continuous queries over data streams. SIGMOD Rec 30:109–120

  11. Liu L, Pu C, Tang W (1999) Continual queries for internet scale event-driven information delivery. IEEE Trans Knowl Data Eng 11:610–628

  12. Chen J, DeWitt DJ, Tian F, Wang Y (2000) NiagaraCQ: a scalable continuous query system for Internet databases. In: Proceedings of SIGMOD international conference on management of data, New York, USA

  13. Dittrich J-P, Fischer PM, Kossmann D (2005) Agile: adaptive indexing for context-aware information filters. In: Proceedings of the SIGMOD international conference

  14. Agarwal PK, Xie J, Yang J, Yu H (2006) Scalable continuous query processing by tracking hotspots. In: Proceedings of the 32nd international conference on very large data bases (VLDB), Seoul, Korea

  15. Schreier U, Pirahesh H, Agrawal R, Mohan C (1991) Alert: an architecture for transforming a passive dbms into an active dbms. In: Proceedings international conference very large data bases

  16. Cao H, Wolfson O, Xu B, Yin H (2005) Mobi-dic: mobile discovery of local resources in peer-to-peer wireless network. IEEE Data Eng Bull 28:11–18

  17. Mokbel MF, Xiong X, Aref WG, Hambrusch S, Prabhakar S, Hammad M (2004) Place: a query processor for handling real-time spatio-temporal data streams (demo). In: Proceedings of the 30th conference on very large data bases (VLDB), Toronto, Canada

  18. Hellerstein JM, Franklin MJ, Chandrasekaran S, Deshpande A, Hildrum K, Madden S, Raman V, Shah MA (2000) Adaptive query processing: technology in evolution. IEEE Data Eng Bull 23:7–18

  19. Anicic D, Fodor P, Rudolph S, Stühmer R, Stojanovic N, Studer R (2010) A rule-based language for complex event processing and reasoning. In: Hitzler P, Lukasiewicz T (eds) Web reasoning and rule systems. Springer, Heidelberg

  20. Hirzel M, Andrade H, Gedik B, Jacques-Silva G, Khandekar R, Kumar V, Mendell M, Nasgaard H, Schneider S, Soule R, Wu K-L (2013) IBM streams processing language: analyzing big data in motion. IBM J Res Dev 57:1–11

  21. Zikopoulos PC, Eaton C, DeRoos D, Deutsch T, Lapis G (2011) Understanding big data. McGraw-Hill, New York

  22. Yao Y, Gehrke J (2003) Query processing in sensor networks. In: Proceedings of the first biennial conference on innovative data systems research (CIDR)

  23. Zadorozhny V, Chrysanthis PK, Labrinidis A (2004) Algebraic optimization of data delivery patterns in mobile sensor networks. In: Proceedings of the 15th international workshop on database and expert systems applications (DEXA), Zaragoza, Spain

  24. Li H-G, Chen S, Tatemura J, Agrawal D, Candan K, Hsiung W-P (2006) Safety guarantee of continuous join queries over punctuated data streams. In: Proceedings of the 32nd international conference on very large databases (VLDB), Seoul, Korea

  25. Wolfson O, Sistla AP, Xu B, Zhou J, Chamberlain S (1999) Domino: databases for moving objects tracking. In: Proceedings of the SIGMOD international conference on management of data, Philadelphia, PA, USA

  26. Avnur R, Hellerstein JM (2000) Eddies: continuously adaptive query processing. In: Proceedings of SIGMOD international conference on management of data, New York, USA

  27. Chakravarthy S, Mishra D (1994) Snoop: an expressive event specification language for active databases. Data Knowl Eng 14:1–26

  28. Chakravarthy S, Krishnaprasad V, Anwar E, Kim SK (1994) Composite events for active databases: semantics, contexts and detection. In: Proceedings of the 20th international conference on very large data bases (VLDB), Santiago, Chile

  29. Zheng Y, Capra L, Wolfson O, Yang H (2014) Urban computing: concepts methodologies and applications. ACM Trans Intell Syst Technol 5:1–55

  30. Mansouri-Samani M, Sloman M (1997) GEM: a generalized event monitoring language for distributed systems. Distrib Eng J 4:96

  31. Rosenblum DS, Wolf AL (1997) A design framework for internet-scale event observation and notification. In: Proceedings of the 6th European software engineering conference, Zurich, Switzerland

  32. Yuhara M, Bershad BN, Maeda C, Moss JEB (1994) Efficient packet demultiplexing for multiple endpoints and large messages. In: Proceedings of the 1994 winter USENIX conference

  33. Bailey ML, Gopal B, Sarkar P, Pagels MA, Peterson LL (1994) Pathfinder: a pattern-based packet classifier. In: Proceedings of the 1st symposium on operating system design and implementation

  34. Gatziu S, Dittrich KR (1994) Detecting composite events in active database systems using Petri nets. In: Proceedings of the 4th international workshop on research issues in data engineering: active database systems, Houston, TX, USA

  35. Collet C, Coupaye T (1996) Primitive and composite events in NAOS. In: Proceedings of the 12th BDA Journées Bases de Données Avancées, Clermont-Ferrand, France

  36. Bidoit N, Objois M (2007) Machine Flux de Données: comparaison de langages de requêtes continues. In: Proceedings of the 23rd BDA Journées Bases de Données Avancées, Marseille, France

  37. Gehani NH, Jagadish HV, Shmueli O (1992) Event specification in an active object-oriented database. In: Proceedings of the ACM SIGMOD international conference on management of data

  38. Pietzuch PR, Shand B, Bacon J (2004) Composite event detection as a generic middleware extension. IEEE Netw Mag Spec Issue Middlew Technol Future Commun Netw 18:44–55

  39. Yoneki E, Bacon J (2005) Unified semantics for event correlation over time and space in hybrid network environments. In: Proceedings of the OTM conferences, pp 366–384

  40. Agrawal R, Srikant R (1995) Mining sequential patterns. In: Proceedings of the 11th international conference on data engineering, Taipei, Taiwan

  41. Giordana A, Terenziani P, Botta M (2002) Recognizing and discovering complex events in sequences. In: Proceedings of the 13th international symposium on foundations of intelligent systems, London, UK

  42. Wu E, Diao Y, Rizvi S (2006) High-performance complex event processing over streams. In: Proceedings of the ACM SIGMOD international conference on management of data, Chicago, IL, USA

  43. Demers AJ, Gehrke J, Panda B, Riedewald M, Sharma V, White WM (2007) Cayuga: a general purpose event monitoring system. In: Proceedings of the conference on innovative data systems research (CIDR), pp 412–422

  44. Balazinska M, Kwon Y, Kuchta N, Lee D (2007) Moirae: history-enhanced monitoring. In: Proceedings of the conference on innovative data systems research (CIDR)

  45. Gehani NH, Jagadish HV (1991) Ode as an active database: constraints and triggers. In: Proceedings of the 17th international conference on very large data bases (VLDB), Barcelona, Spain

  46. Gehani NH, Jagadish HV, Shmueli O (1992) Composite event specification in active databases: model & implementation. In: Proceedings of the 18th international conference on very large data bases, Vancouver, Canada

  47. Gatziu S, Dittrich KR (1993) SAMOS: an active object-oriented database system. IEEE Q Bull Data Eng Spec Issue Act Databases

  48. Adaikkalavan R (2002) Snoop event specification: formalization, algorithms, and implementation using interval-based semantics. MS Thesis, University of Texas, Arlington

  49. Chakravarthy S (1997) SENTINEL: an object-oriented DBMS with event-based rules. In: Proceedings of the ACM SIGMOD international conference on management of data, New York, USA

  50. Jakobson G, Weissman MD (1993) Alarm correlation. IEEE Netw 7:52–59

  51. Liu G, Mok AK, Yang EJ (1999) Composite events for network event correlation. In: Proceedings of the 6th IFIP/IEEE international symposium on integrated network management, pp 247–260

  52. Wu P, Bhatnagar R, Epshtein L, Bhandaru M, Shi Z (1998) Alarm correlation engine (ACE). In: Proceedings of the IEEE/IFIP network operation and management symposium, pp 733–742

  53. Nygate YA (1995) Event correlation using rule and object based techniques. In: Proceedings of the IFIP/IEEE international symposium on integrated network management, pp 278–289

  54. Appleby K, Goldszmidt G, Steinder M (2001) Yemanja – a layered event correlation engine for multi-domain server farms. Integr Netw Manag 7

  55. Yemini SA, Kliger S, Mozes E, Yemini Y, Ohsie D (1996) High speed and robust event correlation. IEEE Commun Mag 34:82–90

  56. Roncancio CL (1998) Towards duration-based, constrained and dynamic event types. In: Proceedings of the 2nd international workshop on active, real-time, and temporal database systems

Correspondence to Genoveva Vargas-Solar.

Appendix 1

The events e1 and e2 used in the following definitions are occurrences of the event types E1 and E2, respectively (with E1 ≠ E2), which can be any primitive or composite event types. An event is considered durative, i.e., it has a duration going from the instant when it starts until the instant when it ends [56], and its occurrence time is represented by a time interval [startT-e, endT-e].

1.1 Binary Operators

Binary operators derive a new composite event from two input events (primitive or composite). The following binary operators are defined by most existing event models [35, 46, 47, 56]:

  • Disjunction: (E1 | E2)

    There are two possible semantics for the disjunction operator “|”: exclusive-or and inclusive-or. Exclusive-or means that a composite event of type (E1 | E2) is initiated and terminated by the occurrence of either e1 of type E1 or e2 of type E2, whereas inclusive-or considers both events if they are simultaneous, i.e. they occur “at the same time”. In centralized systems, no two events can occur simultaneously; hence, the disjunction operator always corresponds to exclusive-or. In distributed systems, two events at different sites can occur simultaneously; hence, both exclusive-or and inclusive-or are applicable.

  • Conjunction: (E1, E2)

    A composite event of type (E1, E2) occurs if both e1 of type E1 and e2 of type E2 occur, regardless of their order of occurrence. Events e1 and e2 may be produced at the same site or at different sites. The event e1 is the initiator of the composite event and the event e2 is its terminator, or vice versa. Events e1 and e2 can overlap or be disjoint.

  • Sequence: (E1 ; E2)

    A composite event of type (E1 ; E2) occurs when an event e2 of type E2 occurs after an event e1 of type E1 has occurred. Sequence thus denotes that event e1 “happens before” event e2, which implies that the end time of e1 is guaranteed to be less than the start time of e2. However, the semantics of “happens before” differs depending on whether the composite event is a local or a global event. Therefore, although the syntax is the same for local and global events, the two cases have to be considered separately.

  • Concurrency: (E1 ║ E2)

    A composite event of type (E1 ║ E2) occurs if both events e1 of type E1 and e2 of type E2 occur virtually simultaneously, i.e. “at the same time”. Applied to two distinct events, this operator is therefore only meaningful for global events: e1 and e2 occur at different sites and it is not possible to establish an order between them. The concurrency relation is commutative.

  • During: (E2 during E1)

    The composite event of type (E2 during E1) occurs if an event e2 of type E2 happens during an event e1 of type E1, i.e. e2 starts after the beginning of e1 and ends before the end of e1.

  • Overlaps: (E1 overlaps E2)

    The beginning of event e1 of type E1 is before the beginning of event e2 of type E2 and the end of e1 is during e2, or vice versa.

  • Meets: (E1 meets E2)

    The beginning of event e2 of type E2 is immediately after the end of event e1 of type E1.

  • Starts: (E1 starts E2)

    The beginnings of event e1 of type E1 and event e2 of type E2 are simultaneous. The occurrence interval of (e1 starts e2) is [startT-e1, latest(endT-e1, endT-e2)].

  • Ends: (E1 ends E2)

    The ends of event e1 of type E1 and event e2 of type E2 are simultaneous. The ends relation is commutative. The occurrence interval of (e1 ends e2) is [earliest(startT-e1, startT-e2), endT-e2].
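The temporal relations above can be read as predicates over occurrence intervals. The following Python sketch is one possible reading and is not taken from the cited event models: the `Event` class, the function names, and the use of numeric timestamps with exact equality for "simultaneous" and "immediately after" are assumptions made for illustration. It covers only the single-clock (local) case and leaves out disjunction, conjunction, and concurrency, whose semantics depend on whether events are local or global.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Event:
    """A durative event occurrence (hypothetical helper type)."""
    type: str
    start: float  # startT-e
    end: float    # endT-e


def sequence(e1: Event, e2: Event) -> bool:
    # (E1 ; E2): e1 ends strictly before e2 starts.
    return e1.end < e2.start


def during(e2: Event, e1: Event) -> bool:
    # (E2 during E1): e2 starts after e1 starts and ends before e1 ends.
    return e1.start < e2.start and e2.end < e1.end


def _overlaps_one_way(a: Event, b: Event) -> bool:
    # a starts before b, and a ends while b is still in progress.
    return a.start < b.start < a.end < b.end


def overlaps(e1: Event, e2: Event) -> bool:
    # (E1 overlaps E2): the one-way relation "or vice versa".
    return _overlaps_one_way(e1, e2) or _overlaps_one_way(e2, e1)


def meets(e1: Event, e2: Event) -> bool:
    # (E1 meets E2): e2 begins at the instant e1 ends (assumption: exact match).
    return e1.end == e2.start


def starts(e1: Event, e2: Event) -> bool:
    # (E1 starts E2): both events begin simultaneously.
    return e1.start == e2.start


def ends(e1: Event, e2: Event) -> bool:
    # (E1 ends E2): both events end simultaneously.
    return e1.end == e2.end


def starts_interval(e1: Event, e2: Event) -> Tuple[float, float]:
    # [startT-e1, latest(endT-e1, endT-e2)]
    return (e1.start, max(e1.end, e2.end))


def ends_interval(e1: Event, e2: Event) -> Tuple[float, float]:
    # [earliest(startT-e1, startT-e2), endT-e2]
    return (min(e1.start, e2.start), e2.end)
```

In a distributed setting, the equality tests would typically be replaced by a comparison that tolerates clock uncertainty; the sketch does not attempt this.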

1.2 Selection Operators

Selection operators allow searching for occurrences of an event type in the event history. The selection E[i] defines the occurrence of the i-th element of a sequence of events of type E during a predefined time interval I, where i ∈ N and N is the set of natural numbers greater than 0. The following selection operators are distinguished in event models such as SAMOS [34, 47]:

  • First occurrence: (* E in I)

    The event is produced after the first occurrence of an event of type E during the time interval I. Subsequent occurrences of E during the interval do not produce the event again.

  • History: (Times(n, E) in I)

    An event is produced when an event of type E has occurred with the specified frequency n during the time interval I.

  • Negation: (Not E in I)

    The event is produced if no occurrence of the event type E is produced (i.e. the event did not occur) during the time interval I.
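The selection operators can be illustrated as queries over a recorded event history. The Python sketch below is a post-mortem reading under stated assumptions, not the SAMOS implementation: the `Event` class and all function names are hypothetical, an occurrence is taken to fall in I when its whole interval lies inside I, occurrences are ordered by end time, and `times` is read as "at least n occurrences".

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """A durative event occurrence (hypothetical helper type)."""
    type: str
    start: float
    end: float


Interval = Tuple[float, float]


def occurrences_in(history: List[Event], etype: str, interval: Interval) -> List[Event]:
    # Occurrences of type `etype` lying entirely inside the interval,
    # ordered by end time (both choices are assumptions of this sketch).
    lo, hi = interval
    matching = (e for e in history
                if e.type == etype and lo <= e.start and e.end <= hi)
    return sorted(matching, key=lambda e: e.end)


def select(history: List[Event], etype: str, i: int, interval: Interval) -> Optional[Event]:
    # E[i]: the i-th occurrence (1-based) of type E during I, if any.
    occ = occurrences_in(history, etype, interval)
    return occ[i - 1] if 1 <= i <= len(occ) else None


def first(history: List[Event], etype: str, interval: Interval) -> Optional[Event]:
    # (* E in I): only the first occurrence produces the event.
    return select(history, etype, 1, interval)


def times(history: List[Event], etype: str, n: int, interval: Interval) -> bool:
    # (Times(n, E) in I): E occurred at least n times during I.
    return len(occurrences_in(history, etype, interval)) >= n


def negation(history: List[Event], etype: str, interval: Interval) -> bool:
    # (Not E in I): no occurrence of E during I.
    return not occurrences_in(history, etype, interval)
```

Note that negation can only be decided once the interval I has elapsed, which is why the sketch evaluates it over a completed history rather than over a live stream.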

1.3 Temporal Operator

A composite event can be represented by the occurrence of an event and an offset (E + Δ); for example, E = E1 + 00:15 indicates fifteen minutes after the occurrence of an event of type E1. Thus, the occurrence time of E is [endT-e1, endT-e1 + Δ].
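As a worked illustration of the offset, the following Python sketch derives the occurrence interval [endT-e1, endT-e1 + Δ] from an occurrence e1. The `Event` class, the `offset` function, and the sample timestamps are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """A durative event occurrence (hypothetical helper type)."""
    type: str
    start: datetime
    end: datetime


def offset(e1: Event, delta: timedelta, etype: str = "E") -> Event:
    # E = E1 + delta: the composite occurrence spans [endT-e1, endT-e1 + delta].
    return Event(etype, e1.end, e1.end + delta)


# Illustrative occurrence of E1 from 09:00 to 09:30; E = E1 + 00:15.
e1 = Event("E1", datetime(2016, 1, 1, 9, 0), datetime(2016, 1, 1, 9, 30))
e = offset(e1, timedelta(minutes=15))
```

Here `e` starts at 09:30, when e1 ends, and ends at 09:45.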

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Vargas-Solar, G., Espinosa-Oviedo, J.A., Zechinelli-Martini, J.L. (2016). Big Continuous Data: Dealing with Velocity by Composing Event Streams. In: Yu, S., Guo, S. (eds) Big Data Concepts, Theories, and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-27763-9_1

  • DOI: https://doi.org/10.1007/978-3-319-27763-9_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27761-5

  • Online ISBN: 978-3-319-27763-9

  • eBook Packages: Computer Science (R0)
