Abstract
The rate at which we produce data is growing steadily, thus creating ever larger streams of continuously evolving data. Online news, micro-blogs, and search queries are just a few examples of these continuous streams of user activities. The value of these streams lies in their freshness and relatedness to ongoing events. Modern applications consuming these streams need to extract behaviour patterns that can be obtained by aggregating and mining huge event histories, both statically and dynamically. An event is the notification that a happening of interest has occurred. Event streams must be combined or aggregated to produce more meaningful information. By combining and aggregating them, either from multiple producers or from a single one during a given period of time, a limited set of events describing meaningful situations can be notified to consumers. With their volume and continuous production, event streams relate mainly to two of the characteristics attributed to Big Data by the 5V’s model: volume and velocity. Techniques such as complex pattern detection, event correlation, event aggregation, event mining, and stream processing have been used for composing events. Nevertheless, to the best of our knowledge, few approaches integrate different composition techniques (online and post-mortem) for dealing with Big Data velocity. This chapter gives an analytical overview of event stream processing and composition approaches: complex event languages, services, and event querying systems on distributed logs. Our analysis underlines the challenges introduced by Big Data velocity and volume and uses them as a reference for identifying the scope and limitations of results stemming from different disciplines: networks, distributed systems, stream databases, event composition services, and data mining on traces.
Notes
- 4. E1 | E2 matches events of either type E1 or type E2.
- 5. E* is the concatenation of zero or more events of type E.
References
Jagadish HV, Gehrke J, Labrinidis A, Papakonstantinou Y, Patel JM, Ramakrishnan R, Shahabi C (2014) Big Data and its technical challenges. Commun ACM 57:86–94
Terry D, Goldberg D, Nichols D, Oki B (1992) Continuous queries over append-only databases. ACM SIGMOD Record
Zheng B, Lee DL (2001) Semantic caching in location-dependent query processing. In: Proceedings of the 7th international symposium on advances in spatial and temporal databases (SSTD), Redondo Beach, CA, USA
Urhan T, Franklin MJ (2000) XJoin: a reactively-scheduled pipelined join operator. IEEE Data Eng Bull 23:27–33
De Francisci Morales G (2013) SAMOA: a platform for mining Big Data streams. In: Proceedings of the 22nd international conference on World Wide Web Companion, Geneva, Switzerland, pp 777–778
Adiba M, Castrejón JC, Espinosa-Oviedo JA, Vargas-Solar G, Zechinelli-Martini JL (2015) Big data management: challenges, approaches, tools and their limitations. Networking for big data
Abiteboul S, Manolescu I, Benjelloun O, Milo T, Cautis B, Preda N (2004) Lazy query evaluation for active xml. In: Proceedings of the SIGMOD international conference
Luckham D (2002) The power of events: an introduction to complex event processing in distributed systems. Addison Wesley Professional
Carney D, Cetintemel U, Cherniack M, Convey C, Lee S, Seidman G, Stonebraker M, Tatbul N, Zdonik SB (2002) Monitoring streams: a new class of data management applications. In: Proceedings of the 28th international conference on very large data bases (VLDB), Hong Kong, China
Babu S, Widom J (2001) Continuous queries over data streams. SIGMOD Rec 30:109–120
Liu L, Pu C, Tang W (1999) Continual queries for internet scale event-driven information delivery. IEEE Trans Knowl Data Eng 11:610–628
Chen J, DeWitt DJ, Tian F, Wang Y (2000) NiagaraCQ: a scalable continuous query system for Internet databases. In: Proceedings of SIGMOD international conference on management of data, New York, USA
Dittrich J-P, Fischer PM, Kossmann D (2005) Agile: adaptive indexing for context-aware information filters. In: Proceedings of the SIGMOD international conference
Agarwal PK, Xie J, Yang J, Yu H (2006) Scalable continuous query processing by tracking hotspots. In: Proceedings of the 32nd international conference on very large data bases (VLDB), Seoul, Korea
Schreier U, Pirahesh H, Agrawal R, Mohan C (1991) Alert: an architecture for transforming a passive dbms into an active dbms. In: Proceedings international conference very large data bases
Cao H, Wolfson O, Xu B, Yin H (2005) Mobi-dic: mobile discovery of local resources in peer-to-peer wireless network. IEEE Data Eng Bull 28:11–18
Mokbel MF, Xiong X, Aref WG, Hambrusch S, Prabhakar S, Hammad M (2004) Place: a query processor for handling real-time spatio-temporal data streams (demo). In: Proceedings of the 30th conference on very large data bases (VLDB), Toronto, Canada
Hellerstein JM, Franklin MJ, Chandrasekaran S, Deshpande A, Hildrum K, Madden S, Raman V, Shah MA (2000) Adaptive query processing: technology in evolution. IEEE Data Eng Bull 23:7–18
Anicic D, Fodor P, Rudolph S, Stühmer R, Stojanovic N, Studer R (2010) A rule-based language for complex event processing and reasoning. In: Hitzler P, Lukasiewicz T (eds) Web reasoning and rule systems. Springer, Heidelberg
Hirzel M, Andrade H, Gedik B, Jacques-Silva G, Khandekar R, Kumar V, Mendell M, Nasgaard H, Schneider S, Soule R, Wu K-L (2013) IBM streams processing language: analyzing big data in motion. IBM J Res Dev 57:1–11
Zikopoulos PC, Eaton C, DeRoos D, Deutsch T, Lapis G (2011) Understanding big data. McGraw-Hill, New York
Yao Y, Gehrke J (2003) Query processing in sensor networks. In: Proceedings of the first biennial conference on innovative data systems research (CIDR)
Zadorozhny V, Chrysanthis PK, Labrinidis A (2004) Algebraic optimization of data delivery patterns in mobile sensor networks. In: Proceedings of the 15th international workshop on database and expert systems applications (DEXA), Zaragoza, Spain
Li H-G, Chen S, Tatemura J, Agrawal D, Candan K, Hsiung W-P (2006) Safety guarantee of continuous join queries over punctuated data streams. In: Proceedings of the 32nd international conference on very large databases (VLDB), Seoul, Korea
Wolfson O, Sistla AP, Xu B, Zhou J, Chamberlain S (1999) Domino: databases for moving objects tracking. In: Proceedings of the SIGMOD international conference on management of data, Philadelphia, PA, USA
Avnur R, Hellerstein JM (2000) Eddies: continuously adaptive query processing. In: Proceedings of SIGMOD international conference on management of data, New York, USA
Chakravarthy S, Mishra D (1994) Snoop: an expressive event specification language for active databases. Data Knowl Eng 14:1–26
Chakravarthy S, Krishnaprasad V, Anwar E, Kim SK (1994) Composite events for active databases: semantics, contexts and detection. In: Proceedings of the 20th international conference on very large data bases (VLDB), Santiago, Chile
Zheng Y, Capra L, Wolfson O, Yang H (2014) Urban computing: concepts, methodologies, and applications. ACM Trans Intell Syst Technol 5:1–55
Mansouri-Samani M, Sloman M (1997) GEM: a generalized event monitoring language for distributed systems. Distrib Eng J 4:96
Rosenblum DS, Wolf AL (1997) A design framework for internet-scale event observation and notification. In: Proceedings of the 6th European software engineering conference, Zurich, Switzerland
Yuhara M, Bershad BN, Maeda C, Moss JEB (1994) Efficient packet demultiplexing for multiple endpoints and large messages. In: Proceedings of the 1994 winter USENIX conference
Bailey ML, Gopal B, Sarkar P, Pagels MA, Peterson LL (1994) Pathfinder: a pattern-based packet classifier. In: Proceedings of the 1st symposium on operating system design and implementation
Gatziu S, Dittrich KR (1994) Detecting composite events in active database systems using Petri nets. In: Proceedings of the 4th international workshop on research issues in data engineering: active database systems, Houston, TX, USA
Collet C, Coupaye T (1996) Primitive and composite events in NAOS. In: Proceedings of the 12th BDA Journées Bases de Données Avancées, Clermont-Ferrand, France
Bidoit N, Objois M (2007) Machine Flux de Données: comparaison de langages de requêtes continues. In: Proceedings of the 23rd BDA Journées Bases de Données Avancées, Marseille, France
Gehani NH, Jagadish HV, Shmueli O (1992) Event specification in an active object-oriented database. In: Proceedings of the ACM SIGMOD international conference on management of data
Pietzuch PR, Shand B, Bacon J (2004) Composite event detection as a generic middleware extension. IEEE Netw Mag Spec Issue Middlew Technol Future Commun Netw 18:44–55
Yoneki E, Bacon J (2005) Unified semantics for event correlation over time and space in hybrid network environments. In: Proceedings of the OTM conferences, pp 366–384
Agrawal R, Srikant R (1995) Mining sequential patterns. In: Proceedings of the 11th international conference on data engineering, Taipei, Taiwan
Giordana A, Terenziani P, Botta M (2002) Recognizing and discovering complex events in sequences. In: Proceedings of the 13th international symposium on foundations of intelligent systems, London, UK
Wu E, Diao Y, Rizvi S (2006) High-performance complex event processing over streams. In: Proceedings of the ACM SIGMOD international conference on management of data, Chicago, IL, USA
Demers AJ, Gehrke J, Panda B, Riedewald M, Sharma V, White WM (2007) Cayuga: a general purpose event monitoring system. In: Proceedings of the conference on innovative data systems research (CIDR), pp 412–422
Balazinska M, Kwon Y, Kuchta N, Lee D (2007) Moirae: history-enhanced monitoring. In: Proceedings of the conference on innovative data systems research (CIDR)
Gehani NH, Jagadish HV (1991) Ode as an active database: constraints and triggers. In: Proceedings of the 17th international conference on very large data bases (VLDB), Barcelona, Spain
Gehani NH, Jagadish HV, Shmueli O (1992) Composite event specification in active databases: model & implementation. In: Proceedings of the 18th international conference on very large data bases, Vancouver, Canada
Gatziu S, Dittrich KR (1993) SAMOS: an active object-oriented database system. IEEE Q Bull Data Eng Spec Issue Act Databases
Adaikkalavan R (2002) Snoop event specification: formalization, algorithms, and implementation using interval-based semantics. MS Thesis, University of Texas, Arlington
Chakravarthy S (1997) SENTINEL: an object-oriented DBMS with event-based rules. In: Proceedings of the ACM SIGMOD international conference on management of data, New York, USA
Jakobson G, Weissman MD (1993) Alarm correlation. IEEE Netw 7:52–59
Liu G, Mok AK, Yang EJ (1999) Composite events for network event correlation. In: Proceedings of the 6th IFIP/IEEE international symposium on integrated network management, pp 247–260
Wu P, Bhatnagar R, Epshtein L, Bhandaru M, Shi Z (1998) Alarm correlation engine (ACE). In: Proceedings of the IEEE/IFIP network operation and management symposium, pp 733–742
Nygate YA (1995) Event correlation using rule and object based techniques. In: Proceedings of the IFIP/IEEE international symposium on integrated network management, pp 278–289
Appleby K, Goldszmidt G, Steinder M (2001) Yemanja – a layered event correlation engine for multi-domain server farms. Integr Netw Manag 7
Yemini SA, Kliger S, Mozes E, Yemini Y, Ohsie D (1996) High speed and robust event correlation. IEEE Commun Mag 34:82–90
Roncancio CL (1998) Towards duration-based, constrained and dynamic event types. In: Proceedings of the 2nd international workshop on active, real-time, and temporal database systems
Appendix 1
The events e1 and e2 used in the following definitions are occurrences of the event types E1 and E2, respectively (with E1 ≠ E2), and can be of any primitive or composite event type. An event is considered durative, i.e., it has a duration going from the instant when it starts until the instant when it ends [56], and its occurrence time is represented by a time interval [startT-e, endT-e].
1.1.1 Binary Operators
Binary operators derive a new composite event from two input events (primitive or composite). The following binary operators are defined by most existing event models [35, 46, 47, 56]:
- Disjunction: (E1 | E2)
There are two possible semantics for the disjunction operator “|”: exclusive-or and inclusive-or. Exclusive-or means that a composite event of type (E1 | E2) is initiated and terminated by the occurrence of either e1 of type E1 or e2 of type E2, whereas inclusive-or considers both events if they are simultaneous, i.e. they occur “at the same time”. In centralized systems, no two events can occur simultaneously and hence the disjunction operator always corresponds to exclusive-or. In distributed systems, two events at different sites can occur simultaneously and hence both exclusive-or and inclusive-or are applicable.
- Conjunction: (E1 , E2)
A composite event of type (E1 , E2) occurs if both e1 of type E1 and e2 of type E2 occur, regardless of their order of occurrence. Events e1 and e2 may be produced at the same site or at different sites. The event e1 is the initiator of the composite event and the event e2 is its terminator, or vice versa. Events e1 and e2 can overlap or be disjoint.
- Sequence: (E1 ; E2)
A composite event of type (E1 ; E2) occurs when an e2 of type E2 occurs after an e1 of type E1 has occurred. Sequence thus denotes that event e1 “happens before” event e2, which implies that the end time of e1 is guaranteed to be less than the start time of e2. However, the semantics of “happens before” differs depending on whether the composite event is a local or a global event. Therefore, although the syntax is the same for local and global events, the two cases have to be considered separately.
- Concurrency: (E1 ║ E2)
A composite event of type (E1 ║ E2) occurs if both events e1 of type E1 and e2 of type E2 occur virtually simultaneously, i.e. “at the same time”. This implies that the operator, when applied to two distinct events, is only applicable to global events: the events e1 and e2 occur at different sites and it is not possible to establish an order between them. The concurrency relation is commutative.
- During: (E2 during E1)
The composite event of type (E2 during E1) occurs if an event e2 of type E2 happens during an event e1 of type E1, i.e. e2 starts after the beginning of e1 and ends before the end of e1.
- Overlaps: (E1 overlaps E2)
Event e1 of type E1 begins before event e2 of type E2 begins, and e1 ends during e2, or vice versa.
- Meets: (E1 meets E2)
The beginning of event e2 of type E2 is immediately after the end of event e1 of type E1.
- Starts: (E1 starts E2)
The beginnings of event e1 of type E1 and event e2 of type E2 are simultaneous. The occurrence interval of (e1 starts e2) is [startT-e1, latest(endT-e1, endT-e2)].
- Ends: (E1 ends E2)
The ends of event e1 of type E1 and event e2 of type E2 are simultaneous. The ends relation is commutative. The occurrence interval of (e1 ends e2) is [earliest(startT-e1, startT-e2), endT-e2].
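The interval-based operators above can be sketched as predicates over durative events. The following Python fragment is only an illustration under stated assumptions: the `Event` class, the function names, and the use of numeric timestamps are hypothetical, and "immediately after" in the meets operator is modelled as equality of the end and start timestamps, which is one possible reading on a discrete clock.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical durative event: a type name and an interval [start, end]."""
    type: str
    start: float  # corresponds to startT-e
    end: float    # corresponds to endT-e


def sequence(e1: Event, e2: Event) -> bool:
    # (E1 ; E2): e1 ends strictly before e2 starts.
    return e1.end < e2.start


def during(e2: Event, e1: Event) -> bool:
    # (E2 during E1): e2 starts after e1 starts and ends before e1 ends.
    return e1.start < e2.start and e2.end < e1.end


def overlaps(e1: Event, e2: Event) -> bool:
    # (E1 overlaps E2): e1 starts first and ends during e2.
    # The "or vice versa" case is obtained by swapping the arguments.
    return e1.start < e2.start < e1.end < e2.end


def meets(e1: Event, e2: Event) -> bool:
    # (E1 meets E2): assumption, e2 starts exactly when e1 ends.
    return e1.end == e2.start


def starts(e1: Event, e2: Event) -> bool:
    # (E1 starts E2): both events begin at the same instant.
    return e1.start == e2.start


def ends(e1: Event, e2: Event) -> bool:
    # (E1 ends E2): both events end at the same instant.
    return e1.end == e2.end


def starts_interval(e1: Event, e2: Event) -> tuple:
    # [startT-e1, latest(endT-e1, endT-e2)]
    return (e1.start, max(e1.end, e2.end))


def ends_interval(e1: Event, e2: Event) -> tuple:
    # [earliest(startT-e1, startT-e2), endT-e2]
    return (min(e1.start, e2.start), e2.end)
```

For instance, with `a = Event("E1", 0, 10)` and `b = Event("E2", 2, 5)`, `during(b, a)` holds while `sequence(a, b)` does not. Disjunction, conjunction, and concurrency are left out because their semantics depend on whether events are local or global, which this sketch does not model.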
1.1.2 Selection Operators
Selection operators allow searching for occurrences of an event type in the event history. The selection E[i] defines the occurrence of the i-th element of a sequence of events of type E during a predefined time interval I, where i is a natural number greater than 0. The following selection operators are distinguished in event models such as SAMOS [34, 47]:
- First occurrence: (*E in I)
The event is produced after the first occurrence of an event of type E during the time interval I. It is not produced again by any later occurrence of E during the interval.
- History: (Times(n, E) in I)
An event is produced when an event of type E has occurred with the specified frequency n during the time interval I.
- Negation: (Not E in I)
The event is produced if no occurrence of the event type E is produced (i.e. the event did not occur) during the time interval I.
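The selection operators can be illustrated over a small in-memory event history. This is a minimal sketch, not the SAMOS implementation: the history is assumed to be a list of `(type, timestamp)` pairs with point timestamps, all function names are hypothetical, and `times` follows one possible reading in which the composite event is triggered by the n-th occurrence of E inside I.

```python
from typing import List, Optional, Tuple

Occurrence = Tuple[str, float]   # (event type, timestamp); simplifying assumption
Interval = Tuple[float, float]   # closed interval I = [lo, hi]


def in_interval(history: List[Occurrence], event_type: str,
                interval: Interval) -> List[Occurrence]:
    """Occurrences of event_type inside the interval, ordered by time."""
    lo, hi = interval
    matches = [o for o in history if o[0] == event_type and lo <= o[1] <= hi]
    return sorted(matches, key=lambda o: o[1])


def first_occurrence(history: List[Occurrence], event_type: str,
                     interval: Interval) -> Optional[Occurrence]:
    # (*E in I): only the first occurrence of E in I produces the event.
    matches = in_interval(history, event_type, interval)
    return matches[0] if matches else None


def times(n: int, history: List[Occurrence], event_type: str,
          interval: Interval) -> Optional[Occurrence]:
    # (Times(n, E) in I): returns the n-th occurrence of E in I, which
    # triggers the composite event, or None if E occurred fewer than n times.
    matches = in_interval(history, event_type, interval)
    return matches[n - 1] if 0 < n <= len(matches) else None


def negation(history: List[Occurrence], event_type: str,
             interval: Interval) -> bool:
    # (Not E in I): true when no occurrence of E falls inside I.
    return not in_interval(history, event_type, interval)
```

With `history = [("E", 1.0), ("F", 2.0), ("E", 3.0), ("E", 9.0)]` and `I = (0, 5)`, `first_occurrence` selects the occurrence at time 1.0, `times(2, ...)` selects the one at time 3.0, and `negation` holds for a type such as `"G"` that never occurs in I.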
1.1.3 Temporal Operator
A composite event can be represented by the occurrence of an event and an offset (E + Δ), for example, E = E1 + 00:15 to indicate fifteen minutes after the occurrence of an event of type E1. Thus, the occurrence time of E is [endT-e1, endT-e1 + Δ].
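As a small worked example of the occurrence interval [endT-e1, endT-e1 + Δ], the fragment below computes it with Python's standard `datetime` module. The function name and the sample date are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta
from typing import Tuple


def offset_interval(end_e1: datetime,
                    delta: timedelta) -> Tuple[datetime, datetime]:
    """Occurrence interval of E = E1 + delta: [endT-e1, endT-e1 + delta]."""
    return (end_e1, end_e1 + delta)


# E = E1 + 00:15, with e1 ending at noon on an arbitrary sample date.
end_e1 = datetime(2016, 1, 1, 12, 0)
interval = offset_interval(end_e1, timedelta(minutes=15))
```

Here `interval` spans from 12:00 to 12:15 on the sample date.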
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Vargas-Solar, G., Espinosa-Oviedo, J.A., Zechinelli-Martini, J.L. (2016). Big Continuous Data: Dealing with Velocity by Composing Event Streams. In: Yu, S., Guo, S. (eds) Big Data Concepts, Theories, and Applications . Springer, Cham. https://doi.org/10.1007/978-3-319-27763-9_1
DOI: https://doi.org/10.1007/978-3-319-27763-9_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27761-5
Online ISBN: 978-3-319-27763-9
eBook Packages: Computer Science, Computer Science (R0)