High-Level Automatic Event Detection and User Classification in a Social Network Context

Persia, Fabio; Helmer, Sven

doi:10.1007/978-3-030-36537-0_10

Conference paper
First Online: 28 November 2019

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 11720)

Abstract

We present a framework for high-level automatic event detection and user classification in a social network context based on a novel temporal extension of relational algebra, which improves and extends our earlier work in the video surveillance context. By means of intuitive and interactive graphical user interfaces, a user is able to gain insights into the inner workings of the system as well as create new event models and user categories on the fly and track their processing through the system in both offline and online modes. Compared to an earlier version, we extended our relational algebra framework with operators suited for processing data from a social network context. As a proof-of-concept we have predefined events and user categories, such as spamming and fake users, on both a synthetic and a real data set containing data related to the interactions of users with Facebook over a 2-year period.

This work was supported by an internal grant from the Free University of Bozen-Bolzano under IN2078 (HAMSIK - High-level AutoMatic event detection in a SocIal networK context).

Notes

1. http://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/.
2. https://www.govtrack.us/.
3. We also provide an online help.
4. https://www.dropbox.com/sh/um0yucb8810nrhu/AAAt5kbr9Tsz4moEgghKgxeja?dl=0.

References

1. Amato, F., et al.: Recognizing human behaviours in online social networks. Comput. Secur. 74, 355–370 (2018)
2. Amato, F., De Santo, A., Moscato, V., Persia, F., Picariello, A.: Detecting unexplained human behaviors in social networks. In: Proceedings of the 2014 IEEE International Conference on Semantic Computing, ICSC 2014, pp. 143–150. IEEE, Newport Beach (2014)
3. Benevenuto, F., Rodrigues, T., Cha, M., Almeida, V.: Characterizing user navigation and interactions in online social networks. Inf. Sci. 195, 1–24 (2012)
4. Dignös, A., Böhlen, M., Gamper, J.: Overlap interval partition join. In: International Conference on Management of Data, SIGMOD 2014, pp. 1459–1470. ACM, Snowbird (2014)
5. Helmer, S., Persia, F.: High-level surveillance event detection using an interval-based query language. In: Proceedings of 2016 IEEE International Conference on Semantic Computing, ICSC 2016, pp. 39–46. IEEE, Laguna Hills (2016)
6. Helmer, S., Persia, F.: ISEQL: an interval-based surveillance event query language. Int. J. Multimed. Data Eng. Manag. (IJMDEM) 7(4), 1–21 (2016)
7. Irwin, A.S.M.: Double-edged sword: dual-purpose cyber security methods. In: Prunckun, H. (ed.) Cyber Weaponry. ASTSA, pp. 101–112. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-74107-9_8
8. Persia, F., Bettini, F., Helmer, S.: An interactive framework for video surveillance event detection and modeling. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, CIKM 2017, pp. 2515–2518. ACM, Singapore (2017)
9. Persia, F., Bettini, F., Helmer, S.: Labeling the frames of a video stream with interval events. In: Proceedings of the 2017 IEEE International Conference on Semantic Computing, ICSC 2017, pp. 204–211. IEEE, San Diego (2017)
10. Persia, F., Helmer, S.: A framework for high-level event detection in a social network context via an extension of ISEQL. In: Proceedings of the 2018 IEEE International Conference on Semantic Computing, ICSC 2018, pp. 140–147. IEEE, Laguna Hills (2018)
11. Piatov, D., Helmer, S., Dignös, A.: An interval join optimized for modern hardware. In: Proceedings of the 2016 IEEE 32nd International Conference on Data Engineering (ICDE), pp. 1098–1109. IEEE, Helsinki (2016)
12. Schneider, F., Feldmann, A., Krishnamurthy, B., Willinger, W.: Understanding online social network usage from a network perspective. In: Proceedings of the 9th ACM SIGCOMM Conference on Internet Measurement, pp. 35–48. ACM, New York (2009)

Author information

Authors and Affiliations

Free University of Bozen-Bolzano, Piazza Domenicani 3, 39100, Bolzano, Italy
Fabio Persia
University of Zurich, Binzmühlestrasse 14, 8050, Zurich, Switzerland
Sven Helmer

Corresponding author

Correspondence to Fabio Persia.

Editor information

Editors and Affiliations

George Mason University, Fairfax, VA, USA
Massimiliano Albanese
University of Luxembourg, Esch-sur-Alzette, Luxembourg
Ross Horne
Unitec Institute of Technology, Auckland, New Zealand
Christian W. Probst

Appendix

A The Framework Functionalities

In this appendix we take a closer look at the main functionalities provided by the framework for high-level automatic event detection and user classification. More specifically, we list them below and give more details in the following sections.

Detection of Low-Level Annotations (Sect. A.1).
Detection of Medium-Level Annotations (Sect. A.2).
Detection of High-Level Event Occurrences (Sect. A.3).
Detection of User Classifications (Sect. A.4).
Automatic High-Level Event Detection (Sect. A.5).
Definition of a New Atomic Predicate (Sect. A.6).
Definition of a New Medium-Level Predicate (Sect. A.7).
Definition of a New High-Level Event Model (Sect. A.8).

A.1 Detection of Low-Level Annotations

This functionality imports all the low-level annotations occurring within a specified temporal window; these annotations serve as input for the subsequent processing steps. So far, they can be imported from two different sources. The first is a real data set containing data related to the interactions of users with Facebook over a 2-year period, previously collected for [2]. The second is a synthetic data set, generated by a dedicated tool, whose size and event density (i.e., the number of events per time unit) users can control via parameters. The system is flexible enough to support imports from other sources in the future, including live data streams, subject to the applicable privacy policies.
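The two parameters of the synthetic source, size and density, can be illustrated with a minimal sketch. It is not the generator used in the framework: the function name `generate_log`, the `AtomicEvent` record, the exponential inter-arrival times, and the example user and action names are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomicEvent:
    """One low-level annotation (hypothetical record layout)."""
    timestamp: float  # time units since the start of the window
    user: str
    action: str


def generate_log(size, density, users, actions, seed=0):
    """Return `size` events with, on average, `density` events per time unit.

    Assumption: gaps between events are exponentially distributed with
    mean 1/density, so a higher density packs events closer together.
    """
    if size < 0 or density <= 0:
        raise ValueError("size must be >= 0 and density must be > 0")
    rng = random.Random(seed)  # seeded for reproducible data sets
    now = 0.0
    events = []
    for _ in range(size):
        now += rng.expovariate(density)
        events.append(AtomicEvent(now, rng.choice(users), rng.choice(actions)))
    return events


# Hypothetical usage: 1000 events at roughly two events per time unit.
log = generate_log(1000, 2.0, ["fabio", "sven"], ["like", "send_message"])
```

Because timestamps are accumulated, the resulting log is already ordered by time, which is convenient for the later labeling step.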

A.2 Detection of Medium-Level Annotations

This functionality detects the interval labeling corresponding to the captured OSN Log (Fig. 1). More specifically, Fig. 7 shows the medium-level annotations corresponding to the OSN Log listed in Table 1. In Fig. 7 we use as source the synthetically generated data set unibz mentioned in Sect. A.1 and the medium-level predicates status&friends, messages, photos, session, shares, like, logout in offline mode. The collected interval labeling is shown on the right-hand side of Fig. 7 and can be further processed in order to infer both high-level events and user classifications.
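One plausible way to picture interval labeling is sketched below. The actual labeling procedure of the framework is not described in this appendix, so the merging rule (consecutive actions of the same user and predicate are merged when they are at most `max_gap` apart), the function name `label_intervals`, and the action-to-predicate mapping are assumptions.

```python
def label_intervals(events, action_to_predicate, max_gap):
    """Turn (timestamp, user, action) events into labeled intervals.

    Returns a list of (user, predicate, start, end) tuples. Actions that
    have no entry in `action_to_predicate` are ignored.
    """
    open_intervals = {}  # (user, predicate) -> [start, end]
    closed = []
    for timestamp, user, action in sorted(events):
        predicate = action_to_predicate.get(action)
        if predicate is None:
            continue
        key = (user, predicate)
        current = open_intervals.get(key)
        if current is not None and timestamp - current[1] <= max_gap:
            current[1] = timestamp  # extend the running interval
        else:
            if current is not None:
                closed.append((user, predicate, current[0], current[1]))
            open_intervals[key] = [timestamp, timestamp]
    # Flush the intervals that are still open at the end of the log.
    for (user, predicate), (start, end) in open_intervals.items():
        closed.append((user, predicate, start, end))
    return sorted(closed, key=lambda row: (row[2], row[3], row[0], row[1]))


# Hypothetical mapping from atomic actions to medium-level predicates.
mapping = {"send_message": "messages", "upload_photo": "photos"}
sample = [
    (0, "fabio", "send_message"),
    (5, "fabio", "send_message"),
    (100, "fabio", "send_message"),
    (3, "fabio", "upload_photo"),
]
intervals = label_intervals(sample, mapping, max_gap=10)
```

In this sample the two messages sent five time units apart form one `messages` interval, while the message at time 100 starts a new one.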

A.3 Detection of High-Level Event Occurrences

This functionality detects occurrences of high-level events whose models are stored in the knowledge base. In this framework the knowledge base of event models is stored as a set of stored procedures in the PostgreSQL database management system. As shown in Fig. 8, the user simply needs to select the event to be discovered (SPAM in this case) and the data set to be investigated (unibz in this case). Clearly, for a data set to be available, it first has to be processed, i.e., it has to be labeled via interval labeling. The result of the use case shown in Fig. 8 is an instance of the SPAM event detected for the user FABIO from 11:39:12 to 19:40:00.

A.4 Detection of User Classifications

Similarly to the procedure in Sect. A.3, this functionality allows us to discover the category to which each OSN user belongs. More specifically, a client interested in carrying out an OSN user classification only has to specify the following:

the particular OSN user to be analyzed;
the time window where he/she wants to classify the selected OSN user;
the data set taken as reference.

As a result, a specific category is assigned to the OSN user depending on which classification of his/her sessions (spamming, status&friends, messages, photos, like, and inactive) appears most frequently. Thus, the categories to which the OSN user could belong are respectively Spammer, Interactive with Friends, Message Sender, Photo Poster, Like Adder, and Fake User. This is possible because all the defined event models are flexible, so they can also be applied to classify the users themselves, thus working at a lower (user) granularity.
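The majority rule described above can be sketched as follows. The session labels and user categories are taken from the text; the function name `classify_user`, the handling of users without sessions, and the tie-breaking behavior (the label seen first wins) are assumptions, since the appendix does not specify them.

```python
from collections import Counter

# Session classification -> user category, in the order given in the text.
SESSION_TO_CATEGORY = {
    "spamming": "Spammer",
    "status&friends": "Interactive with Friends",
    "messages": "Message Sender",
    "photos": "Photo Poster",
    "like": "Like Adder",
    "inactive": "Fake User",
}


def classify_user(session_labels):
    """Return the category of the most frequent session label.

    Returns None when the user has no sessions in the time window
    (an assumption; the source does not cover this case).
    """
    if not session_labels:
        return None
    most_frequent, _count = Counter(session_labels).most_common(1)[0]
    return SESSION_TO_CATEGORY[most_frequent]


# Hypothetical sessions of one user within the selected time window.
category = classify_user(["photos", "spamming", "spamming", "like"])
```

Here `category` is "Spammer", because two of the four sessions were classified as spamming.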

A.5 Automatic High-Level Event Detection

This functionality automatically carries out the overall process described in Sects. A.1, A.2, A.3, and A.4. The user only needs to specify all the inputs required in the previous sections once, and the whole process shown in Fig. 1 is performed; the output consists of the high-level events and the user classifications satisfying the specified constraints.

The process can be run in both offline and online modes. For the online mode, we stream one of the data sets past the event detector, emitting the atomic events according to their timestamps.
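The online mode can be pictured as replaying a stored data set in timestamp order. The sketch below is an assumption about how such a replay could look, not the framework's implementation; the names `replay`, `speedup`, and `sleep` are hypothetical, and events are modeled as tuples whose first element is the timestamp.

```python
import time


def replay(events, speedup=1.0, sleep=time.sleep):
    """Yield events in timestamp order, pausing between them.

    The pause before each event equals the gap to the previous event
    divided by `speedup`. The `sleep` function is injectable so that the
    replay can be exercised without actually waiting.
    """
    if speedup <= 0:
        raise ValueError("speedup must be > 0")
    previous = None
    for event in sorted(events, key=lambda e: e[0]):
        if previous is not None:
            sleep((event[0] - previous) / speedup)
        previous = event[0]
        yield event


# Hypothetical usage: record the pauses instead of sleeping.
pauses = []
stream = list(replay([(10, "a"), (0, "b"), (4, "c")], speedup=2.0, sleep=pauses.append))
```

A detector consuming this generator sees each atomic event only once its timestamp has been reached, which mimics a live stream while keeping the run reproducible.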

A.6 Definition of a New Atomic Predicate

This functionality allows the user to add another atomic event to the set of atomic actions listed in Table 2. For instance, the user in the use case shown in Fig. 9 inserts the atomic action named Interact with Game.

A.7 Definition of a New Medium-Level Predicate

Similarly to adding atomic predicates as described in Sect. A.6, this functionality allows us to insert a new medium-level predicate into the set of categories listed in Table 2. More specifically, by means of another smart graphical user interface, the user is able to directly write the PL/pgSQL code of the new medium-level predicate, also specifying the relationships with the low-level atomic actions.

A.8 Definition of a New High-Level Event Model

As mentioned in Sect. 2, this functionality allows a user who is not familiar with relational algebra to easily define a high-level event model; Figs. 5 and 6 illustrate an example of using the smart graphical user interface to define the Spamming event model.

In order to illustrate the advantages of the user interface, we describe the procedure for defining a new event model in the following. This is done in a step-by-step manner, by asking the user for (see Fig. 5):

the name of the new event (field Event Name);
the data set he or she would like to explore (from a list of available data sets) (field Data Set);
the medium-level predicates (or, as an alternative, already-defined events) associated with the intervals (operands) that he or she is currently adding to the global event (fields First Operand, Second Operand);
optional values for the arguments of the first/second operand in case of a medium-level predicate (field Argument, close to First/Second Operand); arguments can be easily added by clicking on the Add Argument button;
the possibility to carry out set operations between the two inserted interval predicates (field Operation);
drawing the two intervals (after clicking on the Draw Intervals button); then, the application core will capture the values of the left and right endpoints of both intervals (see for instance first and second lines of Fig. 6);
specifying how often the left/right interval (fields Left/Right Cardinality, respectively) has to appear in the result set. If the user selects YES, a pop-up window will ask him or her to select among three options: at least k times (k to be specified), more than one tuple (*), or exactly one tuple (one) [10]; otherwise, no further constraints are added;
specifying the overlap percentage between the two intervals with respect to the left/right interval (fields Left/Right Overlap Percentage, respectively). In case the user selects YES, a pop-up window will ask for the overlap percentage (from \(0\%\) to \(100\%\)); otherwise, no further constraints are added;
whether he or she wants to take into account the relationships between the left/right endpoints (fields Left Side, Right Side);
the maximum distance between interval endpoints (fields Left/Right Threshold). For overlapping events, the user specifies whether to take into account the distance between the left endpoints of the first and second operand or between the right endpoints of the two operands. For non-overlapping events, the user specifies whether to take into account the distance between the right endpoint of the first operand and the left endpoint of the second operand or between the left endpoint of the first operand and the right endpoint of the second operand. Depending on the information provided by the user, the application core infers the specific operator that will be applied;
the optional additional constraints between the first and second interval he or she would like to add, starting from the partial result set (clicking on Add EC, close to the External Conditions field, and then allowing the addition of constraints via a mask);
the fields he or she would like to project with reference to the current result set (field Field); the user just needs to select the fields to be projected, and click on Add i-th Field;
whether he or she wants to add more intervals to the complex event he or she is defining (field Add a new Sub-Event); in that case, the process is repeated starting from the third bullet point;
whether he or she wants to store the event model as a PL/pgSQL procedure (field Storing Procedure).

After each step the application core checks the consistency of the input. At the end of the procedure, a summary with the retrieved instances, if any, will be visible to the user.
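Two of the constraints collected in this procedure, the overlap percentage and the cardinality, can be made concrete with a short sketch. It only illustrates the arithmetic; the function names, the representation of intervals as (start, end) pairs, and the mode identifiers are assumptions and do not reflect the generated PL/pgSQL code.

```python
def overlap_percentage(left, right, relative_to="left"):
    """Percentage of the chosen interval that is covered by the other one.

    `left` and `right` are (start, end) pairs; `relative_to` selects the
    interval whose length serves as the denominator.
    """
    if relative_to not in ("left", "right"):
        raise ValueError("relative_to must be 'left' or 'right'")
    (left_start, left_end), (right_start, right_end) = left, right
    shared = max(0, min(left_end, right_end) - max(left_start, right_start))
    base = left_end - left_start if relative_to == "left" else right_end - right_start
    if base <= 0:
        raise ValueError("the reference interval must have positive length")
    return 100.0 * shared / base


def satisfies_cardinality(count, mode, k=None):
    """Check the three cardinality options offered by the pop-up window."""
    if mode == "at_least_k":
        if k is None:
            raise ValueError("k is required for mode 'at_least_k'")
        return count >= k
    if mode == "many":  # more than one tuple (*)
        return count > 1
    if mode == "one":  # exactly one tuple
        return count == 1
    raise ValueError("unknown cardinality mode: " + repr(mode))


# Hypothetical intervals: the shared part [5, 10] has length 5.
left_share = overlap_percentage((0, 10), (5, 25), relative_to="left")
right_share = overlap_percentage((0, 10), (5, 25), relative_to="right")
```

With these intervals the overlap is 50% of the left interval but only 25% of the right one, which is why the interface asks for the two percentages separately.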

About this paper

Cite this paper

Persia, F., Helmer, S. (2019). High-Level Automatic Event Detection and User Classification in a Social Network Context. In: Albanese, M., Horne, R., Probst, C. (eds) Graphical Models for Security. GraMSec 2019. Lecture Notes in Computer Science, vol 11720. Springer, Cham. https://doi.org/10.1007/978-3-030-36537-0_10

DOI: https://doi.org/10.1007/978-3-030-36537-0_10
Published: 28 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36536-3
Online ISBN: 978-3-030-36537-0
eBook Packages: Computer Science, Computer Science (R0)