Incremental Evaluation of Continuous Analytic Queries in HIFUN

Zervoudakis, Petros; Kondylakis, Haridimos; Plexousakis, Dimitris; Spyratos, Nicolas

doi:10.1007/978-3-030-44900-1_4

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1197)

Included in the following conference series:

International Workshop on Information Search, Integration, and Personalization

Abstract

A huge amount of data is generated each day from various sources. Analyzing such massive data is difficult and requires new forms of processing to enable enhanced decision making, insight discovery and process optimization. Moreover, besides their ever-increasing volume, datasets change frequently, so the results of continuous queries have to be updated at short intervals. In this paper, we address the problem of evaluating continuous queries over big data streams that are frequently updated, adopting HIFUN, a recently introduced high-level query language. HIFUN offers a clear separation between the conceptual layer, where analytic queries are defined independently of the nature and location of data, and the physical layer, where queries are evaluated by encoding them as map-reduce jobs or as SQL group-by queries. Using HIFUN, we devise an algorithm for incremental processing of continuous queries that processes only the most recent data partition and exploits already computed information, without re-evaluating the query over the complete dataset. Subsequently, we translate the generic algorithm to both SQL and MapReduce using SPARK, exploiting the query rewriting method provided by HIFUN. The experiments performed show the advantages of our solution in terms of query answering efficiency.
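The incremental idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm and not HIFUN syntax: it only assumes that a query groups records by some key and reduces a measure with a distributive operation (here sum and count, from which an average can be derived). The function names `evaluate_partition`, `merge` and `averages`, and the sample records, are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, Tuple

# Per-group partial result: (running sum, running count).
State = Dict[Hashable, Tuple[float, int]]


def evaluate_partition(
    records: Iterable[dict],
    group_by: Callable[[dict], Hashable],
    measure: Callable[[dict], float],
) -> State:
    """Evaluate the group-by aggregate over a single data partition."""
    partial = defaultdict(lambda: (0.0, 0))
    for record in records:
        key = group_by(record)
        total, count = partial[key]
        partial[key] = (total + measure(record), count + 1)
    return dict(partial)


def merge(previous: State, delta: State) -> State:
    """Combine the stored result with the result of the newest partition."""
    merged = dict(previous)
    for key, (total, count) in delta.items():
        prev_total, prev_count = merged.get(key, (0.0, 0))
        merged[key] = (prev_total + total, prev_count + count)
    return merged


def averages(state: State) -> Dict[Hashable, float]:
    """Derive per-group averages from the (sum, count) pairs."""
    return {key: total / count for key, (total, count) in state.items()}


# Hypothetical data: two partitions arriving one after the other.
batch_1 = [{"branch": "A", "qty": 10}, {"branch": "B", "qty": 4}]
batch_2 = [{"branch": "A", "qty": 6}, {"branch": "C", "qty": 2}]


def by_branch(record: dict) -> str:
    return record["branch"]


def quantity(record: dict) -> float:
    return record["qty"]


# Only the newest partition is scanned; the earlier result is reused.
state = evaluate_partition(batch_1, by_branch, quantity)
state = merge(state, evaluate_partition(batch_2, by_branch, quantity))
```

Keeping (sum, count) pairs rather than finished averages is what makes the merge step possible in this sketch: an average alone cannot be combined with a new partition, whereas sums and counts can simply be added.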

Author information

Authors and Affiliations

Institute of Computer Science, FORTH, Heraklion, Greece
Petros Zervoudakis, Haridimos Kondylakis & Dimitris Plexousakis
Laboratoire de Recherche en Informatique, UMR8623 of CNRS, Universite Paris-Sud 11, Orsay, France
Nicolas Spyratos

Corresponding author

Correspondence to Petros Zervoudakis.

Editor information

Editors and Affiliations

Foundation for Research and Technology Hellas, Heraklion, Greece
Giorgos Flouris
University of Cergy-Pontoise, Cergy Pontoise, France
Dominique Laurent
Foundation for Research and Technology Hellas, Heraklion, Greece
Dimitris Plexousakis
University of Paris-Sud, Orsay, France
Nicolas Spyratos
Hokkaido University, Sapporo, Japan
Yuzuru Tanaka

About this paper

Cite this paper

Zervoudakis, P., Kondylakis, H., Plexousakis, D., Spyratos, N. (2020). Incremental Evaluation of Continuous Analytic Queries in HIFUN. In: Flouris, G., Laurent, D., Plexousakis, D., Spyratos, N., Tanaka, Y. (eds) Information Search, Integration, and Personalization. ISIP 2019. Communications in Computer and Information Science, vol 1197. Springer, Cham. https://doi.org/10.1007/978-3-030-44900-1_4

DOI: https://doi.org/10.1007/978-3-030-44900-1_4
Published: 27 March 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44899-8
Online ISBN: 978-3-030-44900-1
eBook Packages: Computer Science, Computer Science (R0)