Optimized Analytics Query Allocation at the Edge of the Network
The new era of the Internet of Things (IoT) opens space for novel applications that will play a significant role in people’s daily lives through multiple services that facilitate everyday activities. The huge volumes of data produced by numerous IoT devices make analytics imperative for producing knowledge and supporting efficient decision making. In this setting, one can identify two main problems: the time required to send the data to the Cloud and wait for the final response, and the distributed nature of data collection. Edge Computing (EC) can offer the necessary basis for storing the collected data locally and providing the required analytics on top of them, limiting the response time. In this paper, we envision multiple edge nodes where the stored data are the subject of analytics queries. We propose a methodology for allocating queries, defined by end users or applications, to the appropriate edge nodes in order to save time and resources in the provision of responses. By adopting our scheme, we can request the execution of queries from only a subset of the available nodes, avoiding processing activities that would increase the response time. Our model envisions the allocation at specific epochs and manages a batch of queries at a time. We present the formulation of our problem and the proposed solution, and we report the results of an extensive evaluation that reveals the pros and cons of the proposed model.
Keywords: Internet of Things, Edge Computing, Large scale data, Queries management
This research received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 745829 and from the Greek Secretariat for Research Funding under the project ENFORCE.