Improvement of Implemented Infrastructure for Streaming Outlier Detection in Big Data with ELK Stack

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 746)

Abstract

Growing internet usage is constantly increasing the amount of data being generated. As a result, the need to analyze this data has emerged, and we face a new phenomenon known as Big Data. This research focuses on finding an appropriate architecture for real-time big data analytics, and its main task is to detect anomalies in this real-time data. We have used and analyzed several tools in order to find the best one; in this paper we use Timelion and compare it with Fluentd, the tool we used in previous research [12], and we show the reasons why Timelion is better than Fluentd. Anomaly detection in real-time big data is a problem faced by many organizations and a challenge for researchers as well. Our research deals with developing infrastructure for monitoring the e-dnevnik (the national education system in Macedonia) application server and detecting errors in order to scale up performance. To enable this infrastructure to detect anomalies in streaming data, we implement different anomaly detection algorithms in Timelion. Another important consideration is how to visualize the results. In this paper, we show the visualization of an e-dnevnik log using Logstash, Elasticsearch, and Kibana, and how Timelion helps us identify anomalies in real time.
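
As a rough illustration of the kind of streaming anomaly detection described in the abstract, the sketch below flags points in a series of per-interval log event counts that deviate strongly from a trailing-window average. It is not the authors' algorithm and does not use Timelion; the function name `flag_anomalies`, the `window` and `threshold` parameters, and the sample counts are all hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def flag_anomalies(counts, window=5, threshold=3.0):
    """Return indices whose value lies more than `threshold` standard
    deviations from the mean of the preceding `window` values.

    Hypothetical helper for illustration; not taken from the paper.
    """
    anomalies = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any change at all is treated as anomalous.
            if counts[i] != mu:
                anomalies.append(i)
            continue
        if abs(counts[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies


# Made-up event counts per time bucket, with one obvious spike.
sample_counts = [10, 11, 9, 10, 12, 10, 11, 95, 10, 11]
print(flag_anomalies(sample_counts))  # [7]
```

A trailing window keeps the check usable on a stream, since each point is compared only with values that arrived before it. Note that a spike inflates the deviation of the windows that follow it, so nearby smaller anomalies can be masked.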

Keywords

Log data, Anomaly detection, World Wide Web, Real-time big data, Timelion, Visualization, Kibana, CSV log data, Fluentd, Logstash, Elasticsearch

References

  1. Aggarwal, C.C.: Outlier Analysis. Springer Science+Business Media, New York (2013)
  2. Hasani, Z., Kon-Popovska, M., Velinov, G.: Survey of technologies for real-time big data streams analytic. In: 11th International Conference on Informatics and Information Technologies, Bitola, Macedonia, 11–13 April 2014
  3. Hasani, Z., Kon-Popovska, M., Velinov, G.: Lambda architecture for real-time big data analytic. In: ICT Innovations 2014 Web Proceedings (2014). ISSN 1857-7288
  4. Hasani, Z.: Performance comparison throws running job in Hadoop by defining the number of maps and reduces. In: 12th International Conference on Informatics and Information Technologies 2015, Bitola, Macedonia, 24–26 April 2015
  5. Hasani, Z.: Virtuoso, system for saving semantic data. In: 12th International Conference on Informatics and Information Technologies 2015, Bitola, Macedonia, 24–26 April 2015
  6. Hasani, Z.: Robust anomaly detection algorithms for real-time big data: comparison of algorithms. In: 6th Mediterranean Conference on Embedded Computing (MECO). IEEE (2017)
  7.
  8. Kibana Timelion - Anomaly Detection, 18 January 2017. https://rmoff.net/2017/01/18/kibana-timelion-anomaly-detection/. Accessed 28 July 2017
  9.
  10. Hasani, Z., Jakimovski, B., Kon-Popovska, M., Velinov, G.: Real-time analytics of SQL queries based on log analytic. In: ICT Innovations 2015 Web Proceedings (2015). http://proceedings.ictinnovations.org/attachment/conference/12/ict-innovations-2015-web-proceedings.pdf. ISSN 1857-7288
  11. Tamura, K.: Elasticsearch, Fluentd, and Kibana: Open Source Log Search and Visualization. https://www.digitalocean.com/community/tutorials/elasticsearch-fluentd-and-kibana-open-source-log-search-and-visualization. Accessed 7 Jan 2016
  12. Hasani, Z.: Implementation of infrastructure for streaming outlier detection in big data. In: Rocha, Á., Correia, A., Adeli, H., Reis, L., Costanzo, S. (eds.) Recent Advances in Information Systems and Technologies, WorldCIST 2017. Advances in Intelligent Systems and Computing, vol. 570. Springer, Cham (2017)
  13. Kibana Timelion - Anomaly Detection, 18 January 2017. https://rmoff.net/2017/01/18/kibana-timelion-anomaly-detection/. Accessed 05 July 2017
  14.
  15. Comparison between Fluentd and Logstash. https://logz.io/blog/fluentd-logstash/. Accessed 20 Sept 2017

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Faculty of Computer Science, University of Prizren “Ukshin Hoti”, Prizren, Kosovo
  2. Faculty of Computer Science and Technologies, South East European University, Tetovo, Macedonia
