Enhancing Dependability in Big Data Analytics Enterprise Pipelines

  • Hira Zahid
  • Tariq Mahmood
  • Nassar Ikram
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11342)


Big Data Analytics (BDA) offers enterprises extensive opportunities to extract valuable information from high-volume, high-velocity, and high-variety data streams. However, BDA dynamics can lead to significant project failures because of high risk in data availability, reliability, integrity, security, and resilience. These attributes are the key components of a dependable system and are strongly linked to BDA process execution. Specifically, the heterogeneity of big data sources, the diverse challenges of big data integration and processing, and a rapidly expanding landscape create the need for dependable big data systems capable of providing standard analytical solutions. In this paper, we propose the first dependable pipeline architecture for the BDA process. The architecture has a layered front-end and back-end implementation, employs the standard lambda architecture in a DataOps analytical cycle, incorporates state-of-the-art tools that are all open source, and is coded entirely in standard Python to remove cross-platform implementation dependencies. We have implemented this architecture in five enterprise BDA projects, but we are unable to present implementation details and results due to space limitations.


Big Data Analytics · Dependability · DataOps · Pipeline · Enterprise



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Department of Computer Science, Institute of Business Administration, Karachi, Pakistan
  2. Department of Computer Science, National University of Science and Technology, Islamabad, Pakistan
