Industrial Big Data Analytics: Challenges and Opportunities

  • Abdulrahman Al-Abassi
  • Hadis Karimipour
  • Hamed HaddadPajouh
  • Ali Dehghantanha
  • Reza M. Parizi


Manufacturing industries generate large amounts of data from various devices, systems and applications. Industry 4.0 faces challenges in both data management and data analysis, and few solutions exist for processing data at this scale. The data needs to be processed, analyzed and secured to help improve system efficiency, safety and scalability. Hence, a new approach is needed to support industrial big data analytics. Industry 4.0 is an advanced manufacturing vision that originated with the German government. Because the concept is new, only a few existing surveys discuss the connection between cyber physical systems and industrial big data analytics. Therefore, this survey presents new concepts, methodologies and application scenarios for reaching full industrial autonomy and brings more attention to the existing challenges between big data analytics and cyber physical systems. Current solutions, implemented through cyber physical systems, are discussed to highlight desired future research directions.


Keywords: Big data, IT, Cyber physical systems, IoT, Industry 4.0



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. School of Engineering, University of Guelph, Guelph, Canada
  2. Cyber Science Lab, School of Computer Science, University of Guelph, Guelph, Canada
  3. College of Computer and Software Engineering, Kennesaw State University, Marietta, USA
