A big data analytical framework for analyzing solar energy receptors using evolutionary computing approach

  • Shahzad Yousaf
  • Imran Shafi
  • Sadia Din
  • Anand Paul
  • Jamil Ahmad
Original Research


Data science has been empowered by the emerging concept of big data, which enables data scalability in many ways. Effective prediction systems for complex analytical problems dealing with big data can be created using evolutionary computing together with associated feature selection and reduction techniques. In the current work, we put forward a big data analytical scheme to analyze solar energy receptors based on a set of features. Correct estimation of pressure loss coefficients (PLC) greatly improves the design of a solar collector. Evaluating PLC is a time- and resource-consuming process because the flow rate and Reynolds number change at every junction. Moreover, no suitable algebraic expression has yet been defined in the laminar flow region to approximate the complex relationship among the different geometrical features and flow variables. The overall heat gain of the solar receptor depends on the flow rates and the flow distribution in the risers. In addition, local disturbances during flow division and combination between the manifold and the risers affect the performance of the solar collector. Owing to this complexity, PLC are mostly determined experimentally. The proposed big data framework involves acquiring huge feature sets at each point along the flow of the thermal fluid. The data are acquired experimentally as a set of around forty features for a large number of Reynolds number and discharge ratio variations. The Reynolds number varies from 200 to 15,000, while the discharge ratio varies in the range 0–1. Feature reduction on the big data set is done by calculating a relevancy score with the ReliefF algorithm, which extracts the most relevant features. The framework then employs a suitably selected ANN architecture of layers, neurons and activation functions. The selected topology is trained on the reduced feature sets using the Levenberg–Marquardt backpropagation algorithm.
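
The relevancy-scoring step can be illustrated with a minimal sketch. The abstract names ReliefF; because PLC is a continuous quantity, the sketch below assumes the regression variant of the same family (RReliefF) in a simplified form: every sample is used as a probe, neighbours are weighted uniformly, and distances are Manhattan distances on range-scaled features. The function name `rrelieff_scores`, the neighbour count `k` and the synthetic data are illustrative assumptions, not the authors' implementation or data.

```python
import random


def rrelieff_scores(X, y, k=10):
    """Simplified RReliefF relevancy scores for a continuous target.

    X is a list of feature rows, y the list of target values (e.g. PLC).
    Assumptions of this sketch: every row is used as a probe, neighbours are
    weighted uniformly, and distances are Manhattan on range-scaled features.
    """
    n, n_feat = len(X), len(X[0])

    # Scale every feature and the target to [0, 1] so differences are comparable.
    def scale(col):
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0
        return [(v - lo) / span for v in col]

    cols = [scale([row[a] for row in X]) for a in range(n_feat)]
    Xs = [[cols[a][i] for a in range(n_feat)] for i in range(n)]
    ys = scale(y)

    n_dc = 0.0                 # accumulated difference in the target
    n_da = [0.0] * n_feat      # accumulated difference in each feature
    n_dcda = [0.0] * n_feat    # accumulated joint difference (target and feature)

    for i in range(n):
        # k nearest neighbours of row i in feature space.
        dist = sorted(
            (sum(abs(p - q) for p, q in zip(Xs[i], Xs[j])), j)
            for j in range(n) if j != i
        )
        for _, j in dist[:k]:
            d_target = abs(ys[i] - ys[j])
            n_dc += d_target / k
            for a in range(n_feat):
                d_feat = abs(Xs[i][a] - Xs[j][a])
                n_da[a] += d_feat / k
                n_dcda[a] += d_target * d_feat / k

    # W[a] = P(diff feature | diff target) - P(diff feature | same target)
    return [
        n_dcda[a] / n_dc - (n_da[a] - n_dcda[a]) / (n - n_dc)
        for a in range(n_feat)
    ]


# Synthetic check: the target depends on feature 0 only; feature 1 is noise.
rng = random.Random(0)
X_demo = [[rng.random(), rng.random()] for _ in range(150)]
y_demo = [row[0] for row in X_demo]
scores = rrelieff_scores(X_demo, y_demo, k=10)
ranking = sorted(range(len(scores)), key=lambda a: scores[a], reverse=True)
print(scores, ranking)
```

In a workflow like the one described, the highest-scoring features would be retained and passed to the network as the reduced feature set; how many to keep is a design choice that the abstract does not specify.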
Test and validation results demonstrate the efficacy of the proposed strategy and indicate that future PLC values can be forecast close to the experimental data. The relative percent error with respect to the experimental data set is around 10%, and the approach is found to be better than computational fluid dynamics based approaches in terms of memory and processing time.
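
The abstract does not spell out how the relative percent error is computed. A minimal sketch under one common definition, |predicted − measured| / |measured| × 100 averaged over samples, is given below. The function names and the PLC values are hypothetical and serve only as illustration, and the definition assumes the measured values are non-zero.

```python
def relative_percent_error(predicted, measured):
    """Per-sample relative error in percent: |pred - meas| / |meas| * 100."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    return [abs(p - m) / abs(m) * 100.0 for p, m in zip(predicted, measured)]


def mean_relative_percent_error(predicted, measured):
    """Average of the per-sample relative percent errors."""
    errors = relative_percent_error(predicted, measured)
    return sum(errors) / len(errors)


# Hypothetical PLC values, for illustration only (not data from the study).
measured_plc = [0.50, 0.80, 1.20, 2.00]
predicted_plc = [0.55, 0.76, 1.26, 1.90]
mean_error = mean_relative_percent_error(predicted_plc, measured_plc)
print(mean_error)
```

Reporting the per-sample list alongside the mean makes it easier to see whether the error concentrates in a particular flow regime, for example at low Reynolds numbers.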


Big data, Artificial neural networks, Computational fluid dynamics, Solar collectors, Pressure loss coefficients



This study was also supported by the National Research Foundation of Korea (NRF) Grant funded by the Korean Government (NRF-2017R1C1B5017464).



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  • Shahzad Yousaf (1)
  • Imran Shafi (2)
  • Sadia Din (3)
  • Anand Paul (3), corresponding author
  • Jamil Ahmad (2)

  1. Lahor University, Islamabad, Pakistan
  2. Abasyn University, Islamabad, Pakistan
  3. Kyungpook National University, Daegu, South Korea
