GRNN++: A Parallel and Distributed Version of GRNN Under Apache Spark for Big Data Regression

  • Conference paper
Data Management, Analytics and Innovation

Abstract

Popular neural network architectures for prediction include the multi-layer perceptron (MLP), radial basis function (RBF) network, wavelet neural network (WNN), general regression neural network (GRNN), and group method of data handling (GMDH). Among these, GRNN is preferable because it learns in a single pass and produces reasonably good results. Despite its single-pass learning, GRNN cannot handle big datasets, because its pattern layer must store all the cluster centers obtained by clustering all the samples. This paper therefore proposes a hybrid architecture, GRNN++, which makes GRNN scalable to big data by invoking K-means||, a parallel and distributed version of K-means++, in the pattern layer of GRNN. The whole architecture is implemented on Apache Spark, a distributed parallel computing framework, with HDFS. The performance of GRNN++ was measured on a 613 MB gas sensor dataset under a ten-fold cross-validation setup, where it produced a very low mean squared error (MSE). The primary motivation of this article is to present a distributed and parallel version of the traditional GRNN.
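To make the role of the cluster centers in the pattern layer concrete, the following is a minimal single-machine sketch of a clustered GRNN in the general form described by Specht. It is not the authors' Spark implementation: the function names (`fit_pattern_layer`, `predict`), the smoothing parameter `sigma`, and the toy data are assumptions introduced for illustration, and the cluster centers are taken as given rather than computed with K-means||.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_pattern_layer(X, y, centers):
    """Single pass over the data: assign each sample to its nearest center
    and accumulate, per center, the sum of targets and the sample count.
    The centers are assumed to come from a prior clustering step."""
    sums = [0.0] * len(centers)
    counts = [0] * len(centers)
    for x, t in zip(X, y):
        j = min(range(len(centers)), key=lambda k: sq_dist(x, centers[k]))
        sums[j] += t
        counts[j] += 1
    return sums, counts


def predict(x, centers, sums, counts, sigma):
    """Clustered GRNN output: a Gaussian-kernel-weighted ratio of the
    per-cluster target sums to the per-cluster counts."""
    num = 0.0
    den = 0.0
    for c, a, b in zip(centers, sums, counts):
        w = math.exp(-sq_dist(x, c) / (2.0 * sigma ** 2))
        num += a * w
        den += b * w
    # Guard against underflow when x is far from every center.
    return num / den if den > 0.0 else 0.0


# Hypothetical toy data: two well-separated clusters in one dimension.
X = [[0.0], [0.2], [1.0], [1.2]]
y = [1.0, 1.0, 3.0, 3.0]
centers = [[0.1], [1.1]]  # assumed output of a clustering step

sums, counts = fit_pattern_layer(X, y, centers)
print(predict([0.6], centers, sums, counts, sigma=0.5))
```

Because the pattern layer holds only one node per cluster center instead of one per training sample, its size depends on the number of clusters rather than on the dataset size, which is the property the abstract relies on for scalability.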

Author information

Corresponding author

Correspondence to Vadlamani Ravi.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Kamaruddin, S., Ravi, V. (2020). GRNN++: A Parallel and Distributed Version of GRNN Under Apache Spark for Big Data Regression. In: Sharma, N., Chakrabarti, A., Balas, V. (eds) Data Management, Analytics and Innovation. Advances in Intelligent Systems and Computing, vol 1042. Springer, Singapore. https://doi.org/10.1007/978-981-32-9949-8_16

  • DOI: https://doi.org/10.1007/978-981-32-9949-8_16

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-32-9948-1

  • Online ISBN: 978-981-32-9949-8

  • eBook Packages: Engineering, Engineering (R0)
