
A Novel Approach for Gigantic Data Examination Utilizing the Apache Spark and Significant Learning

  • Conference paper
Inventive Computation Technologies (ICICIT 2019)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 98)

Abstract

With the growing prevalence of Big Data, many tools and advances have emerged in this area. Systems such as Apache Hadoop and Apache Spark have become widely used over the past decades and are now hugely popular, particularly in industry. It is becoming increasingly clear that efficient big data analysis is essential to solving artificial intelligence problems. To that end, a multi-algorithm library, MLlib, has been implemented within the Apache Spark framework. Although this library supports several machine learning algorithms, there is still room to use the Spark setup more effectively for highly time-intensive and computationally expensive methods such as deep learning. We propose an effective framework that combines the distributed processing capabilities of Apache Spark with the advanced machine learning architecture of a deep multilayer perceptron (MLP), using the popular concept of cascade learning. We conduct an empirical evaluation of our framework on two well-known real-world datasets. The results are encouraging and support our proposed framework, indicating that it improves on conventional big data analysis methods that use either Spark or deep learning as individual components.
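
The cascade idea named in the abstract can be illustrated with a minimal sketch: a first-stage model is trained on the raw features, and its predicted probability is appended as an extra feature for a second-stage model. This is not the authors' implementation. The paper's framework runs on Apache Spark and uses an MLP; here, as a loudly stated assumption, both stages are replaced by small logistic regressions in plain Python so the example stays self-contained. The function names (`train_logistic`, `cascade_fit`, `cascade_predict`) and the toy data are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(rows, labels, epochs=500, lr=1.0):
    """Fit a logistic regression with batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            score = sum(w * v for w, v in zip(weights, x)) + bias
            err = sigmoid(score) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict_proba(model, x):
    weights, bias = model
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def cascade_fit(rows, labels):
    # Stage 1 (stand-in for the Spark-side model): train on raw features.
    stage1 = train_logistic(rows, labels)
    # Cascade step: append the stage-1 probability to every row.
    augmented = [x + [predict_proba(stage1, x)] for x in rows]
    # Stage 2 (stand-in for the MLP): train on the augmented features.
    stage2 = train_logistic(augmented, labels)
    return stage1, stage2


def cascade_predict(models, x):
    stage1, stage2 = models
    return predict_proba(stage2, x + [predict_proba(stage1, x)])


# Hypothetical toy data: the label is 1 when the two features sum above 1.
random.seed(0)
rows = [[random.random(), random.random()] for _ in range(200)]
labels = [1 if a + b > 1.0 else 0 for a, b in rows]

models = cascade_fit(rows, labels)
accuracy = sum(
    (cascade_predict(models, x) >= 0.5) == bool(y) for x, y in zip(rows, labels)
) / len(rows)
print(f"training accuracy of the toy cascade: {accuracy:.2f}")
```

In a Spark setting the first stage would typically be a distributed job and the second a neural network; the sketch only shows how one stage's output is fed to the next.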

Author information

Corresponding author

Correspondence to Anilkumar V. Brahmane.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Brahmane, A.V., Chaitanya Krishna, B. (2020). A Novel Approach for Gigantic Data Examination Utilizing the Apache Spark and Significant Learning. In: Smys, S., Bestak, R., Rocha, Á. (eds) Inventive Computation Technologies. ICICIT 2019. Lecture Notes in Networks and Systems, vol 98. Springer, Cham. https://doi.org/10.1007/978-3-030-33846-6_95
