Abstract
Smart data analysis has become a challenging task in today’s environment, where disparate data sets are generated across the globe in enormous volume. There is therefore a clear need for parallel and distributed frameworks, along with appropriate algorithms, that can handle these challenges. Various machine learning algorithms can be deployed effectively in this environment because they work with minimal manual intervention. This chapter first presents the issues faced in storing and processing big data and the tools, technologies and algorithms available to deal with them, along with a case study describing an application in healthcare analytics. It then discusses a few distributed algorithms that are widely used in the data mining domain. Finally, it focuses on various machine learning algorithms and their roles in big data analytics.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Desarkar, A., Das, A. (2017). Big-Data Analytics, Machine Learning Algorithms and Scalable/Parallel/Distributed Algorithms. In: Bhatt, C., Dey, N., Ashour, A. (eds) Internet of Things and Big Data Technologies for Next Generation Healthcare. Studies in Big Data, vol 23. Springer, Cham. https://doi.org/10.1007/978-3-319-49736-5_8
DOI: https://doi.org/10.1007/978-3-319-49736-5_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49735-8
Online ISBN: 978-3-319-49736-5
eBook Packages: Engineering, Engineering (R0)