Encyclopedia of Big Data Technologies

2019 Edition
Editors: Sherif Sakr, Albert Y. Zomaya

Cloud Computing for Big Data Analysis

  • Fabrizio Marozzo
  • Loris Belcastro
Reference work entry
DOI: https://doi.org/10.1007/978-3-319-77525-8_136


Cloud computing is a model that enables convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction (Mell and Grance 2011).


In the last decade, the ability to produce and gather data has increased exponentially. For example, huge amounts of digital data are generated by and collected from several sources, such as sensors, web applications, and services. Moreover, thanks to the growth of social networks (e.g., Facebook, Twitter, Pinterest, Instagram, and Foursquare) and the widespread diffusion of mobile phones, millions of people share information about their interests and activities every day. The amount of data generated, the speed at which it is produced, and its heterogeneity in terms of format represent a challenge to the current storage, process, and analysis...



  1. Agapito G, Cannataro M, Guzzi PH, Marozzo F, Talia D, Trunfio P (2013) Cloud4snp: distributed analysis of SNP microarray data on the cloud. In: Proceedings of the ACM conference on bioinformatics, computational biology and biomedical informatics 2013 (ACM BCB 2013). ACM, Washington, DC, p 468. ISBN:978-1-4503-2434-2
  2. Altomare A, Cesario E, Comito C, Marozzo F, Talia D (2017) Trajectory pattern mining for urban computing in the cloud. Trans Parallel Distrib Syst 28(2):586–599. ISSN:1045-9219
  3. Belcastro L, Marozzo F, Talia D, Trunfio P (2016) Using scalable data mining for predicting flight delays. ACM Trans Intell Syst Technol. ACM, New York, 8(1): 5:1–5:20
  4. Belcastro L, Marozzo F, Talia D, Trunfio P (2016, to appear) Using scalable data mining for predicting flight delays. ACM Trans Intell Syst Technol (ACM TIST)
  5. Belcastro L, Marozzo F, Talia D, Trunfio P (2017) A parallel library for social media analytics. In: The 2017 international conference on high performance computing & simulation (HPCS 2017), Genoa, pp 683–690. ISBN:978-1-5386-3250-5
  6. Dean J, Ghemawat S (2004) MapReduce: simplified data processing on large clusters. In: Proceedings of the 6th conference on symposium on operating systems design & implementation, OSDI’04, Berkeley, vol 6, pp 10–10
  7. Gu Y, Grossman RL (2009) Sector and sphere: the design and implementation of a high-performance data cloud. Philos Trans R Soc Lond A Math Phys Eng Sci 367(1897):2429–2445
  8. Hiden H, Woodman S, Watson P, Cala J (2013) Developing cloud applications using the e-science central platform. Philos Trans R Soc A 371(1983):20120085
  9. Kang U, Chau DH, Faloutsos C (2012) Pegasus: mining billion-scale graphs in the cloud. In: 2012 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 5341–5344. https://doi.org/10.1109/ICASSP.2012.6289127
  10. Langmead B, Hansen KD, Leek JT (2010) Cloud-scale RNA-sequencing differential expression analysis with Myrna. Genome Biol 11(8):R83
  11. Li A, Yang X, Kandula S, Zhang M (2010) Cloudcmp: comparing public cloud providers. In: Proceedings of the 10th ACM SIGCOMM conference on Internet measurement. ACM, pp 1–14
  12. Lordan F, Tejedor E, Ejarque J, Rafanell R, Álvarez J, Marozzo F, Lezzi D, Sirvent R, Talia D, Badia R (2014) Servicess: an interoperable programming framework for the cloud. J Grid Comput 12(1):67–91
  13. Marozzo F, Talia D, Trunfio P (2015) Js4cloud: script-based workflow programming for scalable data analysis on cloud platforms. Concurr Comput Pract Exp 27(17):5214–5237
  14. Marozzo F, Talia D, Trunfio P (2016) A workflow management system for scalable data mining on clouds. IEEE Trans Serv Comput, vol PP(99), p 1
  15. Martin A, Brito A, Fetzer C (2016) Real-time social network graph analysis using streammine3g. In: Proceedings of the 10th ACM international conference on distributed and event-based systems, DEBS’16. ACM, New York, pp 322–329
  16. Mavroidis I, Papaefstathiou I, Lavagno L, Nikolopoulos DS, Koch D, Goodacre J, Sourdis I, Papaefstathiou V, Coppola M, Palomino M (2016) Ecoscale: reconfigurable computing and runtime system for future exascale systems. In: 2016 design, automation test in Europe conference exhibition (DATE), pp 696–701
  17. Mell PM, Grance T (2011) SP 800-145. The NIST definition of cloud computing. Technical report, National Institute of Standards & Technology, Gaithersburg
  18. Richardson L, Ruby S (2008) RESTful web services. O’Reilly Media, Inc., Newton
  19. Talia D, Trunfio P, Marozzo F (2015) Data analysis in the cloud. Elsevier. ISBN:978-0-12-802881-0
  20. Tan KL, Cai Q, Ooi BC, Wong WF, Yao C, Zhang H (2015) In-memory databases: challenges and opportunities from software and hardware perspectives. SIGMOD Rec 44(2):35–40
  21. Wang C, Li X, Chen P, Wang A, Zhou X, Yu H (2015) Heterogeneous cloud framework for big data genome sequencing. IEEE/ACM Trans Comput Biol Bioinform 12(1):166–178. https://doi.org/10.1109/TCBB.2014.2351800
  22. You L, Motta G, Sacco D, Ma T (2014) Social data analysis framework in cloud and mobility analyzer for smarter cities. In: 2014 IEEE international conference on service operations and logistics, and informatics (SOLI), pp 96–101

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. DIMES, University of Calabria, Rende, Italy