Big Data Processing Using Hadoop and Spark: The Case of Meteorology Data

  • Eslam Hussein
  • Ronewa Sadiki
  • Yahlieel Jafta
  • Muhammad Mujahid Sungay
  • Olasupo Ajayi
  • Antoine Bagula (email author)
Conference paper
Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 311)


Meteorology is a branch of science that can be leveraged to gain useful insight into many phenomena that have a significant impact on our daily lives, such as precipitation, cyclones, thunderstorms and climate change. It is a highly data-driven field that involves large datasets of images captured by both radar and satellite, and it therefore requires efficient technologies for storing, processing and mining these datasets to find hidden patterns. Different big data tools and ecosystems, most of them integrating Hadoop and Spark, have been designed to address big data issues. However, despite the importance of the field, only a few works have applied these tools and ecosystems to meteorological problems. This paper proposes and evaluates the performance of a precipitation data processing system that builds upon the Cloudera ecosystem to analyse large datasets of images as a classification problem. The system can be used as a replacement for machine learning techniques when the classification problem consists of finding zones of high, moderate and low precipitation in satellite images.
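As a rough illustration of the classification problem described above, the following minimal Python sketch labels pixels as low, moderate or high precipitation and aggregates the counts in a map/reduce style. It is not the authors' system: the thresholds, the 0-255 intensity scale, the tile layout and all function names are hypothetical assumptions chosen only for this example.

```python
from collections import Counter
from functools import reduce

# Hypothetical thresholds on an assumed 0-255 pixel intensity scale.
# These values are illustrative only and do not come from the paper.
MODERATE_MIN = 85
HIGH_MIN = 170


def classify_pixel(intensity):
    """Map one pixel intensity to a precipitation class label."""
    if intensity >= HIGH_MIN:
        return "high"
    if intensity >= MODERATE_MIN:
        return "moderate"
    return "low"


def map_tile(tile):
    """Map step: count the class labels of every pixel in one image tile."""
    return Counter(classify_pixel(p) for row in tile for p in row)


def reduce_counts(left, right):
    """Reduce step: merge two partial class counts."""
    return left + right


def classify_image(tiles):
    """Apply the map step to each tile, then fold the partial counts together."""
    return reduce(reduce_counts, map(map_tile, tiles), Counter())


if __name__ == "__main__":
    # Two tiny made-up tiles standing in for pieces of a satellite image.
    tiles = [
        [[0, 90], [200, 255]],
        [[10, 20], [100, 180]],
    ]
    print(dict(classify_image(tiles)))
```

In a Hadoop or Spark deployment the map step would typically run on image partitions distributed across the cluster, with the per-class counts merged in the reduce step; the sketch runs both steps locally to keep the example self-contained.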


Hadoop, MapReduce, Spark, Hive, Meteorology, Big data



Copyright information

© ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2020

Authors and Affiliations

  • Eslam Hussein (2)
  • Ronewa Sadiki (2)
  • Yahlieel Jafta (2)
  • Muhammad Mujahid Sungay (2)
  • Olasupo Ajayi (1, 2)
  • Antoine Bagula (1, 2), email author
  1. ISAT Laboratory, University of the Western Cape, Cape Town, South Africa
  2. Department of Computer Science, University of the Western Cape, Cape Town, South Africa
