Modelling and Designing Spatial and Temporal Big Data for Analytics

  • Sinan Keskin
  • Adnan Yazıcı
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 935)

Abstract

This paper introduces a new data model and architecture that support spatial and temporal data analytics for meteorological big data applications. The architecture builds on recent advances in spatial data warehousing (SDW) and in spatial and temporal big data analytics. Measured meteorological data are stored in a NoSQL database and analyzed in the Hadoop big data environment. SDW provides a structured approach for manipulating, analyzing, and visualizing huge volumes of data. The main focus of our study is therefore to design a Spatial OLAP-based system that visualizes the results of big data analytics on daily measured meteorological data, using the characteristic features of Spatial Online Analytical Processing (SOLAP), SDW, and the big data environment (Apache Hadoop). We use real meteorological data collected daily from various stations distributed across different regions. This enables spatial and temporal data analytics through spatial data-mining tasks, including spatial classification and prediction, spatial association rule mining, and spatial cluster analysis. Furthermore, a fuzzy logic extension for data analytics is integrated into the big data environment.

Keywords

Meteorological big data analytics, DWH, SOLAP, Hadoop

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
  2. School of Science and Technology, Nazarbayev University, Astana, Republic of Kazakhstan
