Spatio-temporal Data Mining

  • Tao Cheng
  • James Haworth
  • Berk Anbaroglu
  • Garavig Tanaksaranond
  • Jiaqiu Wang
Living reference work entry


As the volume, variety, and veracity of spatio-temporal datasets increase, traditional statistical methods for dealing with such data are becoming overwhelmed. Nevertheless, spatio-temporal data are rich sources of information and knowledge, waiting to be discovered. The field of spatio-temporal data mining emerged out of a need to create effective and efficient techniques in order to turn big spatio-temporal data into meaningful information and knowledge. This chapter reviews the state of the art in spatio-temporal data mining research and applications, from conventional statistical methods to machine learning approaches in the big data era, with emphasis placed on three key areas: prediction, clustering/classification, and visualization.


Keywords: Spatio-temporal · Data mining · Prediction · Clustering · Visualization · Artificial intelligence · Machine learning

1 Introduction

Data mining is the process of discovering patterns in large datasets involving methods, techniques, and algorithms at the intersection of statistics, artificial intelligence, machine learning, and database systems. It should be noted that the term data mining is a misnomer, since the objective is to extract patterns and knowledge, and not mining of data itself. The term is also a buzzword, frequently applied to any form of large-scale information processing. The actual data mining task involves the semiautomatic or automatic analysis of large quantities of data to extract previously unknown patterns such as clusters of data, anomalies, and dependences in data.

With automatic sensor networks, the Internet of things, and crowdsourcing now being used extensively to monitor a diverse range of phenomena, the volume, variety, and veracity of spatio-temporal data being routinely collected have increased dramatically (Li et al. 2016). Examples include daily temperature series at weather stations, crime counts in census tracts, geotagged social media, and traffic flows on urban roads. Spatio-temporal data mining (STDM) is the subfield of data mining that focuses on discovering patterns in large spatio-temporal (geolocated and time-stamped) datasets, with the overall objective of extracting information and transforming it into knowledge to enable decision making. The major tasks of STDM include spatio-temporal prediction, clustering, classification, and visualization. The methods designed to carry out these tasks must account for spatio-temporal autocorrelation and heterogeneity, which sets them apart from traditional data mining methods.

Early methods for identifying patterns in spatio-temporal data include statistical models from the fields of time series analysis, spatial analysis, and econometrics. These approaches were typically concerned with teasing scarce information from homogeneous datasets, with an emphasis on statistical significance and explanatory power. Such approaches are appropriate in specific cases where the model assumptions hold but are not generally applicable when the size and diversity of spatio-temporal data become large. Increasingly, researchers and practitioners turned toward less conventional techniques that are better suited to deal with the heterogeneous, nonlinear, and multi-scale properties of large-scale, streaming spatio-temporal datasets. For instance, methods from artificial intelligence, machine learning, and, increasingly, deep learning are now being successfully applied to STDM tasks. Recent surveys have reviewed the literature on STDM and applications (Atluri et al. 2018).

This chapter begins in Sect. 2 with an introduction to spatio-temporal autocorrelation, the presence of which necessitates specialized STDM approaches. The remainder is organized around three main tasks of STDM: Sect. 3 is devoted to spatio-temporal modeling and prediction, by either statistical (parametric) approaches or machine learning (nonparametric) approaches. Sect. 4 reviews spatio-temporal clustering, which is followed by an introduction to spatio-temporal visualization in Sect. 5. The final section summarizes the future directions of research in STDM.

2 Spatio-temporal Autocorrelation

An observation from nature is that near things tend to be more similar than distant things both in space and time. For instance, the weather tomorrow is likely to be more similar to today’s weather than the weather a week ago or a month ago and so on. Similarly, the weather 1 mile away is likely to be more similar than the weather 10 miles away or 100 miles away. These phenomena are referred to, respectively, as temporal and spatial dependence. Such dependence can be present both within and between datasets (cross- or co-dependence). The presence of dependence in spatial and temporal data violates the independence assumption of classic statistical models such as ordinary least squares and necessitates the use of specialized spatio-temporal analysis techniques. However, its presence also enables future events to be predicted based on knowledge of past events. The first step in an STDM workflow should be to identify whether STDM methods are appropriate to the task at hand. This involves examining the dataset for evidence of spatio-temporal patterns. In this section, we review the identification of spatio-temporal dependence and association from two viewpoints: statistical tests of spatial dependence and identification of association rules.

2.1 Statistical Approaches

Testing for dependence is typically accomplished using autocorrelation analysis. Autocorrelation is the cross-correlation of a signal with itself and can be measured in temporal data using the temporal autocorrelation function or in spatial data using an index such as the familiar Moran’s I coefficient (Anselin 1988). Both of these are based on Pearson correlation. The spatio-temporal autocorrelation function, also Pearson based, measures autocorrelation in spatio-temporal data at given spatial and temporal lags (Pfeifer and Deutsch 1981). Pearson-based measures are appropriate for data represented as graphs using a spatial weight matrix, examples of which include administrative polygons and road networks. Indices have also been devised for point attribute data, including space-time (semi)variograms (Heuvelink et al. 2015).
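As an illustration of a Pearson-based index computed over a spatial weight matrix, the following minimal sketch evaluates global Moran's I with NumPy. The function name `morans_i`, the four-unit line layout, and the binary contiguity weights are assumptions made for the example and are not taken from the cited works.

```python
import numpy as np

def morans_i(x, w):
    """Global Moran's I for values x (n,) and spatial weight matrix w (n, n)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()                    # deviations from the global mean
    num = (w * np.outer(z, z)).sum()    # weighted cross-products of neighbours
    den = (z ** 2).sum()                # total sum of squared deviations
    return len(x) / w.sum() * num / den

# Hypothetical layout: four units on a line, each linked to adjacent units
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])

i_trend = morans_i([1.0, 2.0, 3.0, 4.0], w)     # smooth trend: positive I
i_alt = morans_i([1.0, -1.0, 1.0, -1.0], w)     # alternating values: negative I
print(i_trend, i_alt)
```

A smooth trend yields a positive value (neighbours are alike), while an alternating pattern yields a negative one (neighbours are dissimilar).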

The aforementioned measures are global, implying a degree of fixity in the level of autocorrelation across space and time such that it can be described by a single parameter. However, this is often unrealistic. Time series may exhibit nonstationarity, which means that their mean and variance are not constant in time. The analogue of this in spatial data is referred to as heterogeneity. Heterogeneity has two distinct aspects: structural instability, as expressed by changing functional forms or varying parameters, and heteroskedasticity, which leads to error terms with nonconstant variance (Anselin 1988). Ignoring nonstationarity or heterogeneity can have serious consequences including biased parameter estimates, misleading significance levels, and poor predictive power. Anselin (1988) provides some methods for testing for heterogeneity in spatial data. Additionally, a number of local indicators of spatial association have been devised (see the chapter “Exploratory Spatial Data Analysis” by Symanzik in this handbook). These include a local variant of Moran’s I and Getis and Ord’s \( G_i \) and \( G_i^{\ast} \) statistics, which measure the extent to which high and low values are clustered together. A simple method that can be applied to spatio-temporal data is the cross-correlation function, which measures the correlation between two series at given temporal lags (Cheng et al. 2012; Figs. 1 and 2).
Fig. 1

(a) cross-correlation function (CCF) and (b) coefficient of determination (CoD) between unit journey times of three pairs of road links in central London in the AM peak period (7–10 am) (Cheng et al. 2012)

Fig. 2

Average cross-correlation function (CCF) between links and their first-order neighbours at temporal lag zero in (a) the AM peak; (b) interpeak; and (c) PM peak (Cheng et al. 2012)
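The cross-correlation function discussed above can be sketched in a few lines. This simplified version computes the Pearson correlation on the overlapping parts of the two series at each lag, which differs slightly from estimators that use full-series means and variances. The series names `upstream` and `downstream` and the synthetic two-step delay are assumptions for illustration, not data from Cheng et al. (2012).

```python
import numpy as np

def cross_correlation(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag], for lag >= 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if lag > 0:
        x, y = x[:-lag], y[lag:]        # align x[t] with y[t + lag]
    return np.corrcoef(x, y)[0, 1]

# Hypothetical pair of series: the second repeats the first two steps later
rng = np.random.default_rng(0)
upstream = rng.normal(size=200)
downstream = np.roll(upstream, 2)       # downstream[t] = upstream[t - 2]

ccf = [cross_correlation(upstream, downstream, k) for k in range(5)]
best_lag = int(np.argmax(ccf))          # lag with the strongest correlation
print(best_lag)
```

The lag at which the function peaks gives an indication of how long it takes for conditions at one location to be reflected at the other.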

Autocorrelation measures deal with correlation in attributes observed at spatio-temporal locations. Often, we are interested in the spatio-temporal arrangement of data points. For example, we may be interested in whether the spatio-temporal distribution of crime locations is random. In such cases, rather than testing for the presence of autocorrelation, we can test against an assumption of complete spatio-temporal randomness or can employ indicators of spatio-temporal clustering such as the spatio-temporal Ripley’s K function. See González et al. (2016) for a review. See also the chapter “Spatio-temporal Point Pattern Processes and Models” by Lomax in this handbook.

2.2 Data Mining Approaches

Association (or co-location) rule mining is an approach that has its roots in the data mining community. It involves the inference of the presence of spatio-temporal features in the neighborhood of other spatio-temporal features. A spatio-temporal co-location rule implies a strong association between locations A and B; if the attributes of A take some specific value at a point in time, then with a certain probability, at the same point in time, the attributes of B will take some specific value. A related STDM task is mixed drove co-occurrence pattern mining. Mixed drove co-occurrence patterns are subsets of two or more different object types whose instances are often located close to one another in space and time. A review of these approaches can be found in Shekhar et al. (2015). The drawback of these methods is that only contemporaneous associations are considered so they do not account for the evolution of a spatial process over time.

A logical extension to association mining is to analyze spatio-temporal sequential patterns. This involves finding sequences of events (an ordered list of item sets) that occur frequently in spatio-temporal datasets. Sequential pattern mining algorithms were first introduced to extract patterns from customer transaction databases. A spatio-temporal sequential pattern means that if, at some point in time and space, the attributes in A take some specific value, then with a certain probability, at some later point in time, the attributes at B will take some specific value. Sequential pattern mining implicitly incorporates the notion of spatio-temporal dependence: the events at one location at one time can have some causal influence on the events at another location at a subsequent time. A concept similar to sequential patterns is that of cascading spatio-temporal patterns, which are ordered subsets of events that are located close together and occur in a cascading sequence (Shekhar et al. 2015).
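The "certain probability" attached to a co-location or sequential rule is commonly expressed as a confidence, i.e., the share of occasions on which the consequent follows the antecedent. The sketch below is a simplified illustration rather than any published mining algorithm. The function name `rule_confidence` and the boolean event series for locations A and B are hypothetical.

```python
import numpy as np

def rule_confidence(a_events, b_events, lag=0):
    """Estimate P(B occurs at t + lag | A occurs at t) from two event series."""
    a = np.asarray(a_events, dtype=bool)
    b = np.asarray(b_events, dtype=bool)
    if lag > 0:
        a, b = a[:-lag], b[lag:]        # pair A at time t with B at time t + lag
    return (a & b).sum() / a.sum()

# Hypothetical indicator series: does the event of interest occur at A / at B?
a_events = [1, 0, 1, 0, 1, 0]
b_events = [0, 1, 0, 1, 0, 0]

contemporaneous = rule_confidence(a_events, b_events, lag=0)   # co-location rule
sequential = rule_confidence(a_events, b_events, lag=1)        # sequential rule
print(contemporaneous, sequential)
```

With lag 0 the rule is contemporaneous, as in co-location mining. A positive lag captures the delayed association that sequential pattern mining looks for.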

2.3 Summary

The methods described in this section help identify spatio-temporal dependence and structure in datasets. They can be used to inform whether STDM is an appropriate approach and the type of model that is required. If global spatio-temporal autocorrelation is present, then a global parametric model specification may be sufficient. However, if nonstationary or heterogeneous behavior is observed, then traditional parametric approaches may not be sufficient, and appropriate STDM methods must be employed. These approaches are discussed in the remainder of this chapter.

3 Space-Time Forecasting and Prediction

Space-time models must account for the problem of spatio-temporal autocorrelation. Uptake of spatio-temporal models has traditionally been limited by the scarcity of large-scale spatio-temporal datasets. This is a situation that has been reversed over recent decades, and we are now inundated with data and require methods to deal with them quickly and effectively. The methods that are currently applied to space-time data can be broadly divided into two categories: statistical (parametric) methods and machine learning (nonparametric) methods. These are described in turn in the following subsections.

3.1 Statistical (Parametric) Models

The state of the art in statistical modeling of spatio-temporal processes represents the outcome of several decades of cross-pollination of research between the fields of time series analysis, spatial statistics, and econometrics. Some of the methods commonly used in the literature include space-time autoregressive integrated moving average (STARIMA) models (see the chapter “Spatial Dynamics and Space-Time Data Analysis” by Rey in this handbook) and variants, spatial panel data models, geographically and temporally weighted regression, eigenvector spatial filtering (see the chapter “Spatial Autocorrelation and Spatial Filtering” by Griffith and Chun in this handbook), and spatio-temporal Bayesian hierarchical models.

3.1.1 Space-Time Autoregressive Integrated Moving Average

Space-time autoregressive integrated moving average (STARIMA) represents a family of models that extend the ARIMA time series model to space-time data (Pfeifer and Deutsch 1980). STARIMA explicitly takes into account the spatial structure in the data through the use of a spatial weight matrix (for the definition of spatial weight matrix, see the chapter “Cross-Section Spatial Regression Models” by Le Gallo in this handbook). The general STARIMA model expresses an observation of a spatial process as a weighted linear combination of past observations and errors lagged in both space and time. A fitted STARIMA model is usually described as a STARIMA (p,d,q) model, where p indicates the autoregressive order, d the order of differencing, and q the moving average order of the model. The application of STARIMA models has been fairly limited in the literature, with examples existing in traffic prediction and temperature forecasting.

Some important special cases of the general STARIMA model should be noted: when d = 0, the model reduces to a STARMA model; furthermore, a STARMA model with q = 0 is a STAR model, and one with p = 0 is a STMA model. STARIMA models are useful tools for modeling space-time processes that are stationary (or weakly stationary) in space and time, i.e., whose mean and variance are constant across space and time. Although the STARIMA model family accounts for spatio-temporal autocorrelation, it has not yet been adequately adapted to deal with spatial heterogeneity as parameter estimates are global. Recent approaches have addressed this by allowing local and time-varying weight matrices (Cheng et al. 2014).
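To make the STAR special case (d = 0, q = 0) concrete, the following minimal sketch uses temporal order 1 and spatial order 1, so that each observation is a linear combination of its own previous value and the weighted previous values of its neighbours. The function names, the four-unit ring, the row-standardized weight matrix, and the parameter values are all assumptions for illustration. Noise is omitted so that the least-squares fit can be checked exactly.

```python
import numpy as np

def star_forecast(z_prev, w, phi0, phi1):
    """One-step STAR forecast: phi0 * z[t-1] + phi1 * (W @ z[t-1])."""
    return phi0 * z_prev + phi1 * (w @ z_prev)

def fit_star(z, w):
    """Least-squares estimate of (phi0, phi1) from z with shape (T, n)."""
    own = z[:-1].ravel()                 # own value at t-1
    nbr = (z[:-1] @ w.T).ravel()         # spatially lagged value at t-1
    design = np.column_stack([own, nbr])
    coef, *_ = np.linalg.lstsq(design, z[1:].ravel(), rcond=None)
    return coef

# Hypothetical ring of four units; each row of W sums to one
w = np.array([[0.0, 0.5, 0.0, 0.5],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.5, 0.0, 0.5, 0.0]])

rng = np.random.default_rng(1)
z = np.empty((20, 4))
z[0] = rng.normal(size=4)
for t in range(1, 20):                   # noise-free simulation
    z[t] = star_forecast(z[t - 1], w, phi0=0.5, phi1=0.3)

phi_hat = fit_star(z, w)
print(phi_hat)
```

Because a single pair of parameters is estimated for all locations, the sketch also shows why such a specification is global.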

3.1.2 Spatial Panel Data Models

Panel data is a term used in the econometrics literature for multidimensional data. A panel contains observations on multiple phenomena (cross-sections) over multiple time periods. Spatial panels typically refer to data containing time series observations of spatial units such as zip codes, regions, cities, states, and countries. Panel data are generally more informative than purely cross-sectional or time series data and contain more variation and less collinearity among variables. The use of panel data increases the efficiency of model estimation and allows for specifying more complicated hypotheses (Elhorst 2010). Two types of data models using spatial panels may be distinguished. The first type relates to static models. These models just pool time series cross-section data but more often control for fixed or random spatial and/or time period-specific effects. The second type relates to dynamic spatial panel data models, i.e., spatial panel data models with mixed dynamics in both space and time that have received increasing attention in the spatial econometrics literature in recent years. The chapter “Spatial Panel Models” by Elhorst in this handbook provides a survey of the existing literature on spatial panel models, considering both static and dynamic models and discussing specification and estimation issues.

3.1.3 Space-Time Geographically Weighted Regression

Recently, there has been a great deal of interest in extending geographically weighted regression (GWR; see the chapter “Geographically Weighted Regression” by Wheeler in this handbook) to the temporal dimension. In their geographically and temporally weighted regression model, Huang et al. (2010) incorporate both the spatial and temporal dimensions into the weight matrix to account for spatial and temporal nonstationarity. The technique was applied to a case study of residential housing sales in the city of Calgary from 2002 to 2004 and found to outperform GWR as well as temporally weighted regression. A spatio-temporal kernel function is proposed to replace the spatio-temporal weight matrix as a further extension of GWR (Fotheringham et al. 2015), and a further mixed geographically and temporally weighted regression can explore both global and local spatio-temporal heterogeneity.
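The idea of folding space and time into a single weight can be sketched as follows. The combined squared distance (a weighted sum of squared spatial and temporal separations) and the Gaussian kernel are one commonly used form and are assumed here, not quoted from Huang et al. (2010). The parameter names `lam`, `mu`, and `bandwidth` and the synthetic data are hypothetical.

```python
import numpy as np

def gtwr_weights(coords, times, target_xy, target_t, lam, mu, bandwidth):
    """Gaussian weights from a combined space-time distance (assumed form)."""
    d2_space = ((coords - target_xy) ** 2).sum(axis=1)
    d2_time = (times - target_t) ** 2
    d2 = lam * d2_space + mu * d2_time       # lam and mu balance space against time
    return np.exp(-d2 / bandwidth ** 2)

def local_coefficients(X, y, weights):
    """Weighted least squares: solve (X' W X) beta = X' W y."""
    xtw = X.T * weights                      # equivalent to X.T @ diag(weights)
    return np.linalg.solve(xtw @ X, xtw @ y)

# Hypothetical data: 50 observations with an exactly linear relationship
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(50, 2))
times = rng.uniform(0, 5, size=50)
x = rng.normal(size=50)
X = np.column_stack([np.ones(50), x])
y = 2.0 + 3.0 * x

# Local regression centred on the first observation
w = gtwr_weights(coords, times, coords[0], times[0], lam=1.0, mu=1.0, bandwidth=3.0)
beta = local_coefficients(X, y, w)
print(beta)
```

Repeating the local fit at every observation gives one coefficient vector per location and time, which is what allows the model to express spatial and temporal nonstationarity.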

3.1.4 Space-Time Geostatistics

Space-time geostatistics is concerned with the statistical modeling of geostatistical data that vary in space and time. It extends geostatistical concepts such as covariance structures and semivariograms to the space-time domain (Heuvelink et al. 2015). The aim is to build a process that mimics some patterns of the observed spatio-temporal variability.

The first step usually involves separating the deterministic component m(u, t) of space-time coordinates u and t. Following this, a covariance structure is fitted to the residuals. The simplest approach is to separate space and time and consider the space-time covariance to be either a sum (zonal anisotropy model) or product (separable model) of separate spatial and temporal covariance functions. Although simple to implement, these models have the disadvantage that they do not consider space-time interaction. They assume a fixed temporal pattern across locations and a fixed spatial pattern across time. Additionally, it is not straightforward to separate the component structures from the experimental covariances.
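In symbols, the two simple constructions can be written as follows, where \( h \) denotes a spatial lag, \( \tau \) a temporal lag, and \( C_S \) and \( C_T \) the purely spatial and purely temporal covariance functions. This notation is assumed here for illustration and does not appear in the source.

```latex
\[
C(h,\tau) = C_S(h) + C_T(\tau) \quad \text{(sum)},
\qquad
C(h,\tau) = C_S(h)\, C_T(\tau) \quad \text{(product)}
\]
```

In the product form, \( C(h,\tau)/C(h,0) = C_T(\tau)/C_T(0) \) for every spatial lag \( h \), which is one way of seeing that the temporal pattern is the same at all locations.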

The other approach is to model a joint space-time covariance structure. This approach is generally accepted to be more appropriate. Once an appropriate space-time covariance structure has been defined, one can use standard Kriging techniques for interpolation and prediction; space-time geostatistical techniques are best applied to stationary space-time processes. Highly nonstationary spatio-temporal relationships require a very complicated space-time covariance structure to be modeled for accurate prediction to be possible. Despite being spatio-temporal in nature, the main function of space-time geostatistical models is space-time interpolation, and they encounter problems in forecasting scenarios where extrapolation is required. Recently, geostatistical models have been implemented in a Bayesian framework, which enables exact inference and quantification of uncertainty. They have also been adapted to be applied to network datasets, with case studies in transport.

3.2 Machine Learning (Nonparametric) Approaches

In parallel to the development of statistical space-time models, there was a multidisciplinary explosion of interest in nonparametric machine learning methods. Machine learning is a subfield of soft computing that evolved from the study of pattern recognition and computational learning theory in artificial intelligence. Machine learning may be defined as an application of artificial intelligence with the primary objective of allowing computing systems to learn automatically without human interaction or assistance and adjust actions accordingly. Learning may be supervised, semi-supervised, or unsupervised. The field of machine learning is huge, and a comprehensive review of methods such as self-organizing maps, principal components analysis, classification and regression trees, and random forests is beyond the scope of this chapter. The book Machine Learning for Spatial Environmental Data by Kanevski et al. (2009) provides a good introduction to the field with special emphasis on the use of spatial data.

3.2.1 Artificial Neural Networks

Artificial neural networks, vaguely inspired by information processing and communication patterns in biological nervous systems, are a specific set of algorithms that have revolutionized machine learning. Artificial neural networks may be viewed as general function approximators, which is why they can be applied to many tasks such as classification and regression. In a neural network, data pass through interconnected layers of nodes; each layer extracts characteristics and information before passing the results on to nodes in subsequent layers. Neural networks and deep learning differ only in the number of network layers. A typical neural network may have two to three layers; deep learning networks might have dozens or hundreds.

There are supervised and unsupervised models using neural networks. The most generally known is the feedforward neural network whose architecture is a connected and directed graph of neurons (processing units) with no cycles that is trained using an algorithm called backpropagation. The reader is encouraged to review the key concepts of neural network modeling covered in Fischer (2015).

Artificial neural networks have been widely applied in spatial and temporal analysis. For example, Kanevski et al. (2009) applied various types of neural network architectures to spatial and environmental modeling problems, including radial basis function neural networks, general regression neural networks, probabilistic neural networks, and neural network residual Kriging models, and obtained excellent results. The strength of neural networks is that they learn from empirical data and can be applied to almost any machine learning problem that involves learning a complex mapping from the input space to the output space. This makes them particularly effective in accounting for dependency structures present in space-time data that cannot be explicitly modeled.

Artificial neural networks have been applied to spatio-temporal pattern recognition problems since the 1980s, and many of the algorithms are now regaining popularity in the guise of deep learning, the latest advance in machine learning. Recently, deep learning has been used for predictive learning of spatio-temporal data. Deep learning networks can automatically extract latent spatial and/or temporal features to model the underlying spatio-temporal dependence in the data. Spatio-temporal deep learning models combine neural architectures with a large number of hidden layers for application to spatio-temporal datasets. Convolutional neural networks can be combined with recurrent neural networks or long short-term memory networks for grid-based spatio-temporal predictive learning.

3.2.2 Kernel Methods

In machine learning, kernel methods represent a class of methods and algorithms for pattern analysis, i.e., finding types of relationships (clusters, principal components, classifications, correlations). The methods use kernel functions that allow them to operate in a high-dimensional, implicit feature space without calculating the coordinates of the data in that space; instead, they compute the inner products between the images of all pairs of data points in the feature space. This operation is termed the kernel trick (or kernel substitution) and turns a linear model into a nonlinear model by replacing the inner products of its features (predictors) with a kernel function. This technique, when applied to principal components analysis, for example, leads to a nonlinear variant of principal components analysis. Additional information on kernel methods can be found in Chapter 6 of Bishop (2006).
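A small worked example may help to show what operating in an implicit feature space means. For two-dimensional inputs, the homogeneous polynomial kernel of degree 2 equals the inner product of an explicit three-dimensional feature map, so the kernel yields the same number without ever constructing the features. The function names and input vectors are chosen for illustration only.

```python
import numpy as np

def phi(x):
    """Explicit feature map whose inner product equals (x . y) ** 2 in 2-D."""
    return np.array([x[0] ** 2, np.sqrt(2.0) * x[0] * x[1], x[1] ** 2])

def poly_kernel(x, y):
    """Degree-2 polynomial kernel, evaluated in the original input space."""
    return float(np.dot(x, y)) ** 2

a = np.array([1.0, 2.0])
b = np.array([3.0, 4.0])

explicit = float(phi(a) @ phi(b))       # inner product in the feature space
implicit = poly_kernel(a, b)            # same value via the kernel
print(explicit, implicit)
```

For kernels such as the Gaussian kernel the corresponding feature space is infinite-dimensional, so only the implicit route is available.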

The most prominent member of the class of kernel methods is the support vector machine. A support vector machine constructs a hyperplane or sets of hyperplanes for solving problems in classification, regression, and outlier detection. An important characteristic of support vector machines is that the determination of model parameters corresponds to a convex optimization problem so that any local solution found is also a global optimum (Bishop 2006, 325).

Kernel methods in general and support vector machines in particular have been applied to spatial time series in a number of application areas including traffic flow and travel time forecasting (see, e.g., Haworth et al. 2014) and modeling exposure to ambient black carbon (Abu Awad et al. 2017).

3.3 Summary

In this section, the nonlinear, nonstationary properties of spatio-temporal data and their implications for space-time models were outlined. The question is which model should one choose for a given STDM task? The answer to this depends on the data. In the literature, space-time analysis has historically been applied to data with low spatial and/or temporal resolution. In the tradition of spatial analysis, the use of such data is to elicit causal relationships between variables that can give some valuable insights into the underlying processes. In this case, the use of parametric statistical models may be preferable because of their explanatory power and interpretability.

However, these days, more and more data sources are becoming available in (near) real time at high spatial and temporal resolutions. Extracting meaningful relationships from such data is a task that is secondary to forecasting, and machine learning approaches, with their greater flexibility, will play an ever-increasing role. Generally, machine learning methods have a wider field of application than traditional geostatistics due to their ability to deal with multidimensional nonlinear data. They are also well suited to dealing with large databases and long periods of observation. In particular, support vector machines appear to be favorable because they avoid the curse of dimensionality faced by other methods. Currently, deep learning is also gaining momentum in STDM. One of the future research directions in this area lies in improving the interpretability of the structure and output of machine learning algorithms. Another is to use a hybrid framework that combines statistical and machine learning approaches (Cheng et al. 2011).

4 Space-Time Clustering

Clustering is an unsupervised learning task that involves grouping unlabeled objects that share similar characteristics. In general, the goal is to maximize the difference between clusters while simultaneously maximizing the similarity within clusters. There are five main types of clustering algorithm: partitioning, model-based, hierarchical, density-based, and grid-based. In addition to these, statistical tests can also be used to detect clusters. These are discussed in the following subsections.

4.1 Partitioning and Model-Based Clustering

Perhaps the most well-known clustering algorithm, k-means, is a partitioning-based algorithm. This type works well for attribute clustering, with many applications in geo-demographics, but is less suitable for identifying interesting spatio-temporal structures such as hotspots because each point must be assigned to a cluster. Model-based approaches, such as Gaussian mixture models, are similar to partitioning approaches but have the benefit of assigning probabilities of class membership.

4.2 Hierarchical Clustering

Hierarchical clustering is a tree-based method that is related to decision tree approaches. Like partitioning and model-based clustering, hierarchical clustering is effective in clustering attributes. An advantage over the former two methods is that the tree can be cut at different levels to provide different numbers of clusters. However, it also has the drawback that every point must be assigned to a cluster, which limits its application to finding structure in the arrangement of spatio-temporal data.

4.3 Density- and Grid-Based Clustering

Density-based clustering methods are particularly suited to spatio-temporal data because they can identify clusters of arbitrary shape and duration without the need to specify the number of clusters sought. They work by connecting points that are located close to one another as clusters; disconnected points are considered to be noise or outliers. ST-DBSCAN (spatio-temporal density-based spatial clustering of applications with noise) is a notable example of density-based clustering (Birant and Kut 2007). OPTICS (ordering points to identify the clustering structure) is a related approach that enables clusters of varying density to be found, which allows the multi-scale issue of spatio-temporal data to be tackled. Although most applications use point datasets, density-based methods can also be applied to other spatio-temporal data types such as trajectories.
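The density-based idea can be sketched with a simplified DBSCAN in which two points count as neighbours only if they are close in both space and time. This is not the full ST-DBSCAN algorithm of Birant and Kut (2007), which includes refinements that are not reproduced here. The function name, the thresholds `eps_space` and `eps_time`, and the sample points are assumptions for illustration.

```python
import numpy as np

def st_dbscan(xy, t, eps_space, eps_time, min_pts):
    """Label points with cluster ids (0, 1, ...) or -1 for noise."""
    xy = np.asarray(xy, dtype=float)
    t = np.asarray(t, dtype=float)
    n = len(t)
    # Neighbours must be within eps_space AND within eps_time of each other
    d_space = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    d_time = np.abs(t[:, None] - t[None, :])
    neighbours = [np.flatnonzero((d_space[i] <= eps_space) & (d_time[i] <= eps_time))
                  for i in range(n)]
    labels = np.full(n, -1)
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        if len(neighbours[i]) < min_pts:
            continue                     # not a core point; noise unless reached later
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster      # core or border point joins the cluster
            if not visited[j]:
                visited[j] = True
                if len(neighbours[j]) >= min_pts:
                    queue.extend(neighbours[j])   # expand only from core points
        cluster += 1
    return labels

# Hypothetical events: two bursts at the same place but far apart in time,
# plus one isolated event
xy = [[0, 0], [0.1, 0], [0, 0.1], [0, 0], [0.1, 0], [0, 0.1], [5, 5]]
t = [0, 1, 2, 100, 101, 102, 1]
labels = st_dbscan(xy, t, eps_space=0.5, eps_time=5, min_pts=3)
print(labels.tolist())
```

A purely spatial DBSCAN would merge the two bursts into one cluster. The temporal threshold keeps them apart, and the isolated event is left as noise without being forced into a cluster.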

Grid-based clustering is related to density-based clustering but involves first partitioning the spatio-temporal region into a multidimensional grid in which locally dense regions are sought. An example of this is ST-GRID (Wang et al. 2006). Grid-based approaches are computationally more efficient than density-based approaches but are dependent on the definition of the grid.

4.4 Statistical Tests for Clustering

There are various statistical tests for clustering in spatio-temporal data, such as the spatio-temporal Ripley’s K function, but most of these are global indices that do not identify the spatio-temporal location of clusters. A notable exception is space-time scan statistics, which is a statistical test for clustering that was originally devised to detect disease outbreaks (Neill 2009). The method has since been applied to crime hotspot detection (Cheng and Adepeju 2014), among others. The goal of space-time scan statistics is to automatically detect spatio-temporal regions that are anomalous, unexpected, or otherwise interesting. Spatial and temporal proximities are explored by scanning the study region via overlapping cylindrical or rectangular space-time regions of varying sizes.

The observed value of a space-time region is compared with its expected value based upon historical data. Statistical significance is evaluated using Monte Carlo simulation. If the space-time region is found to be significant at this stage, then a significant cluster is found (Neill 2009). Space-time scan statistics has the significant drawback that the entire study region has to be scanned at all space-time region sizes, which is computationally intensive and limits scalability. The assumption that a cluster is a regular geometrical shape is also not realistic (e.g., disease might have spread via the river, thus affecting the people along the river’s course) and remains a limitation of the method. This problem could be tackled by generating irregularly shaped space-time regions or applying the method to other spatial configurations such as networks.
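The scan-and-test procedure described above can be sketched as follows. To keep the example short, space is reduced to a line of regions, so each "cylinder" is a run of adjacent regions over the most recent time steps. The score is an expectation-based Poisson likelihood ratio, which is assumed here because the text does not fix a particular statistic. All function names, the baseline of five events per cell, and the injected cluster are hypothetical.

```python
import numpy as np

def poisson_score(c, b):
    """Log-likelihood ratio for observed count c against expected count b."""
    return c * np.log(c / b) + b - c if c > b else 0.0

def scan(counts, baseline, max_width, max_duration):
    """Best-scoring window over (T, n) arrays: (score, (start, stop, duration))."""
    T, n = counts.shape
    best = (0.0, None)
    for dur in range(1, max_duration + 1):
        c_t = counts[T - dur:].sum(axis=0)       # observed, last `dur` steps
        b_t = baseline[T - dur:].sum(axis=0)     # expected, last `dur` steps
        for start in range(n):
            for stop in range(start + 1, min(start + max_width, n) + 1):
                s = poisson_score(c_t[start:stop].sum(), b_t[start:stop].sum())
                if s > best[0]:
                    best = (s, (start, stop, dur))
    return best

def monte_carlo_p(counts, baseline, max_width, max_duration, n_rep=99, seed=0):
    """Compare the observed maximum score with maxima from simulated null data."""
    rng = np.random.default_rng(seed)
    observed, window = scan(counts, baseline, max_width, max_duration)
    exceed = sum(
        scan(rng.poisson(baseline), baseline, max_width, max_duration)[0] >= observed
        for _ in range(n_rep))
    return observed, window, (exceed + 1) / (n_rep + 1)

# Hypothetical data: 10 time steps, 6 regions, 5 expected events per cell
baseline = np.full((10, 6), 5.0)
counts = np.full((10, 6), 5)
counts[-2:, 2:4] = 20                            # injected cluster

score, window, p_value = monte_carlo_p(counts, baseline, max_width=3, max_duration=3)
print(window, p_value)
```

Even in this reduced setting the nested loops over window position, width, and duration are repeated for every Monte Carlo replicate, which illustrates the computational burden noted above.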

4.5 Summary

This section introduced methods for spatio-temporal cluster detection. Depending on the clustering task, different methods are more appropriate. Partitioning, model-based, and hierarchical approaches are most suited to attribute data, while statistical, density-based, and grid-based methods can identify the spatio-temporal location of clusters. Once clusters are found, it is the task of the practitioner to assign meaning to them.

5 Spatio-temporal Visualization

Mining interesting patterns, rules, and structures from spatio-temporal data is only part of the task of STDM. The results are not useful if they are not easily understood. For instance, finding a spatio-temporal cluster in a patient register dataset is not useful in itself. On the other hand, confirming this spatio-temporal cluster as a disease outbreak and visualizing it using a platform that epidemiologists and medical professionals can understand is very useful indeed. As a result, space-time visualization has emerged as another important facet of STDM. However, visualizing spatio-temporal data is an inherently difficult task because representing time on a map is challenging. Geographic visualization, often termed geovisualization, enhances traditional cartography by providing dynamic and interactive maps. Many new techniques for visualizing time on maps have been proposed. These techniques can be divided into three broad types: (static) 2D and 3D maps and animations. Here we also cover the process of using visualization as an integral part of the STDM process, which is termed geovisual analytics.

5.1 Two-Dimensional Maps

There are various ways to represent time on static two-dimensional (2D) maps, either as a single static map or as multiple snapshots. Since all time steps are shown at the same time, the map reader does not need to hold events in memory, which prevents critical information from being missed. However, 2D maps can only present a few time steps at a time due to the limitations of the available map media (computer screen, paper, etc.). This means that 2D mapping approaches must make some sacrifices in terms of the information they display. Intelligent use of visual variables (colors, sizes, texture) can add additional information to 2D maps. The classic example of this technique is Minard’s map of 1869, which shows Napoleon’s doomed campaign to Moscow in 1812. Time was displayed as an axis on the map (parallel to the axis of the geographical position), and the number of remaining soldiers was shown by the thickness of the lines.

Monmonier (1990) presented a range of methods for visualizing time on maps, including dance maps, chess maps, and change maps. Dance maps are used to visualize the movement of spatial objects by drawing movement paths or pinpoints of objects on a 2D plane. A chess map is a series of maps laid out in a chess board manner, with each map representing a single time slice. Interpretation of changes is left to the user. A change map shows changes or differences against a reference time period, as an absolute value or percentage. Change maps are effective for representing quantitative attributes as users do not have to calculate the amount of change by themselves. An alternative is to place charts on maps to show attribute change (Andrienko et al. 2007). This approach has the drawback that the map display can become overcrowded.
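The values behind a change map can be computed with a few lines of Python. The sketch below is a minimal illustration under assumed inputs: the zone identifiers and counts are hypothetical, and a zero reference value is mapped to `None` because a percentage change is undefined in that case.

```python
def change_map(reference, current, percentage=False):
    """Return the change per zone relative to a reference time period.

    reference, current: dicts mapping zone id -> attribute value.
    percentage: if True, return the percentage change instead of the absolute one.
    """
    changes = {}
    for zone, ref_value in reference.items():
        diff = current[zone] - ref_value
        if percentage:
            # Percentage change is undefined when the reference value is zero.
            changes[zone] = None if ref_value == 0 else 100.0 * diff / ref_value
        else:
            changes[zone] = diff
    return changes

# Hypothetical counts per zone for a reference period and a later period.
reference = {"A": 40, "B": 10, "C": 0}
current = {"A": 50, "B": 5, "C": 3}

absolute = change_map(reference, current)
relative = change_map(reference, current, percentage=True)
```

The resulting values would then be mapped with a diverging color scheme, so that the reader sees increases and decreases directly rather than having to compare two snapshots.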

Some approaches, known as cartograms, transform or distort geographic space in order to enhance information communication, with the London Underground map being a famous example. The spatial treemap is an example of a cartogram approach that divides geographic space into rectangular units within which temporal attribute information can be visualized (Wood and Dykes 2008). This technique allows the visualization of a large number of time points, but the user must relate the visualization to the underlying geography, which can be difficult. Cartograms can also be used to distort geographic space to represent times. For example, travel time on transportation networks can be represented as distances in map units. Alternatively, isochrones can be drawn on a map to represent lines of equal travel time from a given origin. Isochrones have the advantage that they do not distort the underlying geography.
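Isochrones on a transport network can be derived from shortest travel times. The following Python sketch assumes a small hypothetical network with travel times in minutes on each link; it uses Dijkstra's algorithm to compute travel times from an origin and then groups nodes into bands of equal maximum travel time. The node names, link times, and band limits are illustrative only.

```python
import heapq

def travel_times(graph, origin):
    """Dijkstra's algorithm; graph maps node -> list of (neighbor, minutes)."""
    best = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        t, node = heapq.heappop(queue)
        if t > best.get(node, float("inf")):
            continue  # Stale queue entry; a shorter route was already found.
        for neighbor, minutes in graph.get(node, []):
            candidate = t + minutes
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return best

def isochrone_bands(times, breaks):
    """Assign each node to the first band whose upper limit covers its travel time."""
    bands = {limit: [] for limit in breaks}
    for node, t in sorted(times.items()):
        for limit in breaks:
            if t <= limit:
                bands[limit].append(node)
                break
    return bands

# Hypothetical road network with link travel times in minutes.
graph = {
    "O": [("A", 4), ("B", 7)],
    "A": [("O", 4), ("B", 2), ("C", 9)],
    "B": [("O", 7), ("A", 2), ("D", 5)],
    "C": [("A", 9), ("D", 3)],
    "D": [("B", 5), ("C", 3)],
}

times = travel_times(graph, "O")
bands = isochrone_bands(times, breaks=[5, 10, 15])
```

Drawing the boundary of each band on the map yields the isochrones, and the node coordinates remain untouched, which is why the underlying geography is not distorted.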

5.2 Three-Dimensional Visualization

Three-dimensional (3D) visualization is a natural choice for spatio-temporal data because time can easily be represented as the third dimension, providing a simple way to show multiple times in a single view. The pioneering example of this is the space-time cube (Hägerstrand 1970). A space-time cube consists of two dimensions of geographic location on a horizontal plane and a time dimension on the vertical axis. Space-time cubes can be used to visualize trajectories of objects in 3D, or “space-time paths.” The field of time geography has emerged from the study of trajectories in space-time cubes (Miller 2005).

The space-time cube has two main limitations. Firstly, the 3D display makes it difficult to relate space-time paths to geographic locations and times. Secondly, the space-time cube has difficulty in displaying large amounts of data. However, interactive techniques and data aggregation can be used to reduce clutter when displaying multiple objects. Alternatively, groups of objects can be represented as densities and visualized as isosurfaces. An isosurface is a three-dimensional analogue of an isoline that represents a surface of constant value within a volume of space. As well as density, isosurfaces can represent interpolated attributes such as travel times on a road network (Cheng et al. 2013). Isosurfaces work well in Euclidean space but are not as appropriate for network datasets. The 3D wall map is a 2D road map with an additional time dimension to display change. Each layer represents the situation at a single time, with color representing the attribute value (Cheng et al. 2013; Fig. 3).
Fig. 3 Wall map of travel delay (min/km) on outbound roads during the morning peak on 5, 12, 19, and 26 October 2009 (Cheng et al. 2013)
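The data aggregation mentioned above as a remedy for clutter in the space-time cube can be sketched as a simple binning of events into voxels. The Python example below is a minimal illustration: the event coordinates, the 100 m cell size, and the 30 min time step are hypothetical choices, not values from the cited studies.

```python
from collections import Counter

def space_time_cube(events, cell_size, time_step):
    """Count events per voxel of a regular space-time grid.

    events: iterable of (x, y, t) tuples in consistent units.
    Returns a Counter keyed by (column, row, time slice).
    """
    cube = Counter()
    for x, y, t in events:
        voxel = (int(x // cell_size), int(y // cell_size), int(t // time_step))
        cube[voxel] += 1
    return cube

# Hypothetical events: x and y in metres, t in minutes since the start of the study.
events = [
    (120, 80, 5),
    (130, 95, 20),
    (480, 510, 50),
    (20, 30, 70),
    (490, 505, 58),
]

cube = space_time_cube(events, cell_size=100, time_step=30)
```

The voxel counts can be rendered directly as colored cubes, or smoothed into a density volume from which isosurfaces are extracted.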

5.3 Animated Maps

With improvements in computing power and Internet technology in recent decades, animated maps have become a very active area of research and are now distributed widely on the Internet. The basic principle is to show a sequence of 2D maps, emphasizing dynamic variables such as duration, rate of change, order of change, frequency, display time, and synchronization (MacEachren et al. 2004). Weather maps and traffic maps are two of the many examples. Animated maps can use motion to emphasize key attributes by using, for example, blinking symbols to attract attention to a certain location on the map. The drawback of animated maps is that they do not incorporate interactivity, which limits their use for analysis purposes.

5.4 Visual Analytics: The Current Visualization Trend

Visual analytics is an outgrowth of the fields of scientific and information visualization. It refers to “the science of analytical reasoning facilitated by interactive visual interfaces” (Thomas and Cook 2005). In the context of STDM, visual analytics is referred to as geovisual analytics. Geovisual analytics is an iterative process that integrates visualization at various stages of the STDM process, including information gathering, data pre-processing, knowledge representation, and decision making. Normally, unknown data are visualized in order to give a basic view, and then users use their perception (intuition) to gain further insights from the images produced by visualization. The insights generated are then transformed into knowledge. After users have gained certain knowledge, they can generate hypotheses that are used to carry out further analysis using available data analysis and exploration techniques. The results of the analytical process are then visualized for presentation and to gain further knowledge.

In recent years, governments and businesses have made more of their data available through APIs (application programming interfaces). This has opened data sources to the public and enabled the process of visual analytics to be applied to streaming datasets in the form of dashboards and interactives. Compared with static 2D and 3D maps, dashboards are more effective in displaying and revealing spatio-temporal information due to their customization ability. Different aspects of the data can be displayed in different parts of the screen and dynamically linked to user input. Dashboards and interactives can be used primarily for visualization or facilitating the STDM workflow by producing interpretable outputs from massive datasets.

Several toolkits can be used to build interactive dashboards. For example, Shiny is an R package that allows users to create interactive web-based dashboards using only R scripts. Visualization modules such as interactive maps, plots, tables, and textboxes can be integrated and organized in one framework. Code-free applications and platforms are also available to visualize spatio-temporal information, such as Tableau and ArcGIS operational dashboards.

5.5 Summary

Spatio-temporal visualization is an integral part of the STDM process. Advances in Internet technology and computing facilities have made visualization of large streaming datasets more accessible than ever. However, despite significant progress, visualizing large volumes of data in real time and making effective use of the third dimension remain challenges. In particular, the results of complex analyses should also be visualized in a geovisual analytics workflow in order to add value to spatio-temporal data.

6 Conclusion

Since the concept of knowledge discovery from databases was proposed in 1988, tremendous progress has been made in data mining and spatial data mining (Miller and Han 2009; Li et al. 2016). STDM is only possible based upon the progress in those areas, along with GIS and geocomputation (see the chapter “Geographic Information Science” by Goodchild and Longley in this handbook). This chapter introduces the fundamentals of STDM, which consists of space-time prediction, clustering, and visualization. However, the field of STDM is far from mature, and further research is needed in the following areas:
  • New methods are needed for mining crowd-sourced data. This data is often extremely noisy, biased, and nonstationary. Examples include geotagged social media and trajectory data obtained from smartphones. This area is relevant to the recent development of citizen science, volunteered geographic information, and the Internet of things in particular.

  • Methods are required to extract meaningful patterns from individual sources and integrate them in complex network frameworks. In particular, the linkage between different networks such as transport, infrastructure, social networks, and the economy is a significant challenge. Using network frameworks, the interactions within and between systems can be considered in mining spatio-temporal patterns. This aspect is relevant to complexity theory and network dynamics in particular.

  • STDM techniques are required for emergence and tipping point detection. For example, how do we find emergent patterns and tipping points in the performance of the economy or the spread of a disease in a way that generates actionable insights? It is important to find outliers, but it is more important to find the critical points before the system breaks down so that mitigating action can be taken to avoid the worst scenarios, such as traffic congestion and epidemic transmission. This aspect is relevant to the latest developments in system optimization and in reinforcement learning to improve the operation of systems.

  • Another challenge of STDM is how to calibrate, explain, and validate the knowledge extracted. A good example of this is the calibration of spatial (or spatio-temporal) autocorrelation. Higher-order spatial autocorrelation models have been developed, but pitfalls have also been identified (LeSage and Pace 2011). Nonstationarity and autocorrelation are fundamental to our observation (or our empirical test) of reality; it is hard to prove whether higher-order autocorrelation propagates from the first order to the second and then to the third, or from the first to the third directly, which makes the explanation unconvincing. Furthermore, validation is difficult: so far, Monte Carlo simulation is the main tool for validation, and it is itself based upon assumed statistical distributions. This makes machine learning more promising in future STDM.

  • Technically, grid computing and cloud computing allow data mining to be distributed across multiple computing resources. However, scaling algorithms to larger datasets will always be a challenge for data mining, given that data volumes are growing far more quickly than the performance of data processors is improving. Online machine learning could be a potential solution to this.
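
To make the idea of online learning in the last point concrete, the following Python sketch updates a linear model one observation at a time with stochastic gradient descent, so that the data stream never has to be held in memory. It is a generic illustration and not a method from this chapter; the class name, learning rate, and synthetic noise-free stream are assumptions.

```python
import random

class OnlineLinearRegressor:
    """Linear model updated one observation at a time by stochastic gradient descent."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, x, y):
        """Take one gradient step on the squared error of a single observation."""
        error = self.predict(x) - y
        self.weights = [
            w - self.learning_rate * error * xi for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error
        return error

# Synthetic stream (hypothetical): y = 2*x1 - x2 + 0.5 with no noise.
rng = random.Random(0)
model = OnlineLinearRegressor(n_features=2)
for _ in range(5000):
    x = [rng.random(), rng.random()]
    y = 2.0 * x[0] - 1.0 * x[1] + 0.5
    model.update(x, y)  # The observation can be discarded after this step.
```

Each update costs time proportional to the number of features, regardless of how many observations have already been processed, which is the property that makes online methods attractive for growing data volumes.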

Please note that this chapter focuses mainly on spatial data in the form of points, networks, and grids, and not on remote sensing data, which is another broad area of research.

References

  1. Abu Awad Y, Koutrakis P, Coull BA, Schwartz J (2017) A spatio-temporal prediction model based on support vector machine regression: Ambient Black Carbon in three New England States. Environ Res 159:427–434
  2. Andrienko G, Andrienko N, Jankowski P, Keim D, Kraak M-J, MacEachren A, Wrobel S (2007) Geovisual analytics for spatial decision support: setting the research agenda. Int J Geogr Inf Sci 21(8):839–857
  3. Anselin L (1988) Spatial econometrics: methods and models. Springer, Dordrecht
  4. Atluri G, Karpatne A, Kumar V (2018) Spatio-temporal data mining: a survey of problems and methods. ACM Comput Surv 51(4):83:1–83:41
  5. Birant D, Kut A (2007) ST-DBSCAN: an algorithm for clustering spatial–temporal data. Data Knowl Eng 60(1):208–221
  6. Bishop C (2006) Pattern recognition and machine learning. Springer, New York
  7. Cheng T, Adepeju M (2014) Modifiable temporal unit problem (MTUP) and its effect on space-time cluster detection. PLoS ONE 9(6):e100465
  8. Cheng T, Wang J, Li X (2011) A hybrid framework for space–time modeling of environmental data. Geogr Anal 43(2):188–210
  9. Cheng T, Haworth J, Wang J (2012) Spatio-temporal autocorrelation of road network data. J Geogr Syst 14(4):389–413
  10. Cheng T, Tanaksaranond G, Brunsdon C, Haworth J (2013) Exploratory visualisation of congestion evolutions on urban transport networks. Transp Res C Emerg Technol 36:296–306
  11. Cheng T, Wang J, Haworth J, Heydecker B, Chow A (2014) A dynamic spatial weight matrix and localized space–time autoregressive integrated moving average for network modeling. Geogr Anal 46(1):75–97
  12. Elhorst JP (2010) Spatial panel data models. In: Fischer MM, Getis A (eds) Handbook of applied spatial analysis. Software, tools, methods and applications. Springer, Berlin/Heidelberg, pp 172–192
  13. Fischer MM (2015) Neural networks. A class of flexible non-linear models for regression and classification. In: Karlsson C, Andersson M, Norman T (eds) Handbook of research methods and applications in economic geography. Elgar, Cheltenham, pp 172–192
  14. Fotheringham AS, Crespo R, Yao J (2015) Geographical and temporal weighted regression (GTWR). Geogr Anal 47(4):431–452
  15. González JA, Rodríguez-Cortés FJ, Cronie O, Mateu J (2016) Spatio-temporal point process statistics: a review. Spat Stat 18(Part B):505–544
  16. Hägerstrand T (1970) What about people in regional science? Pap Reg Sci 24(1):7–24
  17. Haworth J, Shawe-Taylor J, Cheng T, Wang J (2014) Local online kernel ridge regression for forecasting of urban travel times. Transp Res Part C Emerg Technol 46:151–178
  18. Heuvelink GBM, Pebesma E, Gräler B (2015) Space-time geostatistics. In: Shekhar S, Xiong H, Zhou X (eds) Encyclopedia of GIS. Springer, Cham, pp 1–7
  19. Huang B, Wu B, Barry M (2010) Geographically and temporally weighted regression for modeling spatio-temporal variation in house prices. Int J Geogr Inf Sci 24(3):383–401
  20. Kanevski M, Timonin V, Pozdnukhov A (2009) Machine learning for spatial environmental data: theory, applications, and software. EPFL Press, Lausanne
  21. LeSage JP, Pace RK (2011) Pitfalls in higher order model extensions of basic spatial regression methodology. Rev Reg Stud 41(1):13–26
  22. Li S, Dragicevic S, Castro FA, Sester M, Winter S, Coltekin A, Pettit C, Jiang B, Haworth J, Stein A, Cheng T (2016) Geospatial big data handling theory and methods: a review and research challenges. ISPRS J Photogramm Remote Sens 115:119–133
  23. MacEachren AM, Gahegan M, Pike W, Brewer I, Cai G, Lengerich E, Hardisty F (2004) Geovisualization for knowledge construction and decision support. IEEE Comput Graph Appl 24(1):13–17
  24. Miller HJ (2005) A measurement theory for time geography. Geogr Anal 37(1):17–45
  25. Miller HJ, Han J (2009) Geographic data mining and knowledge discovery, second edition. CRC Press, Boca Raton
  26. Monmonier M (1990) Strategies for the visualization of geographic time-series data. Cartographica 27(1):30–45
  27. Neill DB (2009) Expectation-based scan statistics for monitoring spatial time series data. Int J Forecast 25(3):498–517
  28. Pfeifer PE, Deutsch SJ (1980) A three-stage iterative procedure for space-time modelling. Technometrics 22(1):35–47
  29. Pfeifer PE, Deutsch SJ (1981) Variance of the sample space-time autocorrelation function. J R Stat Soc Ser B Methodol 43(1):28–33
  30. Shekhar S, Jiang Z, Ali RY et al (2015) Spatiotemporal data mining: a computational perspective. ISPRS Int J Geo-Inf 4(4):2306–2338
  31. Thomas JJ, Cook KA (2005) Illuminating the path: the research and development agenda for visual analytics. National Visualization and Analytics Center, Richland
  32. Wang M, Wang A, Li A (2006) Mining spatial-temporal clusters from geo-databases. In: Li X, Zaïane OR, Li Z (eds) Advanced data mining and applications. Springer, Berlin/Heidelberg, pp 263–270
  33. Wood J, Dykes J (2008) Spatially ordered treemaps. IEEE Trans Vis Comput Graph 14(6):1348–1355

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  • Tao Cheng (1), email author
  • James Haworth (1)
  • Berk Anbaroglu (2)
  • Garavig Tanaksaranond (3)
  • Jiaqiu Wang (4)

  1. SpaceTimeLab, Department of Civil, Environmental and Geomatic Engineering, University College London, UK
  2. Department of Geomatics Engineering, Hacettepe University, Ankara, Turkey
  3. Department of Survey Engineering, Chulalongkorn University, Bangkok, Thailand
  4. National Institute for Cardiovascular Outcomes Research, Barts NHS Trust, London, UK
