1 Introduction

The amount of environmental information that can be used to assess air quality in urban areas has increased substantially in recent years. Sensor networks are becoming denser, and regional air quality forecasts are freely available online. However, for a citizen interested in urban air quality, the measurements of a single sensor, even the closest one, may require interpretation and expertise. Pollutant concentrations in urban areas show great geographical variability, to the extent that nearby buildings, roads and the measurement height can significantly affect the concentrations measured by the sensors. It therefore makes sense to fuse all the available information with methods that can take these aspects into account and assess the air quality in the whole city. Given that weather forecasts and regional air quality forecasts are available, it is also possible to forecast the air quality in the near future based on the current state, future meteorology and the modelled long-range transport of pollutants. The fusion of measured air quality data is especially demanding, since each data point is also subject to meteorological conditions, human activities and background concentrations. Furthermore, the quality of each individual sensor must be considered, and if modelled air quality data are also included in the fusion, the limitations and imprecision of the modelled data must be taken into account.

2 Methods

The FMI-ENFUSER is a hybrid model that combines statistical air quality modelling, Gaussian dispersion modelling techniques and information fusion algorithms (Johansson et al. 2015a, b). The main components of the modelling system are illustrated in Fig. 34.1.

Fig. 34.1

The core components of the FMI-ENFUSER modelling system, which utilizes the latest sensor observations, meteorological data and regional air quality forecasts to produce high-resolution AQ heatmaps for the current and future situation in urban locations. GIS-datasets and emission inventories together with archived concentration time series are used for the calibration of the model (Most of the datasets are open-source and globally available, such as OpenStreetMap and NASA EOSDIS data. In Finland the GIS-datasets used also include CORINE land cover and population density mapping (2010, 250 × 250 m). For the case study in Langfang a high-resolution land-use GIS-dataset (5 × 5 m) was obtained by courtesy of the local authorities.)

In short, the model is a ‘tabula rasa’ area source dispersion model without knowledge of local emission sources and their temporal/seasonal variation. With a sufficient amount of calibration material, including GIS-datasets, sensor measurement time series and weather data, the model detects the key relationships between GIS-properties and observed pollutant concentrations through multivariable regression. Such a key relationship can also be a more complicated sub-model that utilizes several GIS-information layers as well as meteorological variables; these sub-models aim to describe physically realistic emission sources rather than the more abstract GIS-properties that are commonly used in Land Use Regression (LUR). For example, individual households can be a significant source of PM2.5 through heating, and this particular emission source can be dealt with by (i) detecting buildings from land-use data, (ii) assessing the physical area of the buildings and (iii) taking into account the observed ambient temperature. In the Helsinki Metropolitan Area the model distinguishes close to 100,000 small households that can contribute to PM2.5 through this relationship.
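The household heating example can be made concrete with a minimal sketch. It is not the ENFUSER implementation: the function names, the base temperature of 17 °C, the heating-degree form of the proxy, the traffic predictor and the synthetic data are all assumptions made for illustration. Each building's footprint area is scaled by a temperature-dependent heating demand, and the resulting proxy enters an ordinary least-squares regression alongside a second, hypothetical predictor.

```python
def heating_proxy(building_area_m2, ambient_temp_c, base_temp_c=17.0):
    """Hypothetical emission proxy: footprint area times heating demand.

    Heating demand is assumed proportional to how far the ambient
    temperature falls below base_temp_c (an illustrative threshold).
    """
    heating_degrees = max(0.0, base_temp_c - ambient_temp_c)
    return building_area_m2 * heating_degrees


def fit_least_squares(rows, targets):
    """Solve the normal equations (X^T X) b = X^T y for the coefficients b."""
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(a[k][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for k in range(n):
            if k != col:
                factor = a[k][col] / a[col][col]
                a[k] = [x - factor * p for x, p in zip(a[k], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


# Synthetic calibration data, invented for this sketch:
# (building area in m2, ambient temperature in C, traffic index)
sites = [(100, -5, 10), (200, 0, 40), (50, 10, 5), (150, 20, 60), (300, 5, 20)]
rows = [[1.0, heating_proxy(area, temp), traffic] for area, temp, traffic in sites]

# Concentrations generated from known coefficients so the fit can be checked.
true_coeffs = [5.0, 0.002, 0.1]
targets = [sum(c * x for c, x in zip(true_coeffs, row)) for row in rows]

background, heating_coeff, traffic_coeff = fit_least_squares(rows, targets)
```

Because the synthetic concentrations are generated without noise, the fit recovers the three coefficients used to create them. With real measurements the residuals would not vanish, and a numerically more robust solver than the normal equations would normally be preferred.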

After the calibration the model can assign proper emission rates to local area sources as a function of GIS-properties and time, after which the expected pollutant concentration for any given location and meteorological condition can be estimated. Finally, given a set of recent sensor observations, the model can compare the observed concentrations with the expected modelled concentrations and adjust to the perceived conditions by applying data fusion techniques. Data fusion algorithms make it possible to fuse the information from multiple sensors and supporting air quality models of variable quality, also taking into account the temporal and spatial separation of data points. In theory, the more data are available, the better the model output should become. To address the long-range transport of pollutants and enhance the forecasting capabilities of the model, the globally operational FMI-SILAM regional air quality model is also included in the pool of information (Sofiev et al. 2015).
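The adjustment step can be illustrated with a simple weighting scheme. The sketch below is an assumption, not the algorithm used in FMI-ENFUSER: each sensor contributes the residual between its observation and the model's expectation at the sensor, weighted by an assumed error variance and by exponential decay in distance and age. The class name, the decay form and all default scales are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Observation:
    observed: float    # measured concentration at the sensor (ug/m3)
    modelled: float    # model expectation at the sensor location and time
    distance_m: float  # spatial separation from the target location
    age_h: float       # temporal separation from the target time
    error_std: float   # assumed standard deviation of the sensor error


def fuse(model_at_target, observations, model_error_std=10.0,
         length_scale_m=2000.0, time_scale_h=3.0):
    """Correct the modelled value at a target with weighted sensor residuals."""
    # The model acts as a prior that pulls the correction toward zero.
    total_weight = 1.0 / model_error_std ** 2
    weighted_residual = 0.0
    for obs in observations:
        # Weight decays with spatial and temporal separation (assumed form).
        decay = (math.exp(-obs.distance_m / length_scale_m)
                 * math.exp(-obs.age_h / time_scale_h))
        weight = decay / obs.error_std ** 2
        weighted_residual += weight * (obs.observed - obs.modelled)
        total_weight += weight
    return model_at_target + weighted_residual / total_weight


# A co-located, current sensor that reads 20 ug/m3 above the model.
nearby = Observation(observed=50.0, modelled=30.0, distance_m=0.0,
                     age_h=0.0, error_std=10.0)
fused_value = fuse(30.0, [nearby])
```

With no observations the function returns the modelled value unchanged, and a sensor that is far away or long out of date has almost no influence. A sensor with a smaller assumed error pulls the fused value further toward its own reading.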

The main advantages of the combination of techniques can be summarized as follows:

  • Dispersion modelling facilitates the inclusion of meteorological variables including wind direction, wind speed, boundary layer height and atmospheric stability parameters. Height differences and topography can be taken into account.

  • Statistical LUR methodology allows the model to be calibrated without a priori knowledge of local emission sources, using a large set of historical pollutant concentration time series. However, known emission inventories can be included in the modelling if available (such as the modelled shipping emissions of Fig. 34.1, which are provided by the FMI-STEAM model (Jalkanen et al. 2012)).

  • With data fusion algorithms sensor/model quality can be addressed and modelled data can be fused together with sensor observations.

  • The modelling system is computationally light-weight and exportable to other regions.
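To show how wind speed and measurement height enter a Gaussian dispersion calculation, the following sketch evaluates the textbook steady-state plume formula for a single point source with ground reflection. It is a simplification chosen for illustration: FMI-ENFUSER is described above as an area source model, and the function name, parameter names and example values here are assumptions.

```python
import math


def gaussian_plume(q_g_s, wind_speed_m_s, sigma_y_m, sigma_z_m,
                   crosswind_m, receptor_height_m, source_height_m):
    """Steady-state Gaussian plume concentration (g/m3) with ground reflection.

    sigma_y_m and sigma_z_m are the plume spreads at the receptor's downwind
    distance; in practice they depend on that distance and on atmospheric
    stability, which this sketch leaves to the caller.
    """
    lateral = math.exp(-crosswind_m ** 2 / (2.0 * sigma_y_m ** 2))
    direct = math.exp(
        -(receptor_height_m - source_height_m) ** 2 / (2.0 * sigma_z_m ** 2))
    reflected = math.exp(
        -(receptor_height_m + source_height_m) ** 2 / (2.0 * sigma_z_m ** 2))
    norm = q_g_s / (2.0 * math.pi * wind_speed_m_s * sigma_y_m * sigma_z_m)
    return norm * lateral * (direct + reflected)


# Hypothetical low-level source seen by a street-level and a rooftop sensor.
street_level = gaussian_plume(1.0, 2.0, 10.0, 5.0, 0.0, 2.0, 2.0)
rooftop = gaussian_plume(1.0, 2.0, 10.0, 5.0, 0.0, 20.0, 2.0)
```

For a low source the street-level receptor sees a much higher concentration than the rooftop receptor, and doubling the wind speed halves the concentration at both.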

3 Results

As a demonstration of the modelling system in another region, the FMI-ENFUSER model was set up in Langfang, China, in 2015 (Johansson et al. 2016). A small network of five PM2.5 sensors was installed around the city, and the sensors were connected to a cloud data portal, from which the model was able to access the measurement data in near real time.

While most of the sensors were placed near ground level in variable environments, one of the sensors was placed on a rooftop. The FMI-ENFUSER model can take into account the individual measurement heights of the sensors, and variable measurement heights enrich the information content available to the system, which in turn makes it easier to establish a comprehensive picture of the air quality in the city. Fig. 34.2 presents the modelled hourly average pollutant concentration in Langfang for a selected hour, based on measurements from the five sensors.

Fig. 34.2

Modelled hourly PM2.5 concentration in Langfang based on sensor measurements at 2015-06-20T19:00. Wind conditions (direction, speed) are illustrated with red arrows

The ENFUSER model was also installed and calibrated in Delhi using the local historical AQ measurement data and meteorological measurements without any detailed emission factors or inventories for Delhi. Hourly concentration measurements were available for 8 stations in Delhi in 2013 for O3, NO2, SO2, PM2.5 and PM10.

Before calibration and use of the model, the consistency of the available AQ data also had to be studied. This kind of consistency analysis can be done automatically with ENFUSER by launching a series of auto-covariance and variogram analysis routines. While the ultimate goal of these diagnostics was to assess data weighting factors for each station (for each pollutant type separately) and make the model operate more accurately, they can also be used to assess measurement data quality and to derive quality control metrics for measurement stations. Figure 34.3 illustrates the modelled PM2.5 and NO2 concentrations in Delhi based on our first installation of ENFUSER there.
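The two consistency diagnostics named above can be sketched in a few lines. This is not the ENFUSER routine, only an illustration: the function names and station data are invented, and a full variogram analysis would additionally bin the semivariance by the separation distance between stations, which is omitted here.

```python
def autocovariance(series, lag):
    """Biased sample autocovariance of one station's time series at a lag."""
    n = len(series)
    mean = sum(series) / n
    return sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    ) / n


def semivariance(series_a, series_b):
    """Half the mean squared difference between two simultaneous series."""
    squared = [(a - b) ** 2 for a, b in zip(series_a, series_b)]
    return 0.5 * sum(squared) / len(squared)


def mean_semivariance_per_station(stations):
    """Average each station's semivariance against every other station."""
    scores = {}
    for name, series in stations.items():
        others = [
            semivariance(series, other_series)
            for other_name, other_series in stations.items()
            if other_name != name
        ]
        scores[name] = sum(others) / len(others)
    return scores


# Invented hourly values: stations A and B agree, station C does not.
stations = {
    "A": [10.0, 12.0, 14.0, 13.0],
    "B": [11.0, 13.0, 15.0, 14.0],
    "C": [30.0, 5.0, 40.0, 2.0],
}
scores = mean_semivariance_per_station(stations)
least_consistent = max(scores, key=scores.get)
```

A station whose mean semivariance against the others is unusually high, such as the hypothetical station C here, is a candidate for a lower data weighting factor or for closer quality control.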

Fig. 34.3

Fused hourly PM2.5 and NO2 concentrations in Delhi at 2013-04-06T06:00 (UTC)

4 Conclusions

We presented the FMI-ENFUSER system and demonstrated its capability to fuse hourly pollutant concentrations in an urban area. The presented approach, which combines statistical air quality modelling, dispersion modelling and data fusion algorithms, offers several advantages. A priori emission inventories are not necessarily required, since the model can be calibrated with historical data. The most important GIS-datasets as well as regional air quality data are openly available globally. Finally, the modelling system can produce high-resolution air quality output in near real time using relatively low-cost computations.

FMI-ENFUSER was demonstrated in Langfang together with a newly established sensor network. While the pollutant concentrations are generally much higher in China, the same calibration routines can be applied and the model is able to distinguish the relationships between observed pollutant concentrations, meteorological conditions and GIS-properties.

The FMI-ENFUSER model was also successfully installed with offline modelling capabilities for Delhi. In the future the ENFUSER model could also be used operationally in Delhi, but additional work towards this goal is needed, especially for near-real-time access to meteorological data (ECMWF or IMD) and FMI-SILAM regional AQ data. Further, it would be highly beneficial to investigate more thoroughly the main sources of local emissions in Delhi. Once identified, a more comprehensive and realistic set of emission source categories can be introduced to the model via various GIS-datasets.