A Model Evaluation Strategy Applied to Modelling of PM in the Helsinki Metropolitan Area

  • Conference paper
Air Pollution Modeling and its Application XXV (ITM 2016)

Abstract

We have further developed a deterministic urban-scale dispersion modelling system by adding a road dust suspension model. The system includes both vehicular exhaust emissions and suspended road dust. The modelling system was combined with a regional-scale chemical transport model to calculate concentrations in an urban area for the year 2008; for the year 2010, the measured regional background concentration was used. The time series were modelled for a spatial area more extensive than before, using the FORE road dust suspension model. The predictions were compared against observed concentrations of PM2.5 and PM10. The use of the index of determination (r²) is discussed. We criticize the use of r² both on its own and alongside an index-of-agreement type measure, and review the data assumptions underlying both measures. We then suggest a strategy for developing the statistical understanding, practice and nomenclature of model evaluation.

References

  • Aarnio MA, Kukkonen J, Kangas L, Kauhaniemi M, Kousa A, Hendriks C, Yli-Tuomi T, Hoek G, Brunekreef B, Elolähde T, Karppinen A (2016) Modelling of particulate matter concentrations in the Helsinki metropolitan area in 2008 and 2010. Boreal Env Res 21:445–460

  • INRO (1994). EMME/2 User’s Manual. INRO Consultants Inc. Montreal, Canada.

  • Kauhaniemi M, Stojiljkovic A, Pirjola L, Karppinen A, Härkönen J, Kupiainen K, Kangas L, Aarnio MA, Omstedt G, Denby BR, Kukkonen J (2014) Comparison of the predictions of two road dust emission models with the measurements of a mobile van. Atmos Chem Phys Discuss 14(4):4263–4301

  • Robinson WS (1957) The statistical measurement of agreement. ASR 22(1):17–25. http://www.jstor.org/stable/2088760

  • Willmott CJ (1981) On the validation of models. Phys Geogr 2(2):184–194

  • Willmott CJ, Robeson SM, Matsuura K (2011) A refined index of model performance. Int J Climatol 32:2088–2094

Acknowledgements

This study has been a part of the research projects APTA (The Influence of Air Pollution, Pollen and Ambient Temperature on Asthma and Allergies in Changing Climate), and NordicWelfAir (Project #75007: Understanding the link between Air pollution and Distribution of related Health Impacts and Welfare in the Nordic countries). The funding from the Academy of Finland and the Nordforsk Nordic Programme on Health and Welfare is gratefully acknowledged.

Corresponding author

Correspondence to Mia A. Aarnio.

Questions and Answers

Questioner name: Sebnem Aksoyoghi

Question: LOTOS-EUROS did not have SOA in the past. Did you have them in your application? Did you have biogenic emissions?

Answer: Secondary Organic Aerosols were included in the LOTOS-EUROS calculations of this work, but biogenic emissions were not included.

Questioner name: Antti Hellsten

Question/comment: Always use more than one or two metrics in model evaluation. Different metrics measure different kinds of disagreement. E.g. FAC2 tells nothing about possible bias; use e.g. FB, too.

Answer: Yes, exactly. One should report several metrics of statistical analysis, but as a consistent set. The coefficient of determination, r², however, is not a good choice for data sets that are not normally distributed. Even more importantly, there is a danger of misuse when the r² of a linear regression equation, fitted by some kind of least-residual-sum method to a set of data points such as (Cobs, Cpred), is used as a measure of the goodness of the model that produced the Cpred data. This is what, for example, the Excel software produces when one makes a scatter plot of the (Cobs, Cpred) data with an added “trendline” and “coefficient of correlation”.
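The pitfall described in this answer can be illustrated with a small sketch. The data below are invented for illustration, and the helper names `pearson_r2` and `rmse` are hypothetical, not taken from the paper. The predictions are deliberately constructed as a linear distortion of the observations, so the least-squares line through the (Cobs, Cpred) points fits perfectly even though the predictions themselves are far from the observations.

```python
def pearson_r2(obs, pred):
    """Squared Pearson correlation, which equals the r^2 of an
    ordinary least-squares line fitted through the (obs, pred) points."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    s_oo = sum((o - mean_o) ** 2 for o in obs)
    s_pp = sum((p - mean_p) ** 2 for p in pred)
    return s_op * s_op / (s_oo * s_pp)


def rmse(obs, pred):
    """Root-mean-square error between predictions and observations."""
    return (sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs)) ** 0.5


# Invented observed concentrations (illustration only).
obs = [5.0, 8.0, 12.0, 20.0, 35.0]

# A deliberately poor "model": doubles every value and adds an offset.
pred_biased = [2.0 * o + 10.0 for o in obs]

r2_biased = pearson_r2(obs, pred_biased)   # very close to 1: the trendline fits
rmse_biased = rmse(obs, pred_biased)       # large: the predictions do not agree

print(f"r^2 of trendline: {r2_biased:.3f}, RMSE: {rmse_biased:.1f}")
```

The r² here describes how well a straight line explains Cpred as a function of Cobs. It does not describe how close Cpred is to Cobs, which is the point made in the answer.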

A consistent set of statistical parameters could include the number of data points, the means and standard deviations of Cobs and Cpred, a measure of the bias (e.g. FB), a measure of the spread of the data (e.g. F2), a measure of the exactness of Cobs,i = Cpred,i for the whole data set (e.g. an index of agreement), and a statistic that would involve estimates for the measurement uncertainty.
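A minimal sketch of such a set of statistics follows. It rests on several assumptions that are not stated in the text: the fractional bias uses the common convention FB = 2(mean(Cobs) − mean(Cpred)) / (mean(Cobs) + mean(Cpred)), whose sign convention varies between authors; the spread measure is taken to be the factor-of-two fraction FAC2; and the index of agreement is the squared form d2 quoted later in this Q&A. The function name `evaluation_summary` and the sample values are hypothetical. A statistic involving the measurement uncertainty is not included here.

```python
from statistics import mean, stdev


def evaluation_summary(obs, pred):
    """Return a small, consistent set of evaluation statistics
    for paired observed and predicted concentrations."""
    if len(obs) != len(pred) or len(obs) < 2:
        raise ValueError("obs and pred must be paired and contain >= 2 values")

    mean_o, mean_p = mean(obs), mean(pred)

    # Fractional bias (assumed convention; the sign differs between authors).
    fb = 2.0 * (mean_o - mean_p) / (mean_o + mean_p)

    # FAC2: fraction of pairs with 0.5 <= pred/obs <= 2 (obs must be positive).
    within = sum(1 for o, p in zip(obs, pred) if o > 0 and 0.5 <= p / o <= 2.0)
    fac2 = within / len(obs)

    # Index of agreement d2; undefined if the denominator is zero.
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in zip(obs, pred))
    d2 = 1.0 - num / den

    return {
        "n": len(obs),
        "mean_obs": mean_o,
        "mean_pred": mean_p,
        "std_obs": stdev(obs),    # sample standard deviation
        "std_pred": stdev(pred),
        "FB": fb,
        "FAC2": fac2,
        "d2": d2,
    }


# Invented values, for illustration only.
summary = evaluation_summary([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 37.0])
for name, value in summary.items():
    print(f"{name}: {value}")
```

Reporting the values together makes it harder for a single favourable number to hide a bias or a poor point-by-point match.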

Questioner name: Heinke Schluenzen

Question: Thanks for an elaboration on the quality measures. Why don’t you aim at an index Ia measure that considers measurement uncertainty (e.g. hit rate)?

Answer: The Helsinki Metropolitan Area Council, the source of the observed data that we used in this work, has now stated that “the measurement uncertainty <25%”. This information was not available before. Hit rate represents the fraction of (Pi, Oi) points in the whole evaluation data set that lie within an allowed range of the diagonal of the P, O-space. The range can then be defined using the measurement uncertainty, if it is known. (Reference: COST 732 Model Evaluation Case Studies: Approach and Results. Edited by: Schatzmann, M., Olesen H., Franke J., 2010. COST Office, Avenue Louise 149, 1050 Brussels, Belgium. 121 pp.)
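The hit rate described above might be sketched as follows. This is an illustration under stated assumptions, not the COST 732 procedure itself: a pair counts as a hit when the absolute difference is within a relative tolerance of the observation or within an optional absolute tolerance. The function name `hit_rate`, the parameter names `rel_tol` and `abs_tol`, the use of 0.25 as the relative tolerance (loosely motivated by the quoted measurement uncertainty), and the sample data are all assumptions made here.

```python
def hit_rate(obs, pred, rel_tol=0.25, abs_tol=0.0):
    """Fraction of (P_i, O_i) pairs lying within an allowed range
    of the diagonal P = O.

    A pair is a hit if |P - O| <= rel_tol * |O| or |P - O| <= abs_tol.
    Both tolerances are illustrative assumptions.
    """
    if len(obs) != len(pred) or not obs:
        raise ValueError("obs and pred must be non-empty and paired")
    hits = 0
    for o, p in zip(obs, pred):
        diff = abs(p - o)
        if diff <= rel_tol * abs(o) or diff <= abs_tol:
            hits += 1
    return hits / len(obs)


# Invented values: the third pair differs by a third of the observation.
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 40.0, 41.0]
q = hit_rate(obs, pred)
print(f"hit rate: {q}")
```

The absolute tolerance is useful for very low concentrations, where a purely relative range becomes narrower than the instrument resolution.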

Questioner name: Pius Lee

Question: In your presentation of the index of agreement:

$$\mathbf{d}_{2} = 1 - \frac{\sum_{i=1}^{n} \left( P_{i} - O_{i} \right)^{2}}{\sum_{i=1}^{n} \left( \left| P_{i} - \bar{O} \right| + \left| O_{i} - \bar{O} \right| \right)^{2}},\quad \mathbf{d}_{1} = 1 - \frac{\sum_{i=1}^{n} \left| P_{i} - O_{i} \right|}{\sum_{i=1}^{n} \left( \left| P_{i} - \bar{O} \right| + \left| O_{i} - \bar{O} \right| \right)}$$

Both formulations attempt to quantify the variance (or uncertainty) in each of the observations. The sample size n can now be much larger, since hand-held devices are attracting the attention of environmental agencies and may soon be deployed as viable sources of observations. Would d2 and d1 still be good measures of model performance when n is ten or 100 times larger than with the current conventional “fixed regulatory monitors”? The crux of the difficulty may also lie in the fact that the variances are large, as hand-held devices are much less well standardized and their performance is expected to vary over a much larger range.

Answer: The index of agreement statistic attempts to quantify the “exactness” of the predicted time series of a variable compared to the observed time series. The index of agreement statistic is always calculated for a pair of data sets comprising the observed and predicted concentrations at a specific point in space, as a time series. So in the analysis of hourly data for, e.g., a year, n would always be around 8760 (or 8784 for a leap year) for a location, regardless of the number of measurement devices. If the measurement devices were mobile, the predicted concentrations would have to be interpolated using the movement data combined with the calculation point data, but each data set to be analysed would still have the standard length of the chosen study period. But this statistic, in any of its reported incarnations, would not quantify the uncertainty of the measurements in any way.
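As a sketch of the two statistics quoted in the question, the function below evaluates both forms for one location's paired time series; `power=2` gives d2 and `power=1` gives d1. The function names and the sample values are hypothetical. The helper `hours_in_year` shows the length of a complete hourly series: 365 × 24 = 8760 values, or 366 × 24 = 8784 in a leap year such as 2008.

```python
import calendar


def index_of_agreement(obs, pred, power=2):
    """Index of agreement for paired series at one location.

    d = 1 - sum(|P_i - O_i|^power)
          / sum((|P_i - mean(O)| + |O_i - mean(O)|)^power)
    """
    if len(obs) != len(pred) or not obs:
        raise ValueError("obs and pred must be non-empty and paired")
    mean_o = sum(obs) / len(obs)
    num = sum(abs(p - o) ** power for o, p in zip(obs, pred))
    den = sum((abs(p - mean_o) + abs(o - mean_o)) ** power
              for o, p in zip(obs, pred))
    return 1.0 - num / den


def hours_in_year(year):
    """Number of hourly values in a complete one-year time series."""
    return (366 if calendar.isleap(year) else 365) * 24


# Invented values, for illustration only.
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 37.0]
d1 = index_of_agreement(obs, pred, power=1)
d2 = index_of_agreement(obs, pred, power=2)
print(f"d1 = {d1:.3f}, d2 = {d2:.3f}")
print(hours_in_year(2008), hours_in_year(2010))
```

Neither form takes any measurement uncertainty as input, which is consistent with the closing remark of the answer.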

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Aarnio, M.A. et al. (2018). A Model Evaluation Strategy Applied to Modelling of PM in the Helsinki Metropolitan Area. In: Mensink, C., Kallos, G. (eds) Air Pollution Modeling and its Application XXV. ITM 2016. Springer Proceedings in Complexity. Springer, Cham. https://doi.org/10.1007/978-3-319-57645-9_16
