Forecasting hourly \({\hbox {NO}_{2}}\) concentrations by ensembling neural networks and mesoscale models


As episodes of extreme pollution become more frequent in many cities, air quality forecasting is crucial to protecting public health, because it allows unpopular measures such as traffic restrictions to be anticipated. In this work, we develop the core of a 48 h ahead forecasting system which is being deployed for the city of Madrid. To this end, we investigate the predictive power of a set of neural network models, including several families of deep networks, applied to the task of predicting nitrogen dioxide concentrations in an urban environment. Careful feature engineering on a set of related variables, such as meteorology and traffic, has proven useful, and we have coupled these neural models with mesoscale numerical pollution forecasts, which improve precision by up to 10%. The experiments show that some neural networks and ensembles consistently outperform the reference models, improving on the Naive model's results by around 20%, and by up to 57% for longer forecasting horizons. However, the results also reveal that deeper networks are not particularly better than shallow ones in this setting.
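The comparison against the Naive model can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: the Naive model is taken to be a persistence forecast, the error metric is taken to be RMSE, and the ensemble is taken to be an unweighted average of a neural forecast and a mesoscale forecast. All function names and all numbers below are hypothetical toy values, not data or results from the paper.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))


def persistence_forecast(series, horizon):
    """Assumed Naive model: the value `horizon` hours ahead equals the current value."""
    return series[:-horizon]


def average_ensemble(*forecasts):
    """Unweighted mean of aligned forecasts (one simple combination rule, assumed here)."""
    return [sum(values) / len(values) for values in zip(*forecasts)]


def relative_improvement(reference_error, model_error):
    """Percentage reduction in error with respect to a reference model."""
    return 100.0 * (reference_error - model_error) / reference_error


# Toy hourly NO2 concentrations (made-up values for illustration only).
observed = [40.0, 42.0, 50.0, 61.0, 55.0, 47.0]
horizon = 1
targets = observed[horizon:]

naive = persistence_forecast(observed, horizon)
neural = [43.0, 48.0, 58.0, 57.0, 49.0]      # hypothetical neural network output
mesoscale = [45.0, 52.0, 60.0, 51.0, 45.0]   # hypothetical mesoscale model output
ensemble = average_ensemble(neural, mesoscale)

improvement = relative_improvement(rmse(targets, naive), rmse(targets, ensemble))
print(f"Improvement over the persistence baseline: {improvement:.1f}%")
```

Under these assumptions, a figure such as 57% would mean that the model's error is 57% lower than the baseline's error at the same horizon. Other metrics or combination rules would change the numbers but not the structure of the comparison.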





Author information



Corresponding author

Correspondence to José L. Aznarte.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


About this article


Cite this article

Valput, D., Navares, R. & Aznarte, J.L. Forecasting hourly \({\hbox {NO}_{2}}\) concentrations by ensembling neural networks and mesoscale models. Neural Comput & Applic 32, 9331–9342 (2020).



  • Neural networks
  • Deep learning
  • Air quality
  • Nitrogen dioxide
  • Forecasting
  • Madrid