Spatial and meteorological relevance in NO2 estimations: a case study in the Bay of Algeciras (Spain)

  • Javier González-Enrique
  • Ignacio J. Turias
  • Juan Jesús Ruiz-Aguilar
  • José Antonio Moscoso-López
  • Leonardo Franco
Original Paper


This study addresses how to identify the most relevant variables for estimating hourly NO2 concentrations in a monitoring network located in the Bay of Algeciras (Spain). For each station in the network, hourly estimation models were built using artificial neural networks and multiple linear regression. Meteorological variables and hourly NO2 concentrations from nearby stations were used as inputs, with a feature selection procedure applied as a preliminary step. The resulting models were compared statistically. The inputs used by the best estimation model for each station were taken to be the most important for estimating the hourly NO2 concentration at that station. Such estimations can be a useful resource for giving monitoring networks autonomous capabilities such as automatic decalibration detection or missing data imputation. Finally, the similarities between stations, in terms of the relevance of variables, were analysed with a hierarchical clustering algorithm.
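The final step described above, grouping stations by which inputs turned out to be relevant, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the station names, the binary relevance vectors (1 if an input was kept in the best model, 0 otherwise), the Euclidean distance and the average linkage are hypothetical choices, since the abstract does not state which distance or linkage the study used.

```python
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length relevance vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def agglomerative(vectors, n_clusters):
    """Average-linkage agglomerative clustering.

    vectors: dict mapping a station name to its relevance vector.
    Returns a list of sets of station names.
    """
    clusters = [{name} for name in vectors]

    def linkage(c1, c2):
        # Average pairwise distance between members of the two clusters.
        dists = [euclidean(vectors[a], vectors[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical relevance vectors (not data from the study): each position
# marks whether a candidate input was selected for that station's best model.
relevance = {
    "station_1": [1, 1, 0, 0, 1],
    "station_2": [1, 1, 0, 0, 0],
    "station_3": [0, 0, 1, 1, 0],
    "station_4": [0, 0, 1, 1, 1],
}

groups = agglomerative(relevance, n_clusters=2)
for group in groups:
    print(sorted(group))
```

With these made-up vectors, stations that share most of their selected inputs end up in the same group. A real analysis would more likely use a library routine and inspect the full dendrogram rather than fix the number of clusters in advance.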


Artificial neural networks · Monitoring networks · Air pollution · Feature relevance



This work is part of the coordinated research projects TIN2014-58516-C2-1-R and TIN2014-58516-C2-2-R supported by MICINN (Ministerio de Economía y Competitividad-Spain). Monitoring data have been kindly provided by the Environmental Agency of the Andalusian Government.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Computer Science Engineering, Polytechnic School of Engineering, University of Cádiz, Algeciras, Spain
  2. Department of Industrial and Civil Engineering, Polytechnic School of Engineering, University of Cádiz, Algeciras, Spain
  3. Department of Computer Science, ETS Computer Science, University of Málaga, Málaga, Spain
