A novel algorithm for feature selection based on geographic distance metric: a case study of streamflow forecasting of Austria’s water resources

  • O. Selvi (corresponding author)
  • İ. Huseyinov
Original Paper

Abstract

This paper focuses on input variable selection (feature selection) methods used with artificial neural networks for streamflow forecasting in large basins that contain many stations of various types. The feature selection methods currently used in the hydrology research community cannot handle the problem in such basins. The paper proposes a novel feature selection algorithm, Bubble Selection, based on the idea of using geographic distance as a metric. The performance of the algorithm is evaluated by applying Bubble Selection to a case study that models Austria’s water resources at 540 stations in a single run. The aim is to select features for each station from among 2412 stations for which streamflow, precipitation, snow, snow depth, and water level measurements are available. The proposed algorithm considerably reduces the dimension of the feature set. Bubble Selection is further combined with the Sequential Forward Selection algorithm. The performance of the hybrid model is compared with that of the Feature Ranking method in terms of the coefficient of determination, Nash–Sutcliffe Efficiency, and percent bias. The results show the superiority of the proposed hybrid algorithm over Feature Ranking. The paper introduces a methodology for modeling a large basin and identifies capabilities that a feature selection algorithm should have.
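
The abstract names the ingredients of the method but not its procedure. The sketch below is only an illustration under stated assumptions: it approximates the geographic-distance idea with a plain radius filter on great-circle distance, which may differ from the rule Bubble Selection actually applies. It also computes the three evaluation measures named above using their common textbook definitions. The function names, the station dictionary layout, the coordinates, and the radius are all hypothetical and are not taken from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def stations_within_radius(target, stations, radius_km):
    """Assumed stand-in for a distance-based filter: keep the ids of all
    candidate stations no farther than radius_km from the target station."""
    return [
        s["id"]
        for s in stations
        if s["id"] != target["id"]
        and haversine_km(target["lat"], target["lon"], s["lat"], s["lon"]) <= radius_km
    ]


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of observations around their mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def pbias(obs, sim):
    """Percent bias, using the sign convention 100 * sum(obs - sim) / sum(obs),
    so positive values mean the model underestimates."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    mo, ms = sum(obs) / len(obs), sum(sim) / len(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov * cov / (var_o * var_s)


# Illustrative stations only; the coordinates are invented for this example.
stations = [
    {"id": "A", "lat": 48.21, "lon": 16.37},
    {"id": "B", "lat": 48.31, "lon": 14.29},
    {"id": "C", "lat": 47.27, "lon": 11.39},
]
candidates = stations_within_radius(stations[0], stations, radius_km=200.0)
```

A filter of this kind shrinks the candidate set before any model is trained, which is what would make a wrapper method such as Sequential Forward Selection affordable when thousands of stations are available as inputs.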

Keywords

Bubble selection, Feature selection, Sequential forward selection, Feature ranking

Notes

Acknowledgements

The authors wish to thank all who assisted in conducting this work.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Supplementary material

Supplementary material 1: 13762_2019_2485_MOESM1_ESM.pdf (PDF, 2030 kb)
Supplementary material 2: 13762_2019_2485_MOESM2_ESM.pdf (PDF, 606 kb)
Supplementary material 3: 13762_2019_2485_MOESM3_ESM.pdf (PDF, 882 kb)
Supplementary material 4: 13762_2019_2485_MOESM4_ESM.xlsx (XLSX, 36205 kb)

Copyright information

© Islamic Azad University (IAU) 2019

Authors and Affiliations

  1. Computer Engineering Department, Faculty of Engineering, Istanbul Aydın University, Istanbul, Turkey
  2. Software Engineering Department, Faculty of Engineering, Istanbul Aydın University, Istanbul, Turkey