A novel algorithm for feature selection based on geographic distance metric: a case study of streamflow forecasting of Austria’s water resources
This paper focuses on input variable selection (feature selection) methods used with artificial neural networks for streamflow forecasting in large basins that contain many stations of various types. The feature selection methods currently used in the hydrology research community cannot handle the problem in such basins. The paper proposes a novel feature selection algorithm, Bubble Selection, based on the idea of using geographic distance as a metric. The performance of the algorithm is evaluated by applying Bubble Selection to a case study that models Austria's water resources at 540 stations in single-run mode. The aim is to select features for each station from among 2412 stations at which streamflow, precipitation, snow, snow depth, and water level measurements are available. The proposed algorithm considerably reduces the dimension of the feature space. The Bubble Selection algorithm is further combined with the Sequential Forward Selection algorithm. The performance of the hybrid model is compared with that of the Feature Ranking method in terms of the coefficient of determination, Nash–Sutcliffe Efficiency, and percent bias. The results show that the proposed hybrid algorithm outperforms Feature Ranking. The paper introduces a methodology for modeling a large basin and identifies capabilities that a feature selection algorithm should have.
Keywords: Bubble selection; Feature selection; Sequential forward selection; Feature ranking
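As an illustration of the ideas named in the abstract, the following minimal Python sketch shows one possible way to use geographic distance to pre-filter candidate stations, together with two of the evaluation measures mentioned (Nash–Sutcliffe Efficiency and percent bias). It is not the authors' Bubble Selection algorithm, whose details are not given here. The fixed-radius rule, the function names, the station coordinates, and the sign convention for percent bias (observed minus simulated) are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used by the haversine formula


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def stations_within_radius(target, candidates, radius_km):
    """Hypothetical distance filter: keep candidate stations within radius_km of target.

    target is a (lat, lon) tuple; candidates maps station name -> (lat, lon).
    Returns the retained station names in alphabetical order.
    """
    return sorted(
        name
        for name, (lat, lon) in candidates.items()
        if haversine_km(target[0], target[1], lat, lon) <= radius_km
    )


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def percent_bias(observed, simulated):
    """Percent bias, assuming the observed-minus-simulated sign convention."""
    return 100.0 * sum(o - s for o, s in zip(observed, simulated)) / sum(observed)
```

For example, with a target near Vienna and hypothetical candidates near Salzburg and Innsbruck, a 300 km radius would retain only the Salzburg candidate. Stations kept by such a filter could then be passed to a wrapper method such as Sequential Forward Selection, which is the kind of combination the abstract describes.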
The authors wish to thank all who assisted in conducting this work.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.