Abstract
This paper addresses the problem of feature selection in online learning from data streams. This setting typically imposes restrictions on computational resources (memory, processing) and on storage capacity, since instances of streaming data arrive at high speed and cannot be stored for later offline processing. Feature selection can be particularly beneficial here, because reducing the data dimensionality allows only the relevant parts of the data to be processed. However, selecting a subset of features may permanently rule out the use of the discarded dimensions. This causes problems under feature drift, in which the importance of individual dimensions changes over time. This paper proposes a new online feature selection method that handles drifting features in non-stationary data streams. The proposed method is built on deep reconstruction networks that are continuously updated with incoming data instances. These networks can be used not only to detect the point of change under feature drift but also to dynamically rank the importance of features for online feature selection. The efficacy of the method is demonstrated by experiments on the MNIST database.
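The abstract does not specify the network architecture, the ranking rule, or the change-detection rule, so the following is only a minimal NumPy sketch of the general idea and not the authors' implementation. It assumes a single-hidden-layer autoencoder (much shallower than a deep reconstruction network) updated by plain SGD on each arriving instance, an importance proxy based on the norm of each feature's encoder weights, and a simple windowed comparison of reconstruction errors as the drift signal. The names `OnlineAutoencoder`, `feature_scores`, and `detect_drift`, and all hyperparameters, are hypothetical.

```python
import numpy as np


class OnlineAutoencoder:
    """Single-hidden-layer autoencoder trained one instance at a time (plain SGD)."""

    def __init__(self, n_features, n_hidden, lr=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_features, n_hidden))  # encoder weights
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_hidden, n_features))  # decoder weights
        self.b2 = np.zeros(n_features)
        self.lr = lr

    def partial_fit(self, x):
        """Take one SGD step on instance x; return the per-feature squared
        reconstruction error measured before the update."""
        h = np.tanh(x @ self.W1 + self.b1)
        x_hat = h @ self.W2 + self.b2
        err = x_hat - x
        # Gradient of 0.5 * ||err||^2 w.r.t. the hidden pre-activation,
        # computed with the decoder weights from before the update.
        grad_h = (err @ self.W2.T) * (1.0 - h ** 2)
        self.W2 -= self.lr * np.outer(h, err)
        self.b2 -= self.lr * err
        self.W1 -= self.lr * np.outer(x, grad_h)
        self.b1 -= self.lr * grad_h
        return err ** 2

    def feature_scores(self):
        """Assumed importance proxy: L2 norm of each feature's encoder weights."""
        return np.linalg.norm(self.W1, axis=1)


def detect_drift(errors, window=50, factor=2.0):
    """Hypothetical rule: flag drift when the mean error over the latest window
    exceeds `factor` times the mean over the window just before it."""
    if len(errors) < 2 * window:
        return False
    recent = np.mean(errors[-window:])
    baseline = np.mean(errors[-2 * window:-window])
    return bool(recent > factor * baseline)


# Usage on a synthetic stream (illustrative data, not from the paper).
rng = np.random.default_rng(1)
model = OnlineAutoencoder(n_features=8, n_hidden=4)
history = []
for t in range(300):
    x = rng.uniform(0.0, 1.0, 8)
    history.append(float(model.partial_fit(x).mean()))
    if detect_drift(history):
        print(f"possible drift at step {t}")

ranking = np.argsort(model.feature_scores())[::-1]  # most important first
```

Each instance is used once and then discarded, which matches the streaming constraint described above. Because no feature is ever removed from the network input, the ranking can be recomputed at any time, so a dimension that was ranked low earlier can regain importance after a drift.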
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Holmberg, J., Xiong, N. (2020). Online Feature Selection via Deep Reconstruction Network. In: Kim, J., Geem, Z., Jung, D., Yoo, D., Yadav, A. (eds) Advances in Harmony Search, Soft Computing and Applications. ICHSA 2019. Advances in Intelligent Systems and Computing, vol 1063. Springer, Cham. https://doi.org/10.1007/978-3-030-31967-0_22
DOI: https://doi.org/10.1007/978-3-030-31967-0_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31966-3
Online ISBN: 978-3-030-31967-0
eBook Packages: Intelligent Technologies and Robotics (R0)