Abstract
A novel approach to feature selection from unlabeled vector data is presented. It is based on reconstructing the original data relationships in an auxiliary space in which features are either weighted or omitted. Feature weighting is related to the return forces of the factors in a parametric data similarity measure in response to a disturbance of their optimum values. Feature omission, which induces a measurable loss of reconstruction quality, is realized in an iterative greedy way. The proposed framework allows custom data similarity measures to be applied. Here, adaptive Euclidean distance and adaptive Pearson correlation are considered, the former serving as a standard reference, the latter being useful for intensity data. Results of the different strategies are given for chromatography and gene expression data.
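The greedy feature omission described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how reconstruction quality is measured, so the sketch assumes it is the Pearson correlation between the original pairwise squared Euclidean distances and the distances recomputed with a feature left out. The function names `pairwise_sq_euclidean`, `reconstruction_quality`, and `greedy_feature_omission`, as well as the stopping parameter `n_keep`, are hypothetical.

```python
import numpy as np


def pairwise_sq_euclidean(X, weights=None):
    """Squared Euclidean distances between rows of X with per-feature weights.

    A weight of 0 omits a feature; a weight of 1 keeps it unchanged.
    """
    if weights is None:
        weights = np.ones(X.shape[1])
    diff = X[:, None, :] - X[None, :, :]          # shape (n, n, d)
    return np.einsum("ijk,k->ij", diff ** 2, weights)


def reconstruction_quality(D_ref, D_new):
    """Assumed quality measure: Pearson correlation of the upper triangles."""
    iu = np.triu_indices_from(D_ref, k=1)
    return np.corrcoef(D_ref[iu], D_new[iu])[0, 1]


def greedy_feature_omission(X, n_keep):
    """Iteratively drop the feature whose omission costs the least quality.

    Returns the indices of the kept features and a list of
    (removed_feature, quality_after_removal) pairs in removal order.
    """
    if n_keep < 1:
        raise ValueError("n_keep must be at least 1")
    n_features = X.shape[1]
    D_ref = pairwise_sq_euclidean(X)              # relationships to reconstruct
    kept = list(range(n_features))
    removed = []
    while len(kept) > n_keep:
        best_quality, best_feature = -np.inf, None
        for f in kept:
            mask = np.zeros(n_features)
            mask[[k for k in kept if k != f]] = 1.0
            q = reconstruction_quality(D_ref, pairwise_sq_euclidean(X, mask))
            if q > best_quality:
                best_quality, best_feature = q, f
        if best_feature is None:                  # all candidates undefined (NaN)
            break
        kept.remove(best_feature)
        removed.append((best_feature, best_quality))
    return kept, removed
```

Calling `greedy_feature_omission(X, n_keep)` on an array `X` with one row per sample returns the surviving feature indices together with the removal order, so features removed late (or never) are the candidates for relevance under this assumed quality measure. Replacing `pairwise_sq_euclidean` with a correlation-based dissimilarity would correspond to the abstract's second measure, again only as a sketch.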
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Strickert, M., Sreenivasulu, N., Peterek, S., Weschke, W., Mock, HP., Seiffert, U. (2006). Unsupervised Feature Selection for Biomarker Identification in Chromatography and Gene Expression Data. In: Schwenker, F., Marinai, S. (eds) Artificial Neural Networks in Pattern Recognition. ANNPR 2006. Lecture Notes in Computer Science, vol 4087. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11829898_25
DOI: https://doi.org/10.1007/11829898_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37951-5
Online ISBN: 978-3-540-37952-2
eBook Packages: Computer Science (R0)