Abstract
Graph-based methods have played an important role in machine learning due to their ability to encode the similarity relationships among data. A commonly used criterion in graph-based feature selection methods is to select the features which best preserve the data similarity or a manifold structure derived from the entire feature set. However, these methods separate the processes of learning the feature similarity graph and feature ranking. In practice, the ideal feature similarity graph is difficult to define in advance: one needs to assign appropriate values for parameters such as the neighborhood size or the heat kernel parameter involved in graph construction, and this process is conducted independently of the subsequent feature selection. As a result, the performance of feature selection is largely determined by the effectiveness of graph construction. In this paper, by contrast, we attempt to learn a graph structure closely linked with the feature selection process. The idea is to unify graph construction and data transformation, resulting in a new framework which yields an optimal graph rather than a predefined one. Moreover, the \(\ell _{2,1}\)-norm is imposed on the transformation matrix to achieve row sparsity when selecting relevant features. We derive an efficient algorithm to optimize the proposed unified problem. Extensive experimental results on real-world benchmark data sets show that our method consistently outperforms the alternative feature selection methods.
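To illustrate the role of the \(\ell _{2,1}\)-norm mentioned above, the following is a minimal sketch, not the paper's algorithm. It assumes a transformation matrix W with one row per feature, computes the \(\ell _{2,1}\)-norm as the sum of the Euclidean norms of the rows, and ranks features by row norm, so that rows driven to zero correspond to discarded features. The function names `l21_norm` and `rank_features` and the example matrix are hypothetical and chosen only for illustration.

```python
import numpy as np


def l21_norm(W):
    """Return the l2,1-norm of W: the sum of the Euclidean norms of its rows."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def rank_features(W, k):
    """Return the indices of the k rows of W with the largest Euclidean norm.

    Each row of W is assumed to correspond to one original feature, so a
    row with (near-)zero norm marks a feature that contributes nothing to
    the transformed data.
    """
    row_norms = np.linalg.norm(W, axis=1)
    # A stable sort on the negated norms keeps ties in original feature order.
    order = np.argsort(-row_norms, kind="stable")
    return order[:k]


# Hypothetical transformation matrix: 4 features mapped to 2 dimensions.
# Rows 1 and 3 are exactly zero, i.e. the matrix is row-sparse.
W = np.array([
    [3.0, 4.0],
    [0.0, 0.0],
    [0.6, 0.8],
    [0.0, 0.0],
])

print("l2,1-norm:", l21_norm(W))
print("selected features:", rank_features(W, 2))
```

In this toy matrix only features 0 and 2 have non-zero rows, so they are the two that the ranking selects. Penalizing the sum of row norms, rather than the sum of absolute entries, is what encourages whole rows to vanish together.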
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhang, Z., Bai, L., Liang, Y., Hancock, E.R. (2015). Unsupervised Feature Selection by Graph Optimization. In: Murino, V., Puppo, E. (eds) Image Analysis and Processing — ICIAP 2015. ICIAP 2015. Lecture Notes in Computer Science, vol 9279. Springer, Cham. https://doi.org/10.1007/978-3-319-23231-7_12
DOI: https://doi.org/10.1007/978-3-319-23231-7_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23230-0
Online ISBN: 978-3-319-23231-7
eBook Packages: Computer Science, Computer Science (R0)