Nonlinear sparse feature selection algorithm via low matrix rank constraint

  • Leyuan Zhang
  • Yangding Li
  • Jilian Zhang
  • Pengqing Li
  • Jiaye Li
Article

Abstract

High-dimensional data often exhibit nonlinearity, low rank, and feature redundancy, all of which create difficulties for further analysis. To address this, a low-rank unsupervised feature selection algorithm based on kernel functions is proposed. First, each feature is projected into a high-dimensional kernel space by a kernel function, solving the problem of linear inseparability in the low-dimensional space. At the same time, a self-expression form is introduced into the deviation term, and the coefficient matrix is constrained to be both low-rank and sparse. Finally, a sparse regularization factor on the coefficient vectors of the kernel matrix is introduced to carry out feature selection. In this algorithm, the kernel matrix handles linear inseparability, the low-rank constraint captures the global structure of the data, and the self-representation form determines the importance of each feature. Experiments show that, compared with other algorithms, classification after feature selection with this algorithm achieves good results.
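The self-representation idea in the abstract can be sketched in a much-simplified linear form: represent the data matrix by itself, X ≈ XW, penalize the coefficient matrix W with a row-sparsity (ℓ2,1) term and a low-rank (nuclear-norm) term, and rank features by the norms of their coefficient rows. The sketch below omits the kernel mapping, and its alternating IRLS/singular-value-thresholding solver is an illustrative proxy, not the authors' algorithm; the function name and parameters are assumptions.

```python
import numpy as np

def self_representation_fs(X, lam=0.1, gamma=0.1, n_iter=30):
    """Rank features of X (n_samples x n_features) via a simplified
    self-representation objective:
        min_W ||X - X W||_F^2 + lam * ||W||_{2,1} + gamma * ||W||_*
    solved by alternating a closed-form ridge update (with IRLS weights
    for the l2,1 term) and singular-value soft-thresholding (for the
    nuclear-norm term). A rough sketch, not the paper's exact solver."""
    n, d = X.shape
    W = np.eye(d)
    for _ in range(n_iter):
        # IRLS diagonal weights for the l2,1 row-sparsity term
        row_norms = np.linalg.norm(W, axis=1) + 1e-8
        D = np.diag(1.0 / (2.0 * row_norms))
        # Closed-form minimizer of ||X - XW||_F^2 + lam * tr(W^T D W)
        W = np.linalg.solve(X.T @ X + lam * D, X.T @ X)
        # Soft-threshold singular values to encourage low rank
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        W = U @ np.diag(np.maximum(s - gamma, 0.0)) @ Vt
    # Features whose coefficient rows have large norms are deemed important
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(scores)[::-1]
```

In the full method, X would be replaced by per-feature kernel matrices so that nonlinear dependencies become linear in the kernel space; the ranking step (sorting features by coefficient-row norms) stays the same.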

Keywords

Feature selection · Kernel function · Subspace learning · Low-rank representation · Sparse processing

Acknowledgements

This work is partially supported by the China Key Research Program (Grant No: 2016YFB1000905); the Key Program of the National Natural Science Foundation of China (Grant No: 61836016); the Natural Science Foundation of China (Grants No: 61876046, 61573270, 81701780 and 61672177); the Project of Guangxi Science and Technology (GuiKeAD17195062); the Guangxi Natural Science Foundation (Grant No: 2015GXNSFCB139011, 2017GXNSFBA198221); the Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing; the Guangxi High Institutions Program of Introducing 100 High-Level Overseas Talents; and the Research Fund of Guangxi Key Lab of Multisource Information Mining & Security.


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Leyuan Zhang (1)
  • Yangding Li (1)
  • Jilian Zhang (2)
  • Pengqing Li (1)
  • Jiaye Li (1)
  1. Guangxi Key Lab of Multi-source Information Mining and Security, Guangxi Normal University, Guilin, People’s Republic of China
  2. College of Cyber Security, Jinan University, Guangzhou, People’s Republic of China
