Multimedia Tools and Applications, Volume 77, Issue 17, pp 22433–22453

Robust Discriminative multi-view K-means clustering with feature selection and group sparsity learning

  • Zhiqiang Zeng
  • Xiaodong Wang (Email author)
  • Fei Yan
  • Yuming Chen
  • Chaoqun Hong


With the rapid development of information technologies, more and more data are collected from multiple sources, each offering a different perspective on the data. To exploit the information shared among multiple views, K-means-based multi-view clustering methods have been designed and are widely used in various applications for their simplicity and efficiency. However, these methods either cluster data in the original high-dimensional feature space, which is extremely time-consuming and sensitive to outliers, or cluster data in an embedded feature space for each view, for which the optimal reduced dimensionality is hard to find. To solve these problems, we propose a robust discriminative multi-view K-means clustering algorithm with feature selection and group sparsity learning. Compared with state-of-the-art methods, the proposed algorithm has two advantages: 1) Discriminative K-means clustering and feature learning are integrated into a single framework, where robust and accurate clustering results are obtained in the embedded feature space with an ℓ2,1-norm based loss function. 2) Group sparsity constraints are imposed to select the most relevant features and the most important views. We apply the proposed algorithm to several kinds of multimedia understanding applications. Experimental results demonstrate the effectiveness of the proposed algorithm.
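To make the two sparsity notions in the abstract concrete, the following is a minimal sketch of the l2,1-norm (ℓ2,1) and of a group ℓ1-norm, using their commonly cited definitions. It is not the authors' objective function or algorithm; the function names, the `view_sizes` parameter, and the toy matrices are assumptions introduced only for illustration.

```python
import math

def l21_norm(M):
    """l2,1-norm: sum of the Euclidean norms of the rows of M (list of rows)."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in M)

def squared_frobenius_norm(M):
    """Squared Frobenius norm: sum of all squared entries of M."""
    return sum(x * x for row in M for x in row)

def group_l1_norm(W, view_sizes):
    """Group l1-norm (assumed definition): for each view's block of rows
    and each column, take the Euclidean norm of that sub-vector and sum
    them all. view_sizes lists how many feature rows belong to each view."""
    total = 0.0
    start = 0
    n_cols = len(W[0])
    for size in view_sizes:
        block = W[start:start + size]
        for c in range(n_cols):
            total += math.sqrt(sum(row[c] ** 2 for row in block))
        start += size
    return total

# Hypothetical residual matrix: two well-fitted samples and one outlier row.
E = [[1.0, 0.0], [0.0, 1.0], [30.0, 40.0]]
robust_loss = l21_norm(E)                  # outlier contributes its norm, 50
squared_loss = squared_frobenius_norm(E)   # outlier contributes 2500

# Hypothetical projection matrix: view 1 has two features, view 2 has one.
W = [[3.0, 4.0], [0.0, 0.0], [0.0, 0.0]]
view_penalty = group_l1_norm(W, view_sizes=[2, 1])
```

In this toy case the outlier row adds 50 to the ℓ2,1 loss but 2500 to the squared loss, which is the usual argument for why an ℓ2,1-based loss is less sensitive to outliers. The all-zero rows of `W` add nothing to either penalty, which is how such norms encourage whole features or whole views to be dropped.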


K-means clustering; Feature selection; Group sparsity learning; Discriminative learning



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Zhiqiang Zeng (1)
  • Xiaodong Wang (1), Email author
  • Fei Yan (1)
  • Yuming Chen (1)
  • Chaoqun Hong (1)

  1. College of Computer and Information Engineering, Xiamen University of Technology, Xiamen, China
